Rishikant181 / Rettiwt-API

A CLI tool and an API for fetching data from Twitter for free!
https://rishikant181.github.io/Rettiwt-API/
MIT License

Delay in Fetching Tweets #557

Open nfadeluca opened 1 week ago

nfadeluca commented 1 week ago

I developed a program that creates several processes, each utilizing its own API account. These processes run simultaneously, scanning for tweets from a single account every 60 seconds using rettiwt.tweet.search(filter). To prevent rate limiting, each of the six processes scans the same profile, staggered by 10 seconds, effectively simulating scanning every 10 seconds.

However, in testing, the API only seems to pick up a new post roughly a full minute after the account made it. Is this normal for the tweet.search() function, or should it pick up new tweets almost instantly?

For the purposes of my program, it is important to pick up a newly posted tweet from a specific account almost instantly.
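The staggering arithmetic described above (six workers, each polling every 60 seconds, offset by 10 seconds) can be sketched as a small helper. This is only an illustration: `staggerOffsets` is a hypothetical name, not part of Rettiwt-API or of the program in this issue.

```typescript
// Hypothetical helper: start offsets (in ms) for workers that share one polling period.
function staggerOffsets(workerCount: number, periodMs: number): number[] {
  if (workerCount <= 0 || periodMs <= 0) {
    throw new RangeError('workerCount and periodMs must be positive');
  }
  const step = periodMs / workerCount;
  return Array.from({ length: workerCount }, (_, i) => Math.round(i * step));
}

// Six workers, each polling every 60 s, as described in the issue.
const offsets = staggerOffsets(6, 60000);
const effectiveIntervalMs = 60000 / offsets.length;
console.log(offsets, effectiveIntervalMs);
```

Deriving the step from the period and the worker count keeps the spacing even if the number of workers changes.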

Rishikant181 commented 1 week ago

rettiwt.tweet.search should indeed return results instantaneously; that is, if someone posts a tweet and I run rettiwt.tweet.search soon after, I should get the newly created tweet. Can you share the code you use to simulate scans every 10 seconds?

nfadeluca commented 1 week ago

Here's the abridged version of the code:

main.ts

import cluster from 'node:cluster';
import os from 'node:os';

const API_KEYS: string[] = [
   // ... 6 rettiwt api keys here ...
];

// pass username to scan as argument
const profile = process.argv[2];
if (!profile) {
   console.error('Please provide a profile name.');
   process.exit(1);
}

if (cluster.isPrimary) {
   const numCPUs = os.cpus().length;
   const numWorkers = Math.min(API_KEYS.length, numCPUs);

   // 6 worker processes
   for (let i = 0; i < numWorkers; i++) {
      setTimeout(() => {
         const worker = cluster.fork();
         worker.send({ profile, apiKey: API_KEYS[i] });
      }, i * 10000); // 10-second offset
   }
   // fork a replacement whenever a worker exits
   // (note: the replacement is not sent a profile or API key)
   cluster.on('exit', (worker, code, signal) => {
      console.log(`Worker ${worker.process.pid} died`);
      cluster.fork();
   });
} else {
   process.on('message', (msg: unknown) => {
      if (typeof msg === 'object' && msg !== null && 'profile' in msg && 'apiKey' in msg) {
         const { profile, apiKey, } = msg as { profile: string; apiKey: string; };
         import('./worker').then(workerModule => {
            workerModule.startWorker(profile, apiKey);
         });
      }
   });
}

worker.ts

// scan tweets, if tweet matching regex found, execute sendTM(client)
async function scanTweets(username: string, client: tc) {
   try {
      const rettiwt = new Rettiwt({ apiKey });

      const startTime = Date.now();

      const filter = new TweetFilter({
         fromUsers: [username]
      });

      const data: CursoredData<Tweet> = await rettiwt.tweet.search(filter);

      if (data && data.list && data.list.length > 0) {
         for (let tweet of data.list) {
            const tweetText = tweet.fullText;
            const matches = tweetText.match(/\b\w{32,44}\b/);
            if (matches) {
               for (let match of matches) {
                  const matchTime = Date.now();
                  console.log(`Found match: ${match}`);
                  await sendTM(client);
                  // this usually returns ~350ms
                  console.log(`Processed match in ${Date.now() - matchTime}ms`);
                  process.exit(0);
               }
            }
         }
      }

   } catch (error) {
      console.error('Error scanning tweets:', error);
   }
}

while (true) {
  // scan every 60s to avoid rate limiting
  const currentTime = new Date().toLocaleTimeString([], { hour: 'numeric', minute: '2-digit', second: '2-digit' });
  console.log(`[${currentTime}] Scanning tweets for profile ${profile} (Process ID: ${process.pid})`);
  await scanTweets(profile, client);
  await sleep(60000);
}

Just to clarify, the >60s delay is not in the search call itself. After I post a test tweet, the next five or six calls to the search function return no matching tweets. Only about a minute after posting does the search return the matching tweet, and I would expect the API to pick up a new tweet far sooner than that.
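One way to put a number on the delay described above is to start polling right after posting the test tweet and record how many polls and how much time pass before it first appears. The sketch below is an assumption-laden illustration: `measureDetectionDelay`, `DetectionResult`, and the injected `search` callback are hypothetical names, and `search` stands in for whatever call returns the latest tweet texts, so the timing logic can be tried without network access.

```typescript
// Hypothetical result type: polls needed and time elapsed until the target appeared.
interface DetectionResult {
  polls: number;
  elapsedMs: number;
}

const sleep = (ms: number): Promise<void> =>
  new Promise((resolve) => setTimeout(resolve, ms));

// Polls `search` every `intervalMs` until some returned text satisfies `isTarget`,
// or gives up after `maxPolls` attempts and returns null.
async function measureDetectionDelay(
  search: () => Promise<string[]>,
  isTarget: (text: string) => boolean,
  intervalMs: number,
  maxPolls: number,
): Promise<DetectionResult | null> {
  const start = Date.now();
  for (let polls = 1; polls <= maxPolls; polls++) {
    const texts = await search();
    if (texts.some(isTarget)) {
      return { polls, elapsedMs: Date.now() - start };
    }
    if (polls < maxPolls) await sleep(intervalMs);
  }
  return null; // target never appeared within maxPolls
}

// Demo with a fake search that only returns the tweet from the third poll onward.
let demoCalls = 0;
const fakeSearch = async (): Promise<string[]> =>
  ++demoCalls >= 3 ? ['test tweet'] : [];

measureDetectionDelay(fakeSearch, (t) => t === 'test tweet', 10, 10).then((result) =>
  console.log(result),
);
```

In a real run, `search` would wrap the actual search call and map its results to their text; that wiring is left out here on purpose.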

Rishikant181 commented 1 week ago

Did you try using the rettiwt.tweet.stream method? That method streams tweets in pseudo-realtime, given a scan delay.

Here is the documentation regarding its usage.
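For readers unfamiliar with the idea, "pseudo-realtime" streaming with a scan delay generally means polling at a fixed interval and reporting only items not seen before. The sketch below shows that general pattern under stated assumptions; it is not the library's implementation, and `pollForNew`, `Identified`, and the injected `fetchLatest` callback are hypothetical names.

```typescript
// Minimal item shape for the sketch; real tweet objects carry more fields.
interface Identified {
  id: string;
}

// Calls `fetchLatest` up to `maxPolls` times, `intervalMs` apart,
// and invokes `onNew` once for each item whose id has not been seen yet.
async function pollForNew<T extends Identified>(
  fetchLatest: () => Promise<T[]>,
  onNew: (item: T) => void,
  intervalMs: number,
  maxPolls: number,
): Promise<void> {
  const seen = new Set<string>();
  for (let poll = 0; poll < maxPolls; poll++) {
    const items = await fetchLatest();
    for (const item of items) {
      if (!seen.has(item.id)) {
        seen.add(item.id);
        onNew(item);
      }
    }
    if (poll < maxPolls - 1) {
      await new Promise<void>((resolve) => setTimeout(resolve, intervalMs));
    }
  }
}
```

With this pattern, an item cannot be reported before `fetchLatest` first returns it, so a shorter interval only helps if the underlying results are themselves fresh.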

nfadeluca commented 1 week ago

Just tried that, ran into the same issue unfortunately.

I also decided to isolate the search function and test it on its own:

import { Rettiwt } from 'rettiwt-api';
import { TweetFilter } from 'rettiwt-core';
import { Tweet } from 'rettiwt-api/dist/models/data/Tweet';
import { CursoredData } from 'rettiwt-api/dist/models/data/CursoredData';

async function scanTweets(username: string, apiKey: string) {
   try {
      const rettiwt = new Rettiwt({ apiKey });

      const filter = new TweetFilter({
         fromUsers: [username]
      });

      const data: CursoredData<Tweet> = await rettiwt.tweet.search(filter);

      if (data && data.list && data.list.length > 0) {
         for (let tweet of data.list) {
            const tweetText = tweet.fullText;
            const matches = tweetText.match(/\b\w{32,44}\b/);
            if (matches) {
               console.log("Found");
               return;
            }
         }
      } else {
         console.log("No tweets found.");
      }

   } catch (error) {
      console.error('Error scanning tweets:', error);
   }
}

const apiKey = 'api key here';
const username = 'username here';

scanTweets(username, apiKey);

As expected, when I run the function manually several times after posting the tweet, it only prints "Found" after about 60 seconds (five or six runs of the program above).

Rishikant181 commented 4 days ago

Strange, it should be returning as soon as a scan is done.

Does it happen with a specific user in fromUsers or does it happen with all target users?

nfadeluca commented 2 days ago

The same behavior occurs regardless of the profile username being passed (I've tested with a few different profiles).