Skipping a block of tweets when using download_tweets.py

sdelgadoc / download-tweets-ai-text-gen-plus

Python script to download public Tweets from a given Twitter account into a format suitable for AI text generation

MIT License


if limit is not None:
    cursor = tweepy.Cursor(api.search_full_archive,
                           environment_name=environment_name,
                           query="from:" + username,
                           fromDate="200603220000").items(limit)
else:
    cursor = tweepy.Cursor(api.search_full_archive,
                           environment_name=environment_name,
                           query="from:" + username,
                           fromDate="200603220000").items()

I'm glad to hear that the code is working for you! As a developer, one always wonders whether all those clones lead to actual use, or to cursing when things don't work.

You are right that the code, as is, can't limit the collected tweets by date. You also correctly identified the part of the code that could be modified to limit collected tweets by date.

However, doing so requires adding another parameter, toDate, to the tweepy.Cursor call. The code collects tweets from most recent to oldest, so if you change only fromDate, it will still start collecting from the newest tweet rather than from where you left off.

To get the behavior you're looking for, you need to add the toDate parameter as shown below, and set it to the date of the last tweet you collected.

if limit is not None:
    cursor = tweepy.Cursor(api.search_full_archive,
                           environment_name=environment_name,
                           query="from:" + username,
                           fromDate="200603220000",
                           toDate="[DATE_OF_LAST_TWEET]").items(limit)
else:
    cursor = tweepy.Cursor(api.search_full_archive,
                           environment_name=environment_name,
                           query="from:" + username,
                           fromDate="200603220000",
                           toDate="[DATE_OF_LAST_TWEET]").items()
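
The toDate value takes the same format as the fromDate string above, that is, YYYYMMDDHHmm in UTC. Here is a minimal sketch of turning the timestamp of the last tweet you collected into that string. The helper name to_search_date is hypothetical and not part of the repository, and the example timestamp is made up. As far as I understand the premium search documentation, toDate is exclusive at minute granularity, so it is worth checking for tweets posted within that same minute.

```python
from datetime import datetime, timezone


def to_search_date(dt):
    """Format a datetime as a YYYYMMDDHHmm string (hypothetical helper).

    Timezone-aware datetimes are converted to UTC first; naive datetimes
    are assumed to already be in UTC.
    """
    if dt.tzinfo is not None:
        dt = dt.astimezone(timezone.utc)
    return dt.strftime("%Y%m%d%H%M")


# Made-up timestamp standing in for the last tweet you collected.
last_tweet_time = datetime(2020, 5, 17, 14, 30, 12)
print(to_search_date(last_tweet_time))  # prints 202005171430
```

The resulting string is what would replace the [DATE_OF_LAST_TWEET] placeholder.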

Although the change above should resolve your issue, if you're feeling generous, you could make the toDate value a parameter by passing it through the download_account_tweets and download_tweets functions. :-) If you do, please submit a pull request; I'd be happy to add it to the code base.
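
One possible shape for that change is sketched here, under assumptions: the helper build_search_kwargs and the to_date argument name are hypothetical, and the actual signatures of download_account_tweets and download_tweets in the repository may differ. Building the keyword arguments once also avoids repeating the tweepy.Cursor call in both branches.

```python
def build_search_kwargs(username, environment_name, to_date=None):
    """Assemble keyword arguments for api.search_full_archive.

    Hypothetical helper: `to_date` is an optional YYYYMMDDHHmm string and
    is only included in the request when the caller supplies it.
    """
    kwargs = {
        "environment_name": environment_name,
        "query": "from:" + username,
        "fromDate": "200603220000",
    }
    if to_date is not None:
        kwargs["toDate"] = to_date
    return kwargs


# Inside the download function (needs tweepy and an authenticated `api`):
#   kwargs = build_search_kwargs(username, environment_name, to_date)
#   pages = tweepy.Cursor(api.search_full_archive, **kwargs)
#   cursor = pages.items(limit) if limit is not None else pages.items()
```

Leaving to_date as None keeps the current behavior, so existing callers would not need to change.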

Skipping a block of tweets when using download_tweets.py #11