minimaxir / download-tweets-ai-text-gen

Python script to download public Tweets from a given Twitter account into a format suitable for AI text generation.
MIT License

IndexError: list index out of range #28

Closed DeFiDude closed 4 years ago

DeFiDude commented 4 years ago

Python version: 3.8.3, using the latest download_tweets.py file.

Retrieving tweets from a single user fails with "IndexError: list index out of range". Retrieving tweets from multiple users listed in a .txt file fails with the same error.

My usage:

python download_tweets.py USERNAME
python download_tweets.py TWEETLIST.txt

I've used this same exact script (albeit the older version) with absolutely no problem. I even attempted to extract tweets using the previous working version and received the same index out of range issue.

DeFiDude commented 4 years ago

@sdelgadoc

sdelgadoc commented 4 years ago

It looks like you're hitting the same issue as https://github.com/minimaxir/download-tweets-ai-text-gen/issues/26 and https://github.com/minimaxir/download-tweets-ai-text-gen/issues/27.

The issue appears to be caused by the twint.run.Lookup command on line 70 not retrieving the number of tweets for the username. I couldn't get the command working with a quick fix, so a proper fix will take some work. In the meantime, I have a workaround for you.

Run the code and specify the tweet limit in the command line so you can skip the command that isn't working:

python download_tweets.py USERNAME 1000
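To illustrate why passing a limit sidesteps the error, here is a minimal sketch, not the actual code in download_tweets.py. The function name resolve_tweet_limit and the lookup_counts callable are hypothetical stand-ins; the real script calls twint.run.Lookup, which is deliberately not reproduced here. It assumes the IndexError comes from indexing an empty lookup result.

```python
from typing import Callable, List, Optional


def resolve_tweet_limit(
    explicit_limit: Optional[int],
    lookup_counts: Callable[[], List[int]],
) -> Optional[int]:
    """Return the number of tweets to request, or None if it is unknown."""
    # An explicit limit from the command line skips the lookup entirely,
    # which mirrors the workaround described above.
    if explicit_limit is not None:
        return explicit_limit

    counts = lookup_counts()
    # Assumption: indexing an empty result (counts[0]) is what raises
    # "IndexError: list index out of range", so check before indexing.
    if not counts:
        return None
    return counts[0]


def broken_lookup() -> List[int]:
    # Hypothetical stand-in for a profile lookup that returns no results.
    return []


print(resolve_tweet_limit(1000, broken_lookup))
print(resolve_tweet_limit(None, broken_lookup))
```

With a limit supplied, the failing lookup is never called; without one, the sketch returns None instead of crashing.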

Also, use the code in pull request https://github.com/minimaxir/download-tweets-ai-text-gen/pull/24, which you can clone from https://github.com/sdelgadoc/download-tweets-ai-text-gen, because it fixes other issues currently in master.

DeFiDude commented 4 years ago

Thanks for the quick response!

I used the workaround and it fixed the list index out of range error, but I noticed the code in pull request #24 doesn't accept a .txt file for multiple users: when I specified the .txt file (e.g. tweetlist.txt), it attempted to download tweets from "@tweetlist.txt".

I then tried a previous version from April that I had used before, this time with a limit added. It bypassed the list index out of range error again, but it now downloads tweets for the first user extremely slowly; it took about 1-2 minutes to get 20 tweets.
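For anyone wondering what the missing behavior amounts to, here is a rough sketch of telling a single username apart from a .txt list. The function parse_usernames is hypothetical and is not taken from either repo; it assumes the file holds one username per line.

```python
from pathlib import Path
from typing import List


def parse_usernames(arg: str) -> List[str]:
    """Treat a .txt argument as a file of usernames, anything else as one username."""
    if arg.lower().endswith(".txt"):
        lines = Path(arg).read_text(encoding="utf-8").splitlines()
        # Assumption: one username per line; skip blanks, drop a leading "@".
        return [line.strip().lstrip("@") for line in lines if line.strip()]
    return [arg.lstrip("@")]
```

Without a check like this, the file name itself gets treated as a username, which matches the "@tweetlist.txt" behavior.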

Is there a version similar to #24 that allows downloading from multiple users?

sdelgadoc commented 4 years ago

It looks like you are experiencing the same issue as https://github.com/minimaxir/download-tweets-ai-text-gen/issues/22.

The multiple username download functionality was removed from the master. I'm still happy to bring it back if folks find it useful.

For now, the following repo has all the fixes in pull request https://github.com/minimaxir/download-tweets-ai-text-gen/pull/24 and multiple username download functionality.

https://github.com/sdelgadoc/download-tweets-ai-text-gen-plus

sdelgadoc commented 4 years ago

The twint.run.Lookup command appears to not be working in multiple versions of the twint library.

So, I removed the command from the code, which only removed the ability to approximate the total number of tweets for a username in the progress bar. You can still collect tweets just as you did before.
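As a rough illustration of what is lost, here is a small sketch of a progress label that works with or without a known total. The function format_progress is hypothetical and is not the progress bar code used in the repo.

```python
from typing import Optional


def format_progress(downloaded: int, total: Optional[int] = None) -> str:
    """Build a progress label; the percentage appears only when a total is known."""
    if total:
        pct = min(100, round(100 * downloaded / total))
        return f"{downloaded}/{total} tweets ({pct}%)"
    # No total available (e.g. the lookup was removed): show a running count.
    return f"{downloaded} tweets"


print(format_progress(250, 1000))
print(format_progress(250))
```

The download itself does not depend on the total; only the display changes.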

You can find the updated code, which doesn't require the workaround described above, in the repo below:

https://github.com/sdelgadoc/download-tweets-ai-text-gen-plus

DeFiDude commented 4 years ago

Great, thanks a bunch. I haven’t tested it yet but will do so at my earliest convenience and I’ll close the issue assuming it works as expected.

Appreciate the help!

sdelgadoc commented 3 years ago

For future reference, this workaround no longer works. At this time, the most reliable way to collect tweets to train an AI/ML model is to follow the steps in the repo below.

https://github.com/sdelgadoc/download-tweets-ai-text-gen-plus