innocentius / twint

An advanced Twitter scraping & OSINT tool written in Python that doesn't use Twitter's API, allowing you to scrape a user's followers, following, Tweets and more while evading most API limitations.
MIT License

[TODO] A more stable Guest Token retrieval method. #1

Open innocentius opened 3 years ago

innocentius commented 3 years ago

During a long-running data retrieval session that uses AWS proxies, guest tokens can expire, which triggers a token.refresh. However, since Twitter won't send a guest token to AWS IPs, every request made after the token expires fails. There needs to be a way to retrieve the guest token through a viable IP (either your own or a separate residential proxy route) in order to achieve long-term stability.

innocentius commented 3 years ago

Currently, in the main branch, the guest token is always retrieved without a proxy. This means our IP is still exposed to Twitter, which could then detect the frequent guest token requests made during large-scale scraping. By routing guest token retrieval through a proxy, we would be able to hide our IP and retrieve the guest token more reliably.

However, if our proxies are AWS IPs, this might not actually work well. Maybe we should consider implementing a separate proxy method for this.
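
For reference, the idea above could be sketched roughly as follows: keep a separate list of proxy routes used only for the guest token, try each in turn, and back off between rounds. This is a minimal sketch and not the code in this repository. The names `refresh_guest_token`, `extract_guest_token`, and `fetch_via_requests` are hypothetical, and it assumes the token shows up in the page body as `gt=<digits>;`, which may no longer match what Twitter serves.

```python
import re
import time
from typing import Callable, Optional, Sequence


class TokenError(Exception):
    """Raised when no proxy route yields a guest token."""


def extract_guest_token(html: str) -> Optional[str]:
    # Assumption: the token is embedded in the page as "gt=<digits>;".
    match = re.search(r"gt=(\d+);", html)
    return match.group(1) if match else None


def fetch_via_requests(proxy: Optional[str]) -> str:
    # Illustrative fetcher; needs the third-party `requests` package.
    # A proxy of None means a direct connection from your own IP.
    import requests

    proxies = {"http": proxy, "https": proxy} if proxy else None
    response = requests.get("https://twitter.com", proxies=proxies, timeout=10)
    return response.text


def refresh_guest_token(
    fetch_page: Callable[[Optional[str]], str],
    token_proxies: Sequence[Optional[str]],
    retries: int = 4,
    base_delay: float = 2.0,
    sleep: Callable[[float], None] = time.sleep,
) -> str:
    """Try each token-only proxy in order; back off linearly between rounds."""
    for attempt in range(1, retries + 1):
        for proxy in token_proxies:
            try:
                html = fetch_page(proxy)
            except OSError:
                # Connection-level failure (requests' errors subclass OSError):
                # move on to the next route.
                continue
            token = extract_guest_token(html)
            if token is not None:
                return token
        # Every route failed this round: wait 2s, 4s, 6s, ... before retrying.
        sleep(base_delay * attempt)
    raise TokenError("no proxy route returned a guest token")
```

A call might look like `refresh_guest_token(fetch_via_requests, ["http://127.0.0.1:24000", None])`, where the proxy address is only a placeholder. The scraping requests themselves would keep using the AWS proxies; only the token request goes through this separate list.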

innocentius commented 3 years ago

A temporary fix has been implemented; it only supports HTTP proxies.

innocentius commented 3 years ago

Scratch that, this fix is not actually working.

innocentius commented 3 years ago

A temporary solution is in place for now. It significantly decreases scraping speed, because the token now has to be fetched through a proxy.

paulowe commented 3 years ago

When I try running twint -u username --followers, I get the following output. How do I get around this? I simply want to scrape followers.

sleeping for 2 secs due to token failure
sleeping for 4 secs due to token failure
sleeping for 6 secs due to token failure
sleeping for 8 secs due to token failure

innocentius commented 3 years ago

Sorry friend, this version of twint is not yet ready for public use. If you are already using proxies, make sure to point your proxy port at 127.0.0.1:24000 so that it works properly. If you are not using proxies, I recommend the original version, although I have heard that the get-followers functions there are also failing due to changes on Twitter's side.
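
If it helps, a quick way to check that something is actually listening on that address before running twint is a plain socket probe. This is only a sketch: the helper name `proxy_is_listening` is made up for illustration, and it only confirms that a TCP listener exists, not that the proxy forwards traffic correctly.

```python
import socket


def proxy_is_listening(host: str = "127.0.0.1", port: int = 24000,
                       timeout: float = 2.0) -> bool:
    """Return True if a TCP connection to host:port can be opened."""
    try:
        # create_connection raises OSError on refusal or timeout.
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False
```

Calling `proxy_is_listening()` with no arguments probes 127.0.0.1:24000; if it returns False, the token failures are likely coming from the missing local proxy.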