iSarabjitDhiman / TweeterPy

TweeterPy is a python library to extract data from Twitter. TweeterPy API lets you scrape data from a user's profile like username, userid, bio, followers/followings list, profile media, tweets, etc.
MIT License

Throwing irrelevant exceptions after 50k followers. #56

Open ilonabehn2 opened 3 months ago

ilonabehn2 commented 3 months ago

I am trying with 1k tokens: https://www.mediafire.com/file/194pr1tulmh11mi/auth_tokens.txt/file After collecting 50k followers, the script throws this error:

'result'
2024-04-10 22:16:09,401 [INFO] :: User is authenticated.
2024-04-10 22:16:09,402 [INFO] :: User is authenticated.
2024-04-10 22:16:09,755 [ERROR] :: Could not authenticate you

Traceback (most recent call last):
  File "C:\Users\Administrator\AppData\Local\Programs\Python\Python312\Lib\site-packages\tweeterpy-1.0.17-py3.12.egg\tweeterpy\request_util.py", line 32, in make_request
    return util.check_for_errors(response.json())
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "C:\Users\Administrator\AppData\Local\Programs\Python\Python312\Lib\site-packages\tweeterpy-1.0.17-py3.12.egg\tweeterpy\util.py", line 102, in check_for_errors
    raise Exception(error_message)
Exception: Could not authenticate you
Could not authenticate you
2024-04-10 22:16:11,907 [INFO] :: User is authenticated.
2024-04-10 22:16:11,907 [INFO] :: User is authenticated.
'result'
2024-04-10 22:16:14,830 [INFO] :: User is authenticated.
2024-04-10 22:16:14,830 [INFO] :: User is authenticated.
'result'
2024-04-10 22:16:17,727 [INFO] :: User is authenticated.
2024-04-10 22:16:17,728 [INFO] :: User is authenticated.
'result'
2024-04-10 22:16:20,649 [INFO] :: User is authenticated.
2024-04-10 22:16:20,650 [INFO] :: User is authenticated.
'result'
2024-04-10 22:16:23,425 [INFO] :: User is authenticated.
2024-04-10 22:16:23,425 [INFO] :: User is authenticated.
'result'
2024-04-10 22:16:26,227 [INFO] :: User is authenticated.
2024-04-10 22:16:26,228 [INFO] :: User is authenticated.
'result'
2024-04-10 22:16:29,057 [INFO] :: User is authenticated.
2024-04-10 22:16:29,057 [INFO] :: User is authenticated.
'result'
2024-04-10 22:16:31,852 [INFO] :: User is authenticated.
2024-04-10 22:16:31,852 [INFO] :: User is authenticated.
'result'
2024-04-10 22:16:34,730 [INFO] :: User is authenticated.
2024-04-10 22:16:34,731 [INFO] :: User is authenticated.
'result'
2024-04-10 22:16:37,505 [INFO] :: User is authenticated.
2024-04-10 22:16:37,506 [INFO] :: User is authenticated.
'result'
2024-04-10 22:16:40,274 [INFO] :: User is authenticated.
2024-04-10 22:16:40,275 [INFO] :: User is authenticated.
'result'
2024-04-10 22:16:43,043 [INFO] :: User is authenticated.
2024-04-10 22:16:43,044 [INFO] :: User is authenticated.
'result'
2024-04-10 22:16:45,944 [INFO] :: User is authenticated.
2024-04-10 22:16:45,945 [INFO] :: User is authenticated.
'result'
2024-04-10 22:16:48,811 [INFO] :: User is authenticated.
2024-04-10 22:16:48,811 [INFO] :: User is authenticated.
'result'
2024-04-10 22:16:51,673 [INFO] :: User is authenticated.
2024-04-10 22:16:51,673 [INFO] :: User is authenticated.
'result'
2024-04-10 22:16:54,464 [INFO] :: User is authenticated.

This is my full code:

from tweeterpy import TweeterPy
import sys
from tweeterpy.util import RateLimitError
import itertools
import urllib3
urllib3.disable_warnings()
#import concurrent.futures

list_tokens = open('twitter_tokens.txt', 'r', encoding='utf-8').read().splitlines()
accounts_pool = itertools.cycle(list_tokens)

def get_account():
    return next(accounts_pool).strip()

class Profile:

    def __init__(self, profile):
        self.profile = profile

    def get_followers(self, cursor=None):
        self.profile = self.profile.strip()
        has_more = True
        twitter = TweeterPy()
        twitter.generate_session(auth_token=get_account())
        while has_more:
            response = None
            try:
                # Fetch a single page of followers, starting at the current cursor.
                response = twitter.get_friends(self.profile, follower=True, end_cursor=cursor, pagination=False)
                # Append each follower's screen name to <profile>.txt.
                with open(self.profile + '.txt', 'a', encoding='utf-8') as save_followers:
                    for follower in response['data']:
                        screen_name = follower['content']['itemContent']['user_results']['result']['legacy']['screen_name']
                        save_followers.write(screen_name + '\n')
                api_rate_limits = response.get('api_rate_limit')
                limit_exhausted = api_rate_limits.get('rate_limit_exhausted')
                cursor = response.get('cursor_endpoint')

                # Switch to the next token once the current one is rate limited.
                if limit_exhausted:
                    twitter.generate_session(auth_token=get_account())
                has_more = response.get('has_next_page')

            except Exception as e:
                # Any failure (including the 'result' KeyError) lands here:
                # print the message only and move on to the next token.
                print(e)
                twitter.generate_session(auth_token=get_account())
            ## YOUR CUSTOM CODE HERE (DATA HANDLING, REQUEST DELAYS, SESSION SHUFFLING ETC.)
            ## time.sleep(random.uniform(7,10))

def create_and_launch_threads(profile):
    profile_client = Profile(profile)
    profile_client.get_followers()
    return

create_and_launch_threads('elonmusk')

iSarabjitDhiman commented 3 months ago

Hey @ilonabehn2, what error does it show when you are close to 50K followers? The logs you attached above are related to scraping; I need the logs for the errors. I just noticed that the logs are all over the place, so I might remove the unnecessary ones soon.
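For anyone trying to capture the error logs requested here: the script above calls `print(e)`, and for a `KeyError` that prints only the quoted key, which is why the output shows a bare `'result'`. A minimal sketch of getting the full traceback with the standard library is below. The `describe_exception` helper and the sample `entry` dict are hypothetical and not part of TweeterPy.

```python
import traceback


def describe_exception(exc: BaseException) -> str:
    """Return the exception type, message, and traceback as one string."""
    return "".join(traceback.format_exception(type(exc), exc, exc.__traceback__))


try:
    # Hypothetical entry that lacks the 'result' key, used only for illustration.
    entry = {"user_results": {}}
    entry["user_results"]["result"]
except Exception as e:
    short = str(e)                 # what print(e) shows: just the quoted key
    full = describe_exception(e)   # exception type plus the call stack
    print(full)
```

Swapping `print(e)` for something like this in the `except` branch should make it clearer which line raised the exception.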

Edit: The 'result' KeyError is caused by the rate limits.
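If rate-limited responses do come back with entries that lack the `result` key, one way to keep the loop from failing on every page is to skip such entries and to cap consecutive failures. The sketch below is only an illustration under that assumption. The response keys (`data`, `cursor_endpoint`, `has_next_page`, `api_rate_limit`, `rate_limit_exhausted`) are taken from the script above, while `extract_screen_names`, `collect_followers`, `fetch_page`, and `rotate_session` are hypothetical names, not TweeterPy functions.

```python
import time
from typing import Callable, Iterable, List, Optional, Tuple


def extract_screen_names(entries: Iterable[dict]) -> Tuple[List[str], int]:
    """Return (screen names, number of entries skipped for missing keys)."""
    names, skipped = [], 0
    for entry in entries:
        try:
            user = entry["content"]["itemContent"]["user_results"]["result"]
            names.append(user["legacy"]["screen_name"])
        except (KeyError, TypeError):
            # Entry lacks the expected structure (e.g. no 'result'); skip it.
            skipped += 1
    return names, skipped


def collect_followers(
    fetch_page: Callable[[Optional[str]], dict],
    rotate_session: Callable[[], None],
    max_failures: int = 5,
    delay: float = 0.0,
) -> List[str]:
    """Page through followers, rotating sessions on errors or exhausted limits."""
    cursor, has_more, failures = None, True, 0
    collected: List[str] = []
    while has_more:
        try:
            response = fetch_page(cursor)
        except Exception:
            failures += 1
            if failures >= max_failures:
                raise  # stop instead of cycling through tokens forever
            rotate_session()
            time.sleep(delay)
            continue  # retry the same cursor with the new session
        failures = 0
        names, _skipped = extract_screen_names(response.get("data") or [])
        collected.extend(names)
        limits = response.get("api_rate_limit") or {}
        if limits.get("rate_limit_exhausted"):
            rotate_session()
        has_more = bool(response.get("has_next_page"))
        cursor = response.get("cursor_endpoint")
    return collected
```

With the script above, `fetch_page` could be `lambda cursor: twitter.get_friends(profile, follower=True, end_cursor=cursor, pagination=False)` and `rotate_session` could be `lambda: twitter.generate_session(auth_token=get_account())`. The failure cap is a design choice so that a pool of invalid tokens ends the run with the real exception instead of looping indefinitely.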

iSarabjitDhiman commented 1 month ago

Hey @ilonabehn2, please use this build. I will update the code soon.

Tobitheprof commented 1 month ago

Hello @ilonabehn2, I would like to know how you got the twitter_tokens.txt file. I am currently doing a study that requires me to analyze some bias, and I cannot afford the Twitter API. (Thank you guys for the support in advance.)

This package has been a big help. Thank you @iSarabjitDhiman