iSarabjitDhiman / TweeterPy

TweeterPy is a Python library to extract data from Twitter. The TweeterPy API lets you scrape data from a user's profile, such as username, user ID, bio, followers/followings list, profile media, tweets, etc.
MIT License

Issues with Parallelism - Issue with Error Handling #48

Closed: MunirG05 closed this issue 6 months ago

MunirG05 commented 8 months ago

I tried everything: asyncio, multiprocessing, threading. This library only works when it's run the way the docs show. The following snippet shows my attempt:

from tweeterpy import TweeterPy
import concurrent.futures

fp = open("acc_to_scrape.txt")
lines = fp.readlines()
fp.close()

thisTwitterAPI = TweeterPy()
thisTwitterAPI.login(OMIT)

def fetchFollowers(line):
    thisTwitterResponse = thisTwitterAPI.get_friends(line, follower=True)
    with open(f"./acc_followers/{line}.txt", "w") as fp:
        fp.write(thisTwitterResponse)

with concurrent.futures.ProcessPoolExecutor() as executor:
    executor.map(fetchFollowers, lines)

I keep getting a "Couldn't fetch data." error. Traceback:

Traceback (most recent call last):
  File "/home/OMIT/OMIT/.venv/lib/python3.10/site-packages/tweeterpy/request_util.py", line 32, in make_request
    return util.check_for_errors(response.json())
  File "/home/OMIT/OMIT/.venv/lib/python3.10/site-packages/tweeterpy/util.py", line 105, in check_for_errors
    raise Exception("Couldn't fetch data.")
Exception: Couldn't fetch data.
2024-02-08 02:08:37,180 [ERROR] :: Couldn't fetch data.

iSarabjitDhiman commented 7 months ago

Hey, I will take a look at it this week. In the meantime, you can check the async branch; I haven't updated it in a while, but it's still worth a look. I will update you once it's fixed.

iSarabjitDhiman commented 7 months ago

Hey @MunirG05, I just tested the code snippet you shared above and it seems to be working fine for me. I didn't get any errors. Any of the following might be a reason it's throwing an error:

If you are still unable to resolve it, just let me know. You can also share a screen recording on discord.

Let me know how it goes.

iSarabjitDhiman commented 6 months ago

Hey @MunirG05, I just encountered the same error a while ago. It was not an issue with the parallelism, but with the error handling. I checked the traceback in the logs and figured out that I made an error in the check_for_errors function in the util.py module.

This information might be useful in your case: it was showing me the error because it was unable to find the tweet, as the tweet had been deleted. While I was checking your code, I noticed that you're reading accounts/tweets from the .txt files, and there may be a chance that there is an unavailable account/tweet in the list. Hope it makes sense.
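
Until the fix lands, one way to guard against an unavailable entry on the caller's side is to fetch each account inside its own try/except, so a single bad line doesn't hide the results for the rest. This is only a rough sketch and not part of TweeterPy: `fetch_all`, `safe_fetch` and the `fetch` parameter are hypothetical names, and `fetch` stands in for a call like `thisTwitterAPI.get_friends(name, follower=True)` from the snippet above. It also strips the trailing newline that `readlines()` leaves on every line, and it uses threads instead of processes, which is a swap from the original snippet; whether one logged-in TweeterPy instance is safe to share across threads is an assumption here.

```python
from concurrent.futures import ThreadPoolExecutor


def fetch_all(names, fetch, max_workers=4):
    """Call fetch(name) for every non-empty name and keep failures separate.

    `fetch` is a hypothetical stand-in for something like
    lambda name: thisTwitterAPI.get_friends(name, follower=True).
    """
    # readlines() keeps the trailing "\n", so strip it and skip blank lines.
    cleaned = [name.strip() for name in names if name.strip()]

    def safe_fetch(name):
        try:
            return name, fetch(name), None
        except Exception as error:
            # Broad on purpose: the traceback above shows a bare Exception.
            return name, None, error

    results, failures = {}, {}
    with ThreadPoolExecutor(max_workers=max_workers) as executor:
        # map() yields in input order; each item is (name, data, error).
        for name, data, error in executor.map(safe_fetch, cleaned):
            if error is None:
                results[name] = data
            else:
                failures[name] = str(error)
    return results, failures
```

Calling it as `results, failures = fetch_all(lines, lambda name: thisTwitterAPI.get_friends(name, follower=True))` would leave the deleted or unavailable accounts in `failures`, keyed by name, so they can be removed from the .txt file and the rest written to disk as before.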

I have fixed it already and will be committing changes shortly.