Closed youssefavx closed 4 years ago
You can just specify a config.Resume file, so if Twint crashes it can restart from where it stopped.
Otherwise you have to handle the exception with a try/except. In the except clause, you have to specify the correct error raised and then add an input() statement to wait until the user presses Enter (for example).
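The try/except route described above could look roughly like the sketch below. It is generic on purpose: `run_with_pause`, `task`, `errors`, and `wait_for_user` are hypothetical names for illustration, not part of Twint, and the exception type to catch is whatever your run actually raises.

```python
from typing import Callable, Tuple, Type


def run_with_pause(
    task: Callable[[], None],
    errors: Tuple[Type[BaseException], ...],
    wait_for_user: Callable[[str], object] = input,
    max_attempts: int = 5,
) -> int:
    """Run task(); on a listed error, wait for the user, then retry.

    Returns the number of attempts used. Re-raises the last error once
    max_attempts is exhausted.
    """
    if max_attempts < 1:
        raise ValueError("max_attempts must be at least 1")
    attempt = 0
    while True:
        attempt += 1
        try:
            task()
            return attempt
        except errors as exc:
            if attempt >= max_attempts:
                raise
            # Block until the user confirms, e.g. after reconnecting.
            wait_for_user(f"{type(exc).__name__}: {exc}. Press Enter to retry...")
```

With Twint, `task` would presumably be something like `lambda: twint.run.Search(c)`, with `c.Resume` set so that each retry continues from the saved position rather than starting over.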
Thank you! I tried the config.Resume on a followers CSV file though and it kept overwriting what seems to be 20 users every time it downloads.
This was the code I used for the function:
def downloadfollowers(usource):
    print("Downloading followers for " + str(usource))
    x = twint.Config()
    x.Username = str(usource.lower())
    x.Store_object = True
    x.Store_csv = True
    x.Resume = "test/" + str(usource) + " followers.csv"
    #x.Output = str(usource) + " followers.csv"
    x.Output = "test/" + str(usource) + " followers.csv"
    twint.run.Followers(x)
I guess the Resume file has to be separate?
Edit: I just made a separate resume file and it seems to be working great! Thanks again! Hopefully, this takes care of most situations.
config.Resume gets overwritten
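Since the resume file gets overwritten, one way to avoid the clash is to derive the output path and the resume path from the same username but with different file names. This is only a sketch: `follower_paths` and the `resume.txt` suffix are made up for illustration, and the `test` directory simply mirrors the snippet above.

```python
from pathlib import Path
from typing import Tuple


def follower_paths(username: str, base_dir: str = "test") -> Tuple[Path, Path]:
    """Return (output_csv, resume_file) as two distinct paths for one user."""
    base = Path(base_dir)
    output_csv = base / f"{username} followers.csv"
    # The resume file is overwritten as the run progresses (per the reply
    # above), so it must not point at the same file as the CSV output.
    resume_file = base / f"{username} followers resume.txt"
    return output_csv, resume_file
```

In the function above this would be used as `x.Output = str(output_csv)` and `x.Resume = str(resume_file)`.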
I see, thanks!
I noticed that, when attempting to test it and deliberately disconnecting the internet, it sometimes downloads duplicate tweets or duplicate users. I don't mind the users, but the tweets being duplicated affects some areas of my script.
Is there a way to prevent duplicate tweets?
This is my code:
import time

import aiohttp
import twint

def downloadtweets(usertweets):
    print("Downloading tweets for: " + str(usertweets))
    dt = twint.Config()
    dt.Username = str(usertweets.lower())
    dt.Store_csv = True
    dt.Resume = "test/" + str(usertweets) + " resume tweets.csv"
    dt.Output = "test/" + str(usertweets) + " tweets.csv"
    while True:
        try:
            twint.run.Search(dt)
            break
        except aiohttp.ClientConnectorError:
            time.sleep(1)
            print('Client error. Restarting...')
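Independently of how the resume logic behaves, duplicates can also be removed from the CSV after the download finishes. The sketch below keeps the first row for each tweet ID. It assumes the CSV is comma-delimited with a header row containing an `id` column; check your own file before relying on that. `drop_duplicate_rows` is a hypothetical helper, not a Twint function.

```python
import csv


def drop_duplicate_rows(src_path: str, dst_path: str, key: str = "id") -> int:
    """Copy a CSV from src_path to dst_path, keeping the first row per key.

    Returns the number of rows that were dropped as duplicates.
    """
    seen = set()
    dropped = 0
    with open(src_path, newline="", encoding="utf-8") as src, \
            open(dst_path, "w", newline="", encoding="utf-8") as dst:
        reader = csv.DictReader(src)
        # Assumes the source file is non-empty and has a header row.
        writer = csv.DictWriter(dst, fieldnames=reader.fieldnames)
        writer.writeheader()
        for row in reader:
            if row[key] in seen:
                dropped += 1
                continue
            seen.add(row[key])
            writer.writerow(row)
    return dropped
```

Writing to a second file rather than editing in place leaves the original download untouched if something goes wrong midway.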
Now you should not get duplicated tweets while resuming, please retry (you have to either clone the repo or install via pip+git, I might push the update soon)
I'll give it a shot and report back!
I edited my run.py file to match the edits you made.
Okay, I tried it and it seems to work great for Followers, Following, and Search. However, and I don't know if this is related to that change in 'run.py', now when I try to download favorites via the terminal or via the script, I get this error: CRITICAL:root:twint.feed:Mobile:list index out of range
This is while my internet is connected.
Edit: I tried to undo the changes you made to run.py, and now the favorites run fine (but of course the duplicates still occur in the others), so I assume it's related.
I made that edit for url.py along with run.py. Works great now!
CRITICAL:root:twint.get:User:[Errno 54] Connection reset by peer
    twint.run.Search(z)
  File "/Library/Frameworks/Python.framework/Versions/3.7/lib/python3.7/site-packages/twint/run.py", line 292, in Search
    run(config, callback)
  File "/Library/Frameworks/Python.framework/Versions/3.7/lib/python3.7/site-packages/twint/run.py", line 213, in run
    get_event_loop().run_until_complete(Twint(config).main(callback))
  File "/Library/Frameworks/Python.framework/Versions/3.7/lib/python3.7/asyncio/base_events.py", line 584, in run_until_complete
    return future.result()
  File "/Library/Frameworks/Python.framework/Versions/3.7/lib/python3.7/site-packages/twint/run.py", line 154, in main
    await task
  File "/Library/Frameworks/Python.framework/Versions/3.7/lib/python3.7/site-packages/twint/run.py", line 198, in run
    await self.tweets()
  File "/Library/Frameworks/Python.framework/Versions/3.7/lib/python3.7/site-packages/twint/run.py", line 137, in tweets
    await self.Feed()
  File "/Library/Frameworks/Python.framework/Versions/3.7/lib/python3.7/site-packages/twint/run.py", line 57, in Feed
    response = await get.RequestUrl(self.config, self.init, headers=[("User-Agent", self.user_agent)])
  File "/Library/Frameworks/Python.framework/Versions/3.7/lib/python3.7/site-packages/twint/get.py", line 107, in RequestUrl
    response = await Request(_url, params=params, connector=_connector, headers=headers)
  File "/Library/Frameworks/Python.framework/Versions/3.7/lib/python3.7/site-packages/twint/get.py", line 157, in Request
    return await Response(session, url, params)
  File "/Library/Frameworks/Python.framework/Versions/3.7/lib/python3.7/site-packages/twint/get.py", line 162, in Response
    async with session.get(url, ssl=False, params=params, proxy=httpproxy) as response:
  File "/Library/Frameworks/Python.framework/Versions/3.7/lib/python3.7/site-packages/aiohttp/client.py", line 1005, in __aenter__
    self._resp = await self._coro
  File "/Library/Frameworks/Python.framework/Versions/3.7/lib/python3.7/site-packages/aiohttp/client.py", line 497, in _request
    await resp.start(conn)
  File "/Library/Frameworks/Python.framework/Versions/3.7/lib/python3.7/site-packages/aiohttp/client_reqrep.py", line 844, in start
    message, payload = await self._protocol.read() # type: ignore # noqa
  File "/Library/Frameworks/Python.framework/Versions/3.7/lib/python3.7/site-packages/aiohttp/streams.py", line 588, in read
    await self._waiter
aiohttp.client_exceptions.ServerDisconnectedError: None
Description of Issue
Environment Details