Open ahmed991 opened 3 years ago
Same here. Since it's a scraper, I'm used to it not getting a lot of old tweets, but today I'm not getting any tweets from before the last seven days. When I try to use the since/until options, it only gets a few tweets from the same day. I'm wondering if Twint started collecting through the REST API, which is limited to the last seven days.
Same here. Twint is collecting fewer than 100 tweets.
I'm finding the same problem
I am also having the same issue today
+1
I can get tweets prior to Aug 22, but only 1-2 pages of results, and occasionally (~60%) it will return no tweets.
It seems that when the search query has only a few tweets, it can overcome the date limit.
I am having the same issue today :(
Same issue :(
I'm having the same problem, except not just when looking for specific dates. The number of tweets I get is inconsistent and sometimes zero. I have implemented the changes committed in #684, but that has not resolved the problem. I'm not very proficient with Python, but it seems that these changes still lead to the exception unconditionally when no data is returned. Is there a way to change this?
same issue
Same issue :(
There is a workaround, but it is limited to 20 tweets, at least for me. It works for retrieving tweets from before the 22nd of August, but you have to set a small interval for 'c.Since' and 'c.Until'.
e.g.:
c.Since = '2021-03-21'
c.Until = '2021-03-22'
Be aware that even this fails sometimes. If you set 'c.Pandas' to True, you can check whether your dataframe is empty and, if so, run the configuration again (twint.run.Search(c)).
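The workaround above (small Since/Until intervals, rerun when the result is empty) can be sketched roughly like this. This is only a sketch, not twint code: day_windows, run_with_retry and the fetch callable are hypothetical names, and fetch stands in for whatever actually runs the search for one interval.

```python
from datetime import date, timedelta
from typing import Callable, Iterator, List, Tuple


def day_windows(start: date, end: date) -> Iterator[Tuple[str, str]]:
    """Yield (since, until) strings covering start up to end, one day at a time."""
    current = start
    while current < end:
        following = current + timedelta(days=1)
        yield current.isoformat(), following.isoformat()
        current = following


def run_with_retry(fetch: Callable[[str, str], List], since: str, until: str,
                   max_attempts: int = 3) -> List:
    """Call fetch(since, until) again while it returns nothing, up to max_attempts."""
    for _ in range(max_attempts):
        rows = fetch(since, until)
        if rows:
            return rows
    return []  # every attempt came back empty
```

With twint, fetch would presumably set c.Since and c.Until, call twint.run.Search(c) with c.Pandas = True, and return the rows of the resulting dataframe (as far as I know that is twint.storage.panda.Tweets_df, but treat that name as an assumption). Capping the attempts avoids looping forever on days that genuinely have no tweets.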
OK guys. Just uncomment line 92 in the url.py file, so that it reads:
('query_source', 'typed_query'),
This solution works for PC (Linux). It does not seem to work on Raspberry Pi and I have no idea why.
Is there a way to delete previous comments? It's a bit messy.
Here again:
Just uncomment (remove the '#') line 92 in the url.py file:
('query_source', 'typed_query'),
This solution works for PC (Linux). It does not seem to work on Raspberry Pi and I have no idea why.
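For anyone who would rather not edit the file by hand, the same change can be sketched as a small text patch. This is an illustration under assumptions: uncomment_query_source is a made-up helper, and the exact spacing of the commented-out line in url.py is assumed, so the pattern is kept loose. Back up url.py before overwriting it.

```python
import re

# Matches a commented-out ('query_source', 'typed_query'), line.
# The spacing inside the real url.py is an assumption, hence the \s* everywhere.
_COMMENTED = re.compile(
    r"^(\s*)#\s*(\(\s*'query_source'\s*,\s*'typed_query'\s*\)\s*,.*)$"
)


def uncomment_query_source(source: str) -> str:
    """Return source with the leading '#' removed from the query_source line."""
    patched = []
    for line in source.splitlines(keepends=True):
        body = line.rstrip("\r\n")
        ending = line[len(body):]  # keep the original line ending
        match = _COMMENTED.match(body)
        if match:
            patched.append(match.group(1) + match.group(2) + ending)
        else:
            patched.append(line)  # leave every other line untouched
    return "".join(patched)
```

To use it, read url.py into a string, pass it through uncomment_query_source, and write the result back. If the line is already uncommented, the text comes back unchanged.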
Working for me. Thanks to @klojohn
Thanks for the solution, @klojohn, but it does not seem to be working on Windows.
I'm having the same issue, not able to scrape the data using since and until.
Is the given solution not working for anyone else too? On linux
@klojohn Great solution! Initially this is working for me on Mac OS.
Hi all, this seems to be an issue around specific dates and/or tweets, but I can't confirm, as the process stops at random points on each run.
What I do is note the date where it stopped previously and then rerun the process with --until (CLI) or c.Until (Python) set to that date.
I tested the solution provided (uncommenting line 92) in a few environments:
Hey guys, I tried uncommenting line 92 in url.py, but still no success. I tried it in Jupyter and still received only a handful of tweets, all dated 2010-12-04.
@aarorauark How did you install twint? If you used pip, type "pip3 show twint" into the command line and follow the path shown under "Location". There you'll find a folder named twint and the url.py which you have to modify inside that folder.
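As an alternative to "pip3 show twint", the location can also be looked up from Python itself. A minimal sketch using only the standard library; find_module_file is a made-up helper name, and it assumes twint is installed as a regular package directory.

```python
import importlib.util
from pathlib import Path
from typing import Optional


def find_module_file(package: str, filename: str) -> Optional[Path]:
    """Return the path of filename inside an installed package, or None."""
    spec = importlib.util.find_spec(package)
    if spec is None or not spec.submodule_search_locations:
        return None  # package not installed, or not a package directory
    for location in spec.submodule_search_locations:
        candidate = Path(location) / filename
        if candidate.is_file():
            return candidate
    return None
```

Calling find_module_file("twint", "url.py") in the same environment (or notebook kernel) that runs twint should print the file that actually gets imported, which helps when both a git checkout and a pip install are present.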
Thank you @JWLMSN for getting back to me. I installed with both git and pip as described at https://github.com/twintproject/twint and tried twint again, but faced a similar issue. Could you run this in the CLI:
twint -s "American Airlines" --since "2010-01-02" --until "2010-12-06" -o "Test_file.csv" --csv
or run the commands from my earlier post in Jupyter (the Jupyter snapshot has the commands), and let me know if you are able to fetch all the tweets for the range? I have also opened another issue, in which twint returns no more than 20 tweets, all from the same day and still not the full set for that day: https://github.com/twintproject/twint/issues/1276
@aarorauark I just tried a run with the parameters you mentioned and the query returns way more data beyond 2010-12-04, although I aborted the script because that would be a lot of data to pull for testing purposes. My last couple of responses were
10414711921180672 2010-12-02 20:27:15 +0200 <farecomparedeal> Sales for winter/spring from @VirginAmerica @AmericanAir & more. It's Airfare Deals Round-Up Time http://bit.ly/e90Ukl
10411604629790720 2010-12-02 20:14:54 +0200 <asperkourt> Asper Kourt will be flying first class on American Airlines for the next 3 months . . . k, that's not quite true,... http://fb.me/MPtFbjZ4
but my guess is it would have run all the way until the specified end date. So it's pretty safe to say your specific query is not the problem. Must be something else.
Thank you @JWLMSN for your time. What I have noticed is that Twint actually starts 2 days before the until date you specify. I collected lots of data back in March this year, in pretty big files, but somehow it is broken now. Could you please share the file? Ideally it should not take more than 10 minutes, and with a time range of just a couple of months it should take only about 5. I just want to see whether (1) you are getting more than 40-odd tweets and (2) you are able to capture most of the dates, because what I am seeing is that if you do not specify "until", then for less famous companies or less viral search strings twint only fetches data for the past 15 days.
You can simply run for a month only of any year and for any company say "Facebook or Amazon" that has large user generated content on twitter. I just want to see two points that I have mentioned.
Again highly appreciate your time on this.
I am also having the same problem. I work with Command Prompt (CMD), where I indicate my command: twint -u gofundme
but it only allows me to extract the tweets until September 15th. How can I solve that?
Hey @DavidPerea. I'm not sure of the actual solution, but what you can do (which I do) is try the initial scrape, then run the command line again with --until "2021-09-14 23:55:00".
See if that works, or try changing the time to a few hours earlier.
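The "note where it stopped, then rerun with an earlier until" approach can be automated along these lines. Again only a sketch: scrape_backwards and fetch are hypothetical, with fetch standing in for one twint run that takes an until value (or None for the first run) and returns (timestamp, text) pairs.

```python
from typing import Callable, List, Optional, Tuple

Tweet = Tuple[str, str]  # (timestamp "YYYY-MM-DD HH:MM:SS", text); a stand-in type


def scrape_backwards(fetch: Callable[[Optional[str]], List[Tweet]],
                     max_rounds: int = 10) -> List[Tweet]:
    """Rerun fetch with until set to the oldest timestamp collected so far."""
    collected: List[Tweet] = []
    seen = set()
    until: Optional[str] = None
    for _ in range(max_rounds):
        batch = [tweet for tweet in fetch(until) if tweet not in seen]
        if not batch:
            break  # nothing new came back, so stop instead of looping forever
        collected.extend(batch)
        seen.update(batch)
        # Timestamps in this format sort chronologically as plain strings.
        until = min(timestamp for timestamp, _ in collected)
    return collected
```

The seen set drops tweets that show up again when consecutive runs overlap, and max_rounds keeps a run that always returns something from going on indefinitely.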
Hi @Slyth3 I have tried testing various dates but when I try to extract tweets after mid-September it tells me the following:
[!] No more data! Scraping will stop now. found 0 deleted tweets in this search.
This appears even though there are more, earlier tweets. Why does this happen?
Yep. Having the same issue.
Working for me. Thanks very much to @klojohn
@klojohn's solution works for me on Mac, thank you!
This worked for me on Windows! Thanks!
I went to try @klojohn 's solution, but that line had already been uncommented in my version of Twint. And I'm still experiencing an issue. I'm on Linux. Did anyone else see that in their version it was already uncommented?
I work with Command Prompt (CMD), where I indicate, for example, my command: twint -u gofundme
How can I apply the solution you indicate?
Not working for me, I am using windows
Your solution worked for me as well.
I am running twint version 2.1.21 on Python 3.9.7, which is the latest version available via pip.
Now I am wondering: is there a planned fix for this in the main release? I guess nothing has happened with this issue yet, since twint hasn't been updated on GitHub in a while.
Is there an actively maintained fork of twint somewhere (which preferably includes this fix)? If twint is no longer actively maintained, is there any alternative software we should be aware of?
FYI: I'm running these instructions:
import twint
username = "username"
c = twint.Config()
# Represented command: twint -u USERNAME --images -o USERNAME.csv --csv
c.Username = username
c.Images = True
c.Store_csv = True
c.Output = "%s.csv" % username
twint.run.Search(c)
For an equivalent project, try snscrape:
https://github.com/JustAnotherArchivist/snscrape
Still having the same issue on Linux even after trying this solution... In my case, twint now only returns ~90 tweets about "apple" and "$aapl" for one date...
Where do I find/open this url.py file, @klojohn? I'm working in Google Colab and used this for installation:
!git clone --depth=1 https://github.com/twintproject/twint.git
!cd /content/twint && pip3 install . -r requirements.txt
!pip3 uninstall aiohttp
!pip3 install aiohttp==3.7.0
import twint
import nest_asyncio
nest_asyncio.apply()
Thanks a lot solved for me.
Hey guys, I tried uncommenting line 92 in url.py, but still no success. I tried it in Jupyter and still received only a handful of tweets, all dated 2010-12-04.
Hello, like you, I want to collect tweets with certain hashtags in Jupyter Notebook, but when I run the same commands there I get an error. Did you use Anaconda version 3.6? I wonder if that's why mine doesn't work. I would be glad if you could give some information.
This solution worked for me, and I'm using Python IDLE on Windows. Thanks @klojohn!
Fix not working for me, py 3.9.7 on mac
This worked for me, thank you. Running on windows, installed with pip
@klojohn Sir, I managed to receive tweets with code similar to what you described, but it only gives data for one week. I think the url.py file has been changed; it wasn't exactly what you said. To be removed
@DavidPerea How would it be for Windows? Have you got it? I've been trying things for months, uninstalling and installing and I don't know what else to do.
My twint version is 2.1.21. It works fine for me on Windows after using the fix posted by @klojohn. Shows all/most tweets that I wanted to see.
Now it works great with the solution you have indicated. It is wonderful!
Issue Template
Please use this template!
Initial Check
No similar issue found
pip3 install --user --upgrade -e git+https://github.com/twintproject/twint.git@origin/master#egg=twint
Command Ran
import twint
import nest_asyncio
nest_asyncio.apply()
config = twint.Config()
config.Search = "#gis"
config.Limit = 10000
config.Hide_output = True
config.Until = '2016-12-07'
config.Since = '2021-08-01'
config.Store_object = True
twint.run.Search(config)
# now you will have some tweets
tweets_as_objects = twint.output.tweets_list
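One thing worth noting about the command as posted: Since ('2021-08-01') is later than Until ('2016-12-07'). Whether that is a typo in the report or part of the problem isn't clear from the thread, but a small check before running the search would catch it. A minimal sketch; check_date_range is a made-up helper, not part of twint, and it assumes plain YYYY-MM-DD strings.

```python
from datetime import date


def check_date_range(since: str, until: str) -> None:
    """Raise ValueError if since (YYYY-MM-DD) is later than until."""
    if date.fromisoformat(since) > date.fromisoformat(until):
        raise ValueError(f"Since ({since}) is later than Until ({until})")
```

Calling check_date_range(config.Since, config.Until) just before twint.run.Search(config) would raise for the values above and pass once the two dates are swapped.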
Description of Issue
Environment Details