Open seonake opened 4 years ago
Would fix this today.
Many thanks.
Could you please share the exact script that you ran, so I can replicate this issue?
Now I got this: RefreshTokenException: Could not find the Guest token in HTML with twint.run.Lookup(c) and twint.run.Following(c)
/content/src/twint/twint/token.py in refresh(self)
     66         else:
     67             self.config.Guest_token = None
---> 68             raise RefreshTokenException('Could not find the Guest token in HTML')
RefreshTokenException: Could not find the Guest token in HTML
Are you running this script on the same machine you were previously running it on? Are you running it on Anaconda or a system-wide Python installation? Also, please go through thread #957 first.
Yes, same script. Google Colab.
!pip install nest_asyncio
!pip install --user --upgrade -e git+https://github.com/twintproject/twint.git@master#egg=twint

import twint
import nest_asyncio
import pandas as pd

nest_asyncio.apply()
directorio = '/content/drive/My Drive/TFM/Cuentas/'
DataIn = pd.read_csv(directorio + 'cuentas_inicio.csv')

for fila in DataIn.itertuples():
    config = twint.Config()
    config.Username = fila[2]
    config.Pandas = True
    config.Store_pandas = True
    config.Hide_output = True
    twint.run.Following(config)
Again, same script... back to the first error: CRITICAL:root:twint.get:User:'profile_banner_url'
But I have new information: since I am taking users from a CSV file, it works fine with some users, but I get this error with, for example, user = "Kronprinsparet". Maybe that helps.
I have fixed the profile_banner_url error in my branch. My PR for that patch hasn't been merged yet; meanwhile, use the command below to install my fix from my branch:
pip3 install --user --upgrade git+https://github.com/himanshudabas/twint.git@origin/fix-parser#egg=twint
As for the Guest_token error, it happens when Twitter blacklists your IP address for making too many requests within a short period of time.
You can confirm this by waiting 15 minutes when you get this error and then running your script again. After that break, your script should work again.
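The wait-and-retry advice above could be automated along these lines. This is only a sketch: `run_with_cooldown` and its parameters are hypothetical names and are not part of twint, and the 15-minute default simply mirrors the break suggested in the comment.

```python
import time


def run_with_cooldown(task, retry_on=Exception, cooldown_seconds=15 * 60,
                      max_attempts=3, sleep=time.sleep):
    """Call task(); if it raises retry_on, wait and try again.

    Hypothetical helper, not a twint API. `sleep` is injectable so the
    waiting can be replaced in tests.
    """
    for attempt in range(1, max_attempts + 1):
        try:
            return task()
        except retry_on:
            if attempt == max_attempts:
                raise  # give up after the final attempt
            sleep(cooldown_seconds)  # let the rate limit cool down
```

With twint this might be called as `run_with_cooldown(lambda: twint.run.Following(config), retry_on=RefreshTokenException)`, assuming `RefreshTokenException` can be imported from `twint.token`, the module the traceback shows raising it.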
Many thanks!
Still not working, may I help?
Thanks for your effort! But I have a problem: what if I need to crawl the info of 50k Twitter users, which means sending a lot of requests in a short time? Is there any way to overcome this?
@CharleoY The only way I can think of is to use a proxy list: when you receive this exception, simply rotate your proxy.
I don't know if proxies are working right now in the current implementation of twint.
But this is one of the ways to go.
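The proxy rotation idea could look roughly like the sketch below. `run_with_proxy_rotation` is a hypothetical helper, not part of twint, and how a proxy is actually applied to a twint config is deliberately left to the `task` callable, since it is unclear in this thread whether twint's proxy support currently works.

```python
from itertools import cycle


def run_with_proxy_rotation(task, proxies, retry_on=Exception, max_switches=5):
    """Call task(proxy); on retry_on, switch to the next proxy and retry.

    Hypothetical helper, not a twint API. `proxies` is any non-empty list;
    it is cycled, so a short list can be reused up to max_switches times.
    """
    pool = cycle(proxies)
    proxy = next(pool)
    for _ in range(max_switches + 1):
        try:
            return task(proxy)
        except retry_on:
            proxy = next(pool)  # rotate to the next proxy in the list
    raise RuntimeError("gave up after exhausting the allowed proxy switches")
```

Here `retry_on` would be the exception that signals a blocked IP, and `task` would be a function that configures the given proxy and runs the scrape.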
Moreover, for looking up user data, I am planning to add a feature soon that would allow you to scrape data for around 100 users in a single API request. So you could get the details of 50,000 users in merely 500 requests, compared to the 50,000 requests you need to make right now.
So your IP won't be blacklisted.
It'd take some time to implement though.
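The request count mentioned above is just batching arithmetic. A minimal sketch, assuming a batch size of 100 as described in the comment (the `batches` helper is hypothetical and does not call twint; the planned feature did not exist at the time of this thread):

```python
def batches(items, size=100):
    """Yield consecutive slices of `items`, each holding at most `size` entries."""
    for start in range(0, len(items), size):
        yield items[start:start + size]


# Placeholder IDs standing in for 50,000 real user IDs.
user_ids = list(range(50_000))
requests_needed = sum(1 for _ in batches(user_ids, 100))
print(requests_needed)  # 500
```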
Is there a way to retrieve the video URLs in the Tweets?
@gautampal1947 Videos on Twitter don't have a URL.
It seems the video URL can be extracted from the embedded video in the iframe: https://steemit.com/technology/@singhpratyush/fetching-url-for-embedded-twitter-videos
Hello, do you support multiple users with one request now?
EDIT: no problem after all, I misunderstood
@agombert Can you elaborate a little on what you are trying to do here? This error occurs when you don't provide a Username or User_id before calling Lookup.
I am new to twint, so I have no idea what Members_list does.
Also, it'd be nice if you could explain how 'manhack/OSINT' works.
Moreover, when the older Twitter endpoints were deprecated (which broke the library), a lot of code in twint was changed to fix it, and due to the lack of proper documentation I wasn't able to grasp how things worked before the library broke.
That's the reason a lot of stuff is in limbo right now.
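The missing Username or User_id case described in this comment can be checked before calling Lookup. A minimal sketch: `has_lookup_target` is a hypothetical helper, not part of twint, and it only assumes the config object exposes the `Username` and `User_id` attributes named above.

```python
from types import SimpleNamespace


def has_lookup_target(config):
    """Return True if the config names a user by Username or User_id.

    Hypothetical pre-flight check, not a twint API.
    """
    return bool(getattr(config, "Username", None) or getattr(config, "User_id", None))


# SimpleNamespace stands in for a twint.Config object here.
empty = SimpleNamespace(Username=None, User_id=None)
named = SimpleNamespace(Username="Kronprinsparet", User_id=None)
print(has_lookup_target(empty), has_lookup_target(named))  # False True
```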
My bad @himanshudabas I mixed two different things:
I edited my comment above; your branch works perfectly!
@CharleoY I ran into this problem today and your branch solved it! Thank you for putting in the time for a fix, much appreciated 😃
And as a side note, hopefully this gets merged sooner rather than later. I'm eager for Twint's next release as there are quite a few good PRs to be merged.
Hi, it seems there are still some problems with the Lookup function. Am I doing something wrong?
Command Ran
twint.run.Lookup(c)
Description of Issue
I got this:
CRITICAL:root:twint.feed:Follow:IndexError
CRITICAL:root:twint.feed:Follow:IndexError
CRITICAL:root:twint.get:User:'profile_banner_url'
ERROR:root:twint.run:Twint:Lookup:Unexpected exception occurred.
Traceback (most recent call last):
  File "/content/src/twint/twint/run.py", line 307, in Lookup
    await get.User(self.config.Username, self.config, db.Conn(self.config.Database))
  File "/content/src/twint/twint/get.py", line 228, in User
    await Users(j_r, config, conn)
  File "/content/src/twint/twint/output.py", line 177, in Users
    user = User(u)
  File "/content/src/twint/twint/user.py", line 49, in User
    _usr.background_image = ur['data']['user']['legacy']['profile_banner_url']
KeyError: 'profile_banner_url'

/content/src/twint/twint/user.py in User(ur)
     47     _usr.is_verified = ur['data']['user']['legacy']['verified']
     48     _usr.avatar = ur['data']['user']['legacy']['profile_image_url_https']
---> 49     _usr.background_image = ur['data']['user']['legacy']['profile_banner_url']
     50     # TODO : future implementation
     51     # legacy_extended_profile is also available in some cases which can be used to get DOB of user
KeyError: 'profile_banner_url'
Environment Details
Mac / Google colab
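The KeyError in the traceback above comes from indexing a key that is absent from the response for some users. One common way to tolerate that is `dict.get` with a default. The sketch below illustrates the idea only: `banner_url` is a hypothetical helper, and this is not necessarily how the fix-parser branch handles it.

```python
def banner_url(user_response, default=None):
    """Return the profile banner URL, or `default` when the key is absent.

    Hypothetical helper mirroring the dictionary path in the traceback.
    """
    legacy = user_response["data"]["user"]["legacy"]
    return legacy.get("profile_banner_url", default)


# Minimal made-up responses: one with a banner, one without.
with_banner = {"data": {"user": {"legacy": {"profile_banner_url": "https://example.com/b.jpg"}}}}
without_banner = {"data": {"user": {"legacy": {"verified": False}}}}
print(banner_url(with_banner))     # https://example.com/b.jpg
print(banner_url(without_banner))  # None
```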