twintproject / twint

An advanced Twitter scraping & OSINT tool written in Python that doesn't use Twitter's API, allowing you to scrape a user's followers, following, Tweets and more while evading most API limitations.
MIT License
15.75k stars 2.72k forks

Error: twint.token.RefreshTokenException: Could not find the Guest token in HTML #1061

Open weslleylira opened 3 years ago

weslleylira commented 3 years ago

Issue Template

Error: twint.token.RefreshTokenException: Could not find the Guest token in HTML

Description of Issue

When running Twint I get the following error:

Traceback (most recent call last):
  File "twitter.py", line 18, in <module>
    twint.run.Search(c)
  File "/home/ssm-user/.local/lib/python3.8/site-packages/twint/run.py", line 410, in Search
    run(config, callback)
  File "/home/ssm-user/.local/lib/python3.8/site-packages/twint/run.py", line 329, in run
    get_event_loop().run_until_complete(Twint(config).main(callback))
  File "/home/ssm-user/.local/lib/python3.8/site-packages/twint/run.py", line 36, in __init__
    self.token.refresh()
  File "/home/ssm-user/.local/lib/python3.8/site-packages/twint/token.py", line 68, in refresh
    raise RefreshTokenException('Could not find the Guest token in HTML')
twint.token.RefreshTokenException: Could not find the Guest token in HTML
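For anyone wondering what the exception means: judging from the traceback, token.refresh() fetches a Twitter page and looks for a guest token in the returned HTML, raising when nothing matches. A minimal sketch of that idea follows. The `gt=<digits>;` pattern, the helper name `extract_guest_token`, and the sample HTML are assumptions for illustration, not twint's actual code.

```python
import re

# Assumed pattern: the page embeds the guest token as a cookie
# assignment such as "gt=1234567890;". This is a guess, not twint's regex.
GUEST_TOKEN_RE = re.compile(r'gt=(\d+);')


class RefreshTokenException(Exception):
    """Local stand-in for twint.token.RefreshTokenException."""


def extract_guest_token(html: str) -> str:
    """Return the guest token found in html, or raise if none is present."""
    match = GUEST_TOKEN_RE.search(html)
    if match is None:
        raise RefreshTokenException('Could not find the Guest token in HTML')
    return match.group(1)


# Made-up page fragment containing a token-like cookie assignment.
sample = ('<script>document.cookie = decodeURIComponent('
          '"gt=1338000000000000000; Max-Age=10800");</script>')
print(extract_guest_token(sample))
```

If Twitter changes the page so the pattern no longer appears, or serves a different page to some clients, a lookup like this fails in exactly the way reported here.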

Environment Details

Using Ubuntu Server

himanshudabas commented 3 years ago

this seems to me a duplicate of #957

aburak256 commented 3 years ago

I have a similar issue. I am trying to collect some data with twint for time series analysis. To do that, I send searches hour by hour, starting from January 2014. After 1000 searches I get the same error.

I am using Colab. I updated Twint with pip3 install --user --upgrade -e git+https://github.com/twintproject/twint.git@origin/master#egg=twint
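The hour-by-hour approach described above could be sketched like this. It only builds the time windows; the commented-out lines mark where a twint search would go. `hourly_windows` is a hypothetical helper, and I am assuming that c.Since and c.Until accept "YYYY-MM-DD HH:MM:SS" strings.

```python
from datetime import datetime, timedelta
from typing import Iterator, Tuple

FMT = "%Y-%m-%d %H:%M:%S"


def hourly_windows(start: datetime, end: datetime) -> Iterator[Tuple[str, str]]:
    """Yield (since, until) strings covering [start, end) in one-hour steps."""
    step = timedelta(hours=1)
    current = start
    while current < end:
        nxt = min(current + step, end)  # clip the last window to `end`
        yield current.strftime(FMT), nxt.strftime(FMT)
        current = nxt


windows = list(hourly_windows(datetime(2014, 1, 1), datetime(2014, 1, 1, 3)))
for since, until in windows:
    # Assumed usage (not executed here):
    #   c = twint.Config(); c.Search = "your query"
    #   c.Since = since; c.Until = until
    #   twint.run.Search(c)
    print(since, "->", until)
```

Pausing between windows may help if the failures are rate related, though nothing in this thread confirms that.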

BonfaceKilz commented 3 years ago

Getting the same error. Has anyone been able to come up with a workaround?

orestislampridis commented 3 years ago

Is there a general problem with Twint currently? I seem to get the same error.

theshouryagupta commented 3 years ago

I am having the same issue right now. It was working fine 2 hours ago.

BonfaceKilz commented 3 years ago

I speculate there's yet another internal change within Twitter that we aren't aware of...

theshouryagupta commented 3 years ago

Yeah @BonfaceKilz, that could be a possibility. Which OS were you using? I encountered this on Ubuntu 20.04.01 LTS.

BonfaceKilz commented 3 years ago

@theshouryagupta I'm running ArchLinux; though I've been running twint in a Guix[0] container

[0] https://github.com/pjotrp/guix-notes/blob/master/CONTAINERS.org

orestislampridis commented 3 years ago

I am using Windows 10. I don't think this is an OS-related problem.

data-z commented 3 years ago

I am getting the same error. It was working fine last night. Then I tried to clone the branch from the comment above (himanshudabas: "this seems to me a duplicate of #957") and am getting a different error.

theshouryagupta commented 3 years ago

@orestislampridis yes you are right. I tried it on macOS too. Same issue...

data-z commented 3 years ago

how do we install the new commits?

sukioral commented 3 years ago

Getting same problem again

orestislampridis commented 3 years ago

@data-z "pip uninstall twint" and then "pip install git+git://github.com/ajctrl/twint@patch-1". It works for me now!

sukioral commented 3 years ago

RefreshTokenException: Could not find the Guest token in HTML

sukioral commented 3 years ago

@data-z "pip uninstall twint" and then "pip install git+git://github.com/ajctrl/twint@patch-1". It works for me now!

Works for me! Thanks!

data-z commented 3 years ago

@sukioral Thanks works for me too.

xiaozhouliu commented 3 years ago

@orestislampridis Thanks, it works!

irisdemented commented 3 years ago

@data-z "pip uninstall twint" and then "pip install git+git://github.com/ajctrl/twint@patch-1". It works for me now!

This worked for me too! Thank you 👍

WjcSoso0928 commented 3 years ago

@data-z "pip uninstall twint" and then "pip install git+git://github.com/ajctrl/twint@patch-1". It works for me now!

I used this method, but I get a new error:

aiohttp.client_exceptions.ClientConnectorError: Cannot connect to host api.twitter.com:443 ssl:True [Network is unreachable]

I don't know how to fix it

patrickh217 commented 3 years ago

I tried installing it with pip install git+git://github.com/ajctrl/twint@patch-1 but it did not work at first. Then I simply added the line of code that @ajctrl changed in his pull request directly to the script. After this little change, it is working for me.

Many thanks

data-z commented 3 years ago

That's what I did as well.


Vickycats commented 3 years ago

Hi. Please could someone give a command-line instruction for installing the current version with the patch, since pip install git+git://github.com/ajctrl/twint@patch-1 no longer works? I have manually added the new version of token.py from git, without doing a full Twint reinstall, but I am getting inconsistent results. For example, with the code below, the output works fine if the Username is 'realdonaldtrump' or 'kamalaharris', but I get errors for other (existing) users. If Username is 'sainsburys' (a large superstore chain in the UK), or most other names, I get a KeyError on the url:

    _usr.url = ur['data']['user']['legacy']['url']
KeyError: 'url'

Less often it fails with 'cannot find the Guest token'.

Code used:

import twint

c = twint.Config()
c.Username = 'realdonaldtrump'  # works for some, not for others. Most not working.
c.Store_object = True
c.Store_object_users_list = []
c.User_full = True
twint.run.Lookup(c)
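The KeyError: 'url' reported above suggests that some profiles simply have no url field in the legacy block of the response. A defensive lookup along these lines would avoid the crash. The payloads below are made up to mirror the keys in the traceback, and `profile_url` is a hypothetical helper, not part of twint.

```python
from typing import Any, Dict, Optional


def profile_url(ur: Dict[str, Any]) -> Optional[str]:
    """Return the profile URL if present, else None instead of raising KeyError."""
    legacy = ur.get('data', {}).get('user', {}).get('legacy', {})
    return legacy.get('url')


# Hypothetical response shapes, one with and one without a 'url' field.
with_url = {'data': {'user': {'legacy': {'url': 'https://t.co/example'}}}}
without_url = {'data': {'user': {'legacy': {'name': 'Example'}}}}

print(profile_url(with_url))
print(profile_url(without_url))
```

Whether a missing url should become None or an empty string depends on how the rest of the code consumes the user object.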

rutvikprajapati commented 3 years ago

@Vickycats I manually added the new version of Token.py and now got the below error

CRITICAL:root:twint.run:Twint:Feed:noDataExpecting value: line 1 column 1

subasish commented 3 years ago

Finding similar error code after updating token.py

hockeybro12 commented 3 years ago

@rutvikprajapati I get the same error as you.

rutvikprajapati commented 3 years ago

I found this article on the snscrape library, and it is working fine.

BonfaceKilz commented 3 years ago

rutvikprajapati writes:

I found this article on snscrape library and it is working fine.

FWIW, at one point IIRC, snscrape faced this similar issue; but in a different flavour: https://github.com/JustAnotherArchivist/snscrape/issues/110


mpucci92 commented 3 years ago

After updating the token.py script with:

import requests

class Token:
    def __init__(self, config):
        self._session = requests.Session()
        # Present a regular desktop browser User-Agent to Twitter
        self._session.headers.update({'User-Agent': 'Mozilla/5.0 (Windows NT 10.0; Win64; x64; rv:78.0) Gecko/20100101 Firefox/78.0'})
        self.config = config
        self._retries = 5
        self._timeout = 10

It works now, Thanks!

agombert commented 3 years ago

Same problem on local, even after updating with the patch mentioned above. Anyone got a solution?

olof98johansson commented 3 years ago

Same problem on local, even after updating with the patch mentioned above. Anyone got a solution?

For me, the above update didn't work either, but uninstalling twint and then install via

pip install --user --upgrade git+https://github.com/twintproject/twint.git@origin/master#egg=twint

worked now!

KazimTibetSar commented 3 years ago

Same problem on local, even after updating with the patch mentioned above. Anyone got a solution?

For me, the above update didn't work either, but uninstalling twint and then install via

pip install --user --upgrade git+https://github.com/twintproject/twint.git@origin/master#egg=twint

worked now!

On Python 3.9.0 I ran pip install --user --upgrade git+https://github.com/twintproject/twint.git@origin/master#egg=twint but it didn't work.

olof98johansson commented 3 years ago

Same problem on local, even after updating with the patch mentioned above. Anyone got a solution?

For me, the above update didn't work either, but uninstalling twint and then install via pip install --user --upgrade git+https://github.com/twintproject/twint.git@origin/master#egg=twint worked now!

On Python 3.9.0 I ran pip install --user --upgrade git+https://github.com/twintproject/twint.git@origin/master#egg=twint but it didn't work.

Yeah, OK :( I use Python 3.7.4.

mengyao-liu commented 3 years ago

I use Python 3.9 on a Mac and I solved this problem with the following two steps, though maybe only one of them was needed:

  1. In twint/token.py, change
     self._session.headers.update({'User-Agent': 'Mozilla/5.0 (Windows NT 10.0; Win64; x64; rv:78.0) Gecko/20100101 Firefox/78.0'})
     to
     self._session.headers.update({'User-Agent': 'Mozilla/5.0 (Macintosh; Intel Mac OS X 11.1; rv:85.0) Gecko/20100101 Firefox/85.0'})
  2. Uninstall and reinstall twint.

It works for me!

DenisOgr commented 3 years ago

Hi all. I got this error only when running my code in a Google Cloud Function. On my Mac, it works well. I fixed it in a similar way to what is described above, by changing self._session.headers. But I changed it to a Linux browser, not Mac or Windows. I changed:

self.token._session.headers.update({'User-Agent': 'Mozilla/5.0 (X11; Linux ppc64le; rv:75.0) Gecko/20100101 Firefox/75.0'})

I guess the Twitter platform checks the client operating system and if it differs from session.headers, it raises an error. My code works well without twint.token.RefreshTokenException on Linux (on GCP) and Mac platforms.
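To illustrate the User-Agent change discussed in the last few comments: the fix amounts to sending a different User-Agent header with the request that fetches the token page. twint does this on a requests.Session; the sketch below swaps in the standard library's urllib.request so it runs without extra installs and without touching the network. `build_request` is a hypothetical helper, and the header string is the one quoted above.

```python
import urllib.request

LINUX_FIREFOX_UA = ('Mozilla/5.0 (X11; Linux ppc64le; rv:75.0) '
                    'Gecko/20100101 Firefox/75.0')


def build_request(url: str, user_agent: str) -> urllib.request.Request:
    """Build (but do not send) a request carrying the given User-Agent."""
    return urllib.request.Request(url, headers={'User-Agent': user_agent})


req = build_request('https://twitter.com', LINUX_FIREFOX_UA)
# urllib stores header names capitalized as 'User-agent'.
print(req.get_header('User-agent'))
```

Printing the header before sending anything is a quick way to confirm which User-Agent is really in effect, which matters if an older copy of token.py is still the one being imported.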

konnextv commented 3 years ago

Heads up: @DenisOgr 's method works for me inside a Google Cloud Function, but only using python37 as the environment.

EDIT: nevermind, doesn't work with 3.7 either, still getting "Could not find the Guest token in HTML"

HAKANMAZI commented 3 years ago

still same error

raise RefreshTokenException('Could not find the Guest token in HTML')
2021-04-18T21:50:03.559085+00:00 app[web.1]: twint.token.RefreshTokenException: Could not find the Guest token in HTML
2021-04-18T21:50:03.562041+00:00 app[web.1]: 10.47.235.178 - - [18/Apr/2021:21:50:03 +0000] "POST / HTTP/1.1" 500 290 "https://snscrape.herokuapp.com/" "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/89.0.4389.128 Safari/537.36"

himanshudabas commented 3 years ago

Hi, please check this comment. You can also go through the thread of that comment to get a better understanding of why and when this Guest token issue occurs. I have also mentioned a workaround in the comment.

prhbrt commented 3 years ago

@data-z "pip uninstall twint" and then "pip install git+git://github.com/ajctrl/twint@patch-1". It works for me now!

This patch is pulled into twint now. https://github.com/twintproject/twint/pull/1075

Brata11 commented 2 years ago

This is the error I am getting using Jupyter Notebook

Collecting git+git://github.com/ajctrl/twint@patch-1
  Cloning git://github.com/ajctrl/twint (to revision patch-1) to /tmp/pip-req-build-7ew6pbd5
  Running command git clone -q git://github.com/ajctrl/twint /tmp/pip-req-build-7ew6pbd5
  WARNING: Did not find branch or tag 'patch-1', assuming revision or ref.
  Running command git checkout -q patch-1
  error: pathspec 'patch-1' did not match any file(s) known to git.
WARNING: Discarding git+git://github.com/ajctrl/twint@patch-1. Command errored out with exit status 1: git checkout -q patch-1 Check the logs for full command output.
ERROR: Command errored out with exit status 1: git checkout -q patch-1 Check the logs for full command output.

magikarp171 commented 2 years ago

I am getting the same error here

febrizky commented 2 years ago

Hi, I'm still getting this error: RefreshTokenException: Could not find the Guest token in HTML. I tried changing self._session.headers as described here, but it still doesn't work. Does anybody know which user agent string I should use for Windows 10 x64 and Google Chrome version 96.0.4664.110? Thank you.

philliplumod commented 2 years ago

Same problem on local, even after updating with the patch mentioned above. Anyone got a solution?

For me, the above update didn't work either, but uninstalling twint and then install via pip install --user --upgrade git+https://github.com/twintproject/twint.git@origin/master#egg=twint worked now!

On Python 3.9.0 I ran pip install --user --upgrade git+https://github.com/twintproject/twint.git@origin/master#egg=twint but it didn't work.

Still not fixed for me on Python 3.8.

M4573RN04H commented 2 years ago

Has anyone found a workaround yet? It randomly works 50% of the time 10% of the time. The other 90% it does not work 100% of the time.

moxak commented 2 years ago

I modified token.py and it worked without error. please try it below.

  1. download token.py via fixed token.py
  2. open directory of twint lib
  3. move new one to twint lib dir
  4. replace it

[postscript] Since it appeared that getting the guest token using html was failing, I tried to get the guest token using another method.

Even if it works, there may be a slight change in behavior of twint.

kk0walski commented 2 years ago

  1. fixed token.py

It worked for me.

moxak commented 2 years ago

added lines for catching RefreshTokenException
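For reference, catching the exception and retrying after a pause can be sketched generically as below. `retry_on` and `flaky_refresh` are hypothetical names, and the local RefreshTokenException class stands in for twint.token.RefreshTokenException so the snippet runs without twint installed. This is not the code in the fixed token.py.

```python
import time
from typing import Callable, Type, TypeVar

T = TypeVar('T')


class RefreshTokenException(Exception):
    """Stand-in for twint.token.RefreshTokenException."""


def retry_on(exc_type: Type[Exception], func: Callable[[], T],
             attempts: int = 3, delay: float = 0.0) -> T:
    """Call func until it succeeds, trying at most `attempts` times on exc_type."""
    for attempt in range(1, attempts + 1):
        try:
            return func()
        except exc_type:
            if attempt == attempts:
                raise  # out of attempts: let the caller see the error
            time.sleep(delay * attempt)  # linear backoff between tries
    raise ValueError('attempts must be at least 1')


calls = {'n': 0}


def flaky_refresh() -> str:
    """Fail twice, then succeed, to imitate an intermittent token failure."""
    calls['n'] += 1
    if calls['n'] < 3:
        raise RefreshTokenException('Could not find the Guest token in HTML')
    return 'guest-token'


print(retry_on(RefreshTokenException, flaky_refresh, attempts=5))
```

With twint itself, the same wrapper could presumably be used as retry_on(RefreshTokenException, lambda: twint.run.Search(c), delay=30) after importing RefreshTokenException from twint.token. Retrying only helps when the failure is intermittent.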

M4573RN04H commented 2 years ago

If anyone still has an issue there is a whole thread with multiple solutions here #1320

ljhOfGithub commented 2 years ago

What's the use of this exception? Can I comment it out? '# raise RefreshTokenException('Could not find the Guest token in HTML')'

Salmanshabbir commented 1 year ago

RefreshTokenException: Could not find the Guest token in HTML. Can anyone tell me how to solve this error? If someone knows, it would be very helpful.