Closed ExtremeSRL closed 1 year ago
Actually the error returned by twitter is : {"errors":[{"message":"Bad Authentication data","code":215}]}
Related to medialab/minet#682
Don't think it is related to the latest tab as it's not something new. Snscrape was already aware of that.
Actually the error returned by twitter is : {"errors":[{"message":"Bad Authentication data","code":215}]}
No, it isn't. Why do people keep opening the API URLs in browsers and expecting it to work despite lacking the relevant authentication headers?
Yes, looks like Twitter removed the 'latest' search again. Unless they reverse that, it's unlikely that there is a fix for this. Cf. #634 for the previous occurrence of this a couple months ago.
For the record, the error returned by Twitter is:
{"errors":[{"code":32,"message":"Could not authenticate you."}]}
Same error in TwitterUserScraper. Not only search. Moreover, there was a short period of time when it worked, but after about 10 minutes it broke again
It is only the latest search, but twitter-user, twitter-hashtag, and a few more scrapers are simple wrappers around the search, so yes, they're also affected.
I noticed that it works, but it is very unstable. It gives an error (non-200 (401)) in 2 out of 3 requests, but works fine on the third.
It's working again =)
Haven't seen any further interruptions, but I'll keep the issue open and pinned for now in case it returns.
No more issues since Saturday. :-)
Twitter is changing its business plans these days, and I guess the API along with them. Let's stay tuned, because I'm afraid there will be more problems. In the meantime, thanks as always for your great work!
The problem just came back! 401 on search by query
I have issues too, but for me it alternates between working and not working (as if my internet connection was unstable, but that is not the case).
I only have this problem with twitter-user. twitter-search runs fine (if I don't use any parameter like from:USERNAME)
@rmnhg No, it happens with both. twitter-user is a very thin wrapper around twitter-search anyway; wouldn't make any sense if they didn't behave the same (unless they were restricting specifically from:X queries, which isn't the case). You probably just got lucky on your twitter-search runs and unlucky on the twitter-user ones.
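To illustrate the "thin wrapper" point, here is a minimal sketch using hypothetical stand-in classes (these are not snscrape's actual classes). The user scraper only builds a `from:USERNAME` query and hands it to the search scraper, so any failure in search would show up in both.

```python
class SearchScraper:
    """Hypothetical stand-in for a search scraper; not snscrape's real class."""

    def __init__(self, query):
        self.query = query

    def get_items(self):
        # A real scraper would call the search endpoint here; this stub
        # just echoes the query so the example runs without network access.
        yield f"results for {self.query!r}"


class UserScraper(SearchScraper):
    """Hypothetical wrapper: a user scrape is a search for `from:USERNAME`."""

    def __init__(self, username):
        super().__init__(f"from:{username}")


scraper = UserScraper("example_user")
items = list(scraper.get_items())
```

Because `UserScraper` adds nothing beyond the query string, it inherits every behaviour of the search path, including intermittent errors.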
It works, but it is very unstable. Apparently Twitter is doing some work on its servers again.
What I see is that unauthenticated searches fail even in the browser. "Your account may not be allowed to perform this action. Please refresh the page and try again."
I set a delay of 1 minute between each tweet and it works.
Issue seems to persist for me: only every other request returns data. As @kooperalan said, adding a delay seems to work. For me, adding a 10-second delay has completely removed the problem.
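The delay workaround could be sketched roughly as below. This is only an illustration: `fetch_with_retry` and `flaky` are hypothetical names, the 10-second default just mirrors the delay mentioned above, and others in this thread report that delays did not help them.

```python
import time


def fetch_with_retry(fetch, attempts=3, delay=10.0, sleep=time.sleep):
    """Call fetch() until it succeeds, pausing `delay` seconds between tries.

    `fetch` is any zero-argument callable that raises on failure
    (hypothetical; replace it with your own scraping call).
    """
    last_error = None
    for attempt in range(attempts):
        try:
            return fetch()
        except Exception as error:  # real code should catch the scraper's specific error
            last_error = error
            if attempt < attempts - 1:
                sleep(delay)
    raise last_error


# Demo with a fake fetch that fails twice and then succeeds.
calls = []


def flaky():
    calls.append(1)
    if len(calls) < 3:
        raise RuntimeError("non-200 (401)")
    return "ok"


result = fetch_with_retry(flaky, attempts=3, delay=0)
```

Since the failures look random, a fixed delay may simply give the request another roll of the dice rather than address the cause.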
I can't reproduce it anymore since a few minutes ago. All my test searches seem to succeed now.
Edit: Nevermind, still happens.
It looks like the problem will persist until the work on the servers is completed
p.s. delay doesn't work for me (=
For how long? I need the data for my thesis.
What are you referring to by "work on the servers"? Was there any communication about it? Seems unlikely since it's a private API.
Hope it ends well. AntoinePaix, please do something to save us ! 🤣
It works again for me ! 🚀
EDIT: My bad, still facing it....
EDIT2 : works again, it seems it's completely random.
For me too (through a lambda function in eu-west-3)
I have the same error with my own scraper and a completely different implementation (I use http). The 'top' tab works well but not the 'latest' tab.
But last night the advanced search was working fine with 'latest'...
What's the advantage of using an http implementation?
Ooops, I meant httpx. It's a python client with nice features such as request/response hooks, http2 and async capabilities.
Oh, async for multithreading? Response hooks for tweet responses?
@Hesko123 async like if you want to run multiple scrapers inside one thread.
Twitter's problem with the 'latest' search is really episodic. I just ran my personal scraper several times, the first 2 failed but the third passed without issue.
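As a rough illustration of "multiple scrapers inside one thread", here is a minimal asyncio sketch. The `scrape` coroutine is a hypothetical placeholder, and `asyncio.sleep` stands in for the network wait a real scraper would do.

```python
import asyncio


async def scrape(name, delay):
    # Hypothetical scraper coroutine; asyncio.sleep stands in for network I/O.
    await asyncio.sleep(delay)
    return f"{name} done"


async def main():
    # gather() runs the coroutines concurrently on one event loop, in one thread,
    # and returns results in the order the coroutines were passed in.
    return await asyncio.gather(
        scrape("search", 0.02),
        scrape("user", 0.01),
    )


results = asyncio.run(main())
```

While one coroutine waits on I/O, the event loop switches to another, so no extra threads are needed.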
httpx
Oh yeah, so at some point it works even if it failed at first.
@Hesko123 The response hook system of httpx is designed to call a function just before the request is sent or just after you receive a response.
Thanks, boss! I saw that you came from jvc too ;)
@Hesko123 It's a small world ^^
Tbh I have been pretty scared since Elon Musk acquired Twitter. You told me that this case is occasional, but why is it happening? Do we have a workaround for this issue? It seems to be on Twitter's side, and we can't do anything about that.
Btw do you have a discord ?
It seems almost random (just worked 1/5 times for me). I'm wondering if it has something to do with the user-agent, because I noticed that it's set randomly.
EDIT: maybe not, I just tried setting the user-agent to one of the ones that worked, and seems to fail repeatedly anyway
It's quite weird, but when I copy the request made to the adaptive.json API as curl and remove the cookies, I get the authentication error about 1 time in 3.
But if I put the cookies back with only the "guest_id" cookie I have the impression that I no longer have the authentication problem...
The guest_id cookie is set when you do a request to the frontend endpoint of advanced search API like : https://twitter.com/search?q=ukraine&src=typed_query&f=live
I see that snscrape calls Twitter through the API endpoint "https://api.twitter.com/2/search/adaptive.json", so is it going to be affected by Twitter's new policy with its very small free rate limit? Is there a plan to fix this problem, like supporting scraping Twitter via the web page (e.g. https://twitter.com/search?q=from%3Aelonmusk&src=typed_query&f=live)?
@dengkefeng #695
@AntoinePaix Negative, I'm also seeing failures with the guest_id cookie set.
Got it, thanks @JustAnotherArchivist very much. So how do we resolve the main issue in this thread? Just wait for twitter to come back? Thanks!
@JustAnotherArchivist Ah darn. In the end it is rather good news: it means that it is not necessarily a problem related to a new method of authentication.
The search function is no longer accessible if you are not logged in, so I suppose it is the end for snscrape.
The error is now blocked (403), and all requests seem to be affected. No indication of what's happening on the web interface though, just 'Please refresh the page and try again', so it may well be unintentional.
Maybe earlier today we were witnessing a canary deploy that directed x% of the traffic to this new version where the search feature and the guest token are no longer available :disappointed:
@JustAnotherArchivist, can you scrape tweets from a profile page? (by a direct link to this profile)
https://developer.twitter.com/en/products/twitter-api
100$/month for 10k tweet read limit cap and they limit almost all the guest searches by now ... LOL
Describe the bug
Twitter search stopped working
How to reproduce
twitter search scraper
Expected behaviour
retrieve twitter post
Screenshots and recordings
Operating system
windows 10
Python version: output of python3 --version
3.7.5
snscrape version: output of snscrape --version
0.6.1.20230315.dev2+gedac5f3
Scraper
twitter-search
How are you using snscrape?
CLI (snscrape ... as a command, e.g. in a terminal)
Backtrace
No response
Log output
Dump of locals
Additional context
none