Open etemiz opened 4 years ago
Have also been facing this issue. Queries that were returning tweets yesterday are not returning tweets today.
I'm also facing the same issue! Yesterday it was parsing well, but today it returns 0 tweets.
same here 0 tweets
same here 0 tweets
Seems Twitter has restricted the connection so that all requests return a page with "We've detected that JavaScript is disabled in your browser. Would you like to proceed to legacy Twitter?"
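One quick way to confirm this diagnosis is to look for that notice in the HTML that comes back. The following is only a minimal sketch: `is_js_disabled_page` is a hypothetical helper (not part of twitterscraper), and it assumes the notice text quoted above appears in the response body, possibly with an HTML-escaped apostrophe.

```python
import html

# Notice text as quoted in this thread; assumed to appear in the response body.
JS_DISABLED_NOTICE = "We've detected that JavaScript is disabled in your browser"


def is_js_disabled_page(body: str) -> bool:
    """Return True if the response body contains the 'JavaScript is disabled' notice.

    Hypothetical helper for illustration; not part of twitterscraper.
    """
    # Unescape entities such as &#39; so an escaped apostrophe still matches.
    return JS_DISABLED_NOTICE in html.unescape(body)
```

If this returns True for the page your scraper receives, the 0-tweet result is coming from the blocked page rather than from an empty search.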
+1. That's bad.
INFO: Retrying... (Attempts left: 1)
INFO: Scraping tweets from https://twitter.com/search?f=tweets&vertical=default&q=bitcoin&l=
INFO: Using proxy 181.211.38.62:47911
INFO: Got 0 tweets for bitcoin.
Parsing may be an issue. Both twitterscraper (0.9.3) and (1.4.0) are failing.
I had also developed a project, and its main part depends on this. How can we fix this problem?
I need help
Same here... does anyone have a clue what's going on?
Not yet. I used it for a university project. What will I do during the presentation?
Seems Twitter has restricted the connection so that all requests return a page with "We've detected that JavaScript is disabled in your browser. Would you like to proceed to legacy Twitter?"
Indeed, this can be fixed by modifying the header dictionary in query.py from
HEADER = {'User-Agent': random.choice(HEADERS_LIST)}
to
HEADER = {'User-Agent': random.choice(HEADERS_LIST), 'X-Requested-With': 'XMLHttpRequest'}
that should fix the issue.
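To make the change concrete, here is a minimal sketch of the patched header dictionary attached to a request object. It is an illustration only: it uses `urllib.request` from the standard library so it has no dependencies (not necessarily the HTTP client twitterscraper itself uses), `build_header` and `build_request` are hypothetical names, and the two user-agent strings are placeholders standing in for the real `HEADERS_LIST` in query.py. The request is built but never sent.

```python
import random
import urllib.request

# Placeholder user-agent strings; query.py defines its own HEADERS_LIST.
HEADERS_LIST = [
    "Mozilla/5.0 (Windows; U; Windows NT 6.1; rv:2.2) Gecko/20110201",
    "Opera/9.80 (X11; Linux i686; Ubuntu/14.10) Presto/2.12.388 Version/12.16",
]


def build_header() -> dict:
    """Return request headers including the extra X-Requested-With field."""
    return {
        "User-Agent": random.choice(HEADERS_LIST),
        "X-Requested-With": "XMLHttpRequest",
    }


def build_request(url: str) -> urllib.request.Request:
    """Build (but do not send) a request carrying the patched headers."""
    return urllib.request.Request(url, headers=build_header())


req = build_request("https://twitter.com/search?f=tweets&vertical=default&q=bitcoin&l=")
# urllib stores header names capitalized, e.g. 'X-requested-with'.
print(req.header_items())
```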
It works for me! Thanks @rubengoeminne, genius!
Thanks
Thank you very much. It works.
It works, thanks.
Hi guys, I'm kind of a noob and do not have a HEADER in my code... can someone tell me how I can implement it?
Thanks a lot, my friend! This worked for me! You are a genius! Let me buy you a beer, @rubengoeminne! Paulaner German beer? Or Negra Modelo Mexican beer?
@toscanopedro The header dictionary HEADER = {'User-Agent': random.choice(HEADERS_LIST)} is not in your own code; it is a line inside the file query.py
Just open the file in a text editor and change the line as @rubengoeminne said. You can search for the file on your PC; it may be found at the path: C:\ProgramData\Anaconda3\Lib\site-packages\twitterscraper
THX MAN!!!!
The modification no longer works for query_user_info. I changed the header dictionary in query.py and still got no information on my list of users.
I faced the same issue. It seems to work now for retrieving tweets. However, I get this error when I request user info with query_user_info: local variable 'user_info' referenced before assignment
Yeah, it is not working for me. I changed that line in query.py and the same issue occurs.
Hi. I implemented the modification suggested by pumpkinw and the algorithm made progress. It was not scraping anything before the modification. After the modification it is scraping, but not everything: it seems to scrape only the last few hours. For example, when I issued:
twitterscraper fascismo --lang pt -p 1 -bd 2020-05-31 -ed 2020-06-01 -o file_name.json
I received only tweets from hours 20 through 23 of 2020-05-31:
In [12]: df.groupby(df['timestamp'].dt.hour).count()
Out[12]:
           has_media  hashtags  img_urls  is_replied  ...  tweet_url  user_id  username  video_url
timestamp                                             ...
20               956       956       956         956  ...        956      956       956        956
21              2384      2384      2384        2384  ...       2384     2384      2384       2384
22              2100      2100      2100        2100  ...       2100     2100      2100       2100
23              2147      2147      2147        2147  ...       2147     2147      2147       2147
[4 rows x 21 columns]
Does anybody know what is going on?
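For anyone who wants to quantify the gap without pandas, the check can be sketched with the standard library. This does not explain why the hours are missing; it only lists which hours of the day have no tweets. `missing_hours` is a hypothetical helper and the timestamps below are made up for illustration.

```python
from collections import Counter
from datetime import datetime


def missing_hours(timestamps):
    """Return (tweet counts per hour, sorted list of hours 0-23 with no tweets)."""
    counts = Counter(ts.hour for ts in timestamps)
    # Counter returns 0 for hours that never occurred.
    missing = [hour for hour in range(24) if counts[hour] == 0]
    return counts, missing


# Made-up timestamps, for illustration only.
sample = [
    datetime(2020, 5, 31, 20, 15),
    datetime(2020, 5, 31, 21, 5),
    datetime(2020, 5, 31, 23, 59),
]
counts, missing = missing_hours(sample)
print(missing)
```

With real data, the timestamps would come from the `timestamp` field of the scraped tweets.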
I already changed the header from HEADER = {'User-Agent': random.choice(HEADERS_LIST)} to HEADER = {'User-Agent': random.choice(HEADERS_LIST), 'X-Requested-With': 'XMLHttpRequest'} but still get the same error: 'NoneType' object has no attribute 'user'.
I don't like modifying a module's files directly, so instead, based on @rubengoeminne's great answer, you can fix this issue by adding these lines of code to the top of your Python script:
import twitterscraper
import random
HEADERS_LIST = [
    'Mozilla/5.0 (Windows; U; Windows NT 6.1; x64; fr; rv:1.9.2.13) Gecko/20101203 Firebird/3.6.13',
    'Mozilla/5.0 (compatible, MSIE 11, Windows NT 6.3; Trident/7.0; rv:11.0) like Gecko',
    'Mozilla/5.0 (Windows; U; Windows NT 6.1; rv:2.2) Gecko/20110201',
    'Opera/9.80 (X11; Linux i686; Ubuntu/14.10) Presto/2.12.388 Version/12.16',
    'Mozilla/5.0 (Windows NT 5.2; RW; rv:7.0a1) Gecko/20091211 SeaMonkey/9.23a1pre'
]
twitterscraper.query.HEADER = {'User-Agent': random.choice(HEADERS_LIST), 'X-Requested-With': 'XMLHttpRequest'}
And do your stuff normally:
from twitterscraper import query_tweets
query_tweets("github", 100)
This solution seems not to work for me now.
Yeah, unfortunately they closed it down.
Guys, are you sure that you replaced the correct file? This is still working for me.
@toscanopedro where did you make the replacement, please? I replaced it in the query.py file, and it's not working... Thanks!
First, run pip show twitterscraper to find the location of the twitterscraper directory. Mine was in: "c:\users\pedro\appdata\local\programs\python\python38-32\lib\site-packages". There is a folder called twitterscraper containing the query.py file, and you just have to change it. The path may differ depending on which IDE you are using, but it is always inside a "lib\site-packages" directory.
@toscanopedro I am working on GCP. I changed the file manually as shown in the picture. Is that sufficient?
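The same lookup can be done from Python itself, which avoids guessing which interpreter `pip` belongs to. A minimal sketch, assuming the package is importable by the interpreter you run it with; `find_package_file` is a hypothetical helper name:

```python
import importlib.util
from pathlib import Path
from typing import Optional


def find_package_file(package: str, filename: str) -> Optional[Path]:
    """Return the path of `filename` inside an installed package, or None if not found."""
    try:
        spec = importlib.util.find_spec(package)
    except (ImportError, ValueError):
        return None
    if spec is None or not spec.submodule_search_locations:
        # Package is not installed for this interpreter, or is not a package.
        return None
    for location in spec.submodule_search_locations:
        candidate = Path(location) / filename
        if candidate.is_file():
            return candidate
    return None


# Prints the full path to query.py, or None if twitterscraper is not installed here.
print(find_package_file("twitterscraper", "query.py"))
```

If this prints a path different from the file you edited, you changed a copy that your script does not import.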
Thanks
@toscanopedro it doesn't work on my end, unfortunately. I would imagine that you're making your requests to some server that wasn't updated yet, maybe? I'll play with VPNs and check
Yes... this is what happened... The problem is back here... my code does not work anymore... it's very sad.
It doesn't work for me anymore. How to fix?
What will we do? the whole project depends on it...
I don't know... my project depends on it too.
Hi everyone. This was working earlier, but it stopped working today. I think Twitter is also following this thread :P
Help me please - like everyone above my project also depends on it!
It seems that Twitter has had enough! The company is shutting down the legacy theme version of its original site on the 1st of June 2020. Twitter has issued a warning to all users who have been using user-agent switching hacks and unsupported browsers to enable the legacy theme. Since this package is based on the legacy theme and user-agent, I am not sure whether a one-line solution exists.
Yes, it's true... shit... I think this will take months.
I was using this library because my Twitter api application was rejected. Now it's too late for everything. I hope it gets better soon.
I have no idea how useful this is, but I know that Get Old Tweets 3 is still largely working, as of June 4th at 5:18 p.m. PST. It does not have a way to grab video or images (which is why I am interested in twitterscraper). Hopefully this provides use for someone either if they don't need images and videos, or if someone can backwards engineer a solution (I am trying to figure it out, but my chops are not to that level yet).
Is there any way GetOldTweet can retrieve the total number of retweeted posts from a specific user? @sagefuentes
Please try this and let me know how it works for you https://github.com/taspinar/twitterscraper/pull/302
It was not working yesterday even after I changed the HEADER in query.py, but today all of a sudden it's working :)
Yes. very interesting :)
Same here, guys. It's working again... let's see tomorrow.
It works now.
I changed the header in query.py but it raises the error "AttributeError: 'NoneType' object has no attribute 'user'". Can anyone help, please?
from twitterscraper.query import query_user_info
import pandas as pd
from multiprocessing import Pool
import time
from IPython.display import display

# Module-level list that collects one dict per user
twitter_user_info = []


def get_user_info(twitter_user):
    """
    An example of using the query_user_info method
    :param twitter_user: the twitter user to capture user data
    :return: twitter_user_data: returns a dictionary of twitter user data
    """
    user_info = query_user_info(user=twitter_user)
    twitter_user_data = {}
    twitter_user_data["user"] = user_info.user
    twitter_user_data["fullname"] = user_info.full_name
    twitter_user_data["location"] = user_info.location
    twitter_user_data["blog"] = user_info.blog
    twitter_user_data["date_joined"] = user_info.date_joined
    twitter_user_data["id"] = user_info.id
    twitter_user_data["num_tweets"] = user_info.tweets
    twitter_user_data["following"] = user_info.following
    twitter_user_data["followers"] = user_info.followers
    twitter_user_data["likes"] = user_info.likes
    twitter_user_data["lists"] = user_info.lists
    return twitter_user_data


def main():
    start = time.time()
    users = ['Carlos_F_Enguix', 'mmtung', 'dremio', 'MongoDB', 'JenWike', 'timberners_lee', 'ataspinar2', 'realDonaldTrump', 'BarackObama', 'elonmusk', 'BillGates', 'BillClinton', 'katyperry', 'KimKardashian']
    pool = Pool(8)
    # Fetch user info in parallel and collect the results
    for user in pool.map(get_user_info, users):
        twitter_user_info.append(user)
    cols = ['id', 'fullname', 'date_joined', 'location', 'blog', 'num_tweets', 'following', 'followers', 'likes', 'lists']
    data_frame = pd.DataFrame(twitter_user_info, index=users, columns=cols)
    data_frame.index.name = "Users"
    data_frame.sort_values(by="followers", ascending=False, inplace=True, kind='quicksort', na_position='last')
    elapsed = time.time() - start
    print(f"Elapsed time: {elapsed}")
    display(data_frame)


if __name__ == '__main__':
    main()
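The error message suggests that the lookup returned `None` for at least one user, so reading `.user` on the result fails. This does not fix the scraping itself, but a guard along the following lines would at least keep the script from crashing. It is a sketch only: `safe_user_data` and `fake_fetch` are hypothetical names, and `fake_fetch` stands in for `query_user_info` so the example runs offline.

```python
from types import SimpleNamespace
from typing import Callable, Optional


def safe_user_data(fetch: Callable[[str], Optional[object]], username: str) -> Optional[dict]:
    """Call `fetch` and return a dict of fields, or None when no profile came back."""
    user_info = fetch(username)
    if user_info is None:
        # Without this check, the next lines raise
        # AttributeError: 'NoneType' object has no attribute 'user'
        return None
    return {
        "user": user_info.user,
        "fullname": user_info.full_name,
        "followers": user_info.followers,
    }


def fake_fetch(username):
    """Stand-in for query_user_info, used only so the sketch runs offline."""
    if username == "missing":
        return None
    return SimpleNamespace(user=username, full_name="Example", followers=1)


results = [safe_user_data(fake_fetch, name) for name in ["someone", "missing"]]
found = [row for row in results if row is not None]
print(found)
```

In the script above, the same check would go right after the `query_user_info` call, with the `None` rows filtered out before building the DataFrame.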
Great, it's working again. But don't hold your breath on this. Find another alternative before it's too late.
Leaving this open because it appears to be on-and-off working. I'll update #302 so js is optional because legacy appears to work sometimes
Merged in https://github.com/taspinar/twitterscraper/pull/304 origin/master
should work now. Please create a new thread if this issue comes up again.