sagefuentes opened this issue 4 years ago
Any alternative solution for it? My master's thesis is on hold because of it. I tried snscrape as mentioned in the comment above, but it does not return results based on a search query string.
I used the search query below and it returns the links of the tweets.
snscrape twitter-search "#XRP since:2019-12-31 until:2020-09-25" > XRP_Sept_tweets.txt
I obtained the tweet IDs and then used tweepy to extract the tweets, as I needed more attributes (this may not be the best way to do it):
def get_tweets(tweet_ids, currency):
    # global api
    statuses = api.statuses_lookup(tweet_ids, tweet_mode="extended")
    data = get_df()  # define your own dataframe
    # printing the statuses
    for status in statuses:
        # print(status.lang)
        if status.lang == "en":
            mined = {
                "tweet_id": status.id,
                "name": status.user.name,
                "screen_name": status.user.screen_name,
                "retweet_count": status.retweet_count,
                "text": status.full_text,
                "mined_at": datetime.datetime.now(),
                "created_at": status.created_at,
                "favourite_count": status.favorite_count,
                "hashtags": status.entities["hashtags"],
                "status_count": status.user.statuses_count,
                "followers_count": status.user.followers_count,
                "location": status.place,
                "source_device": status.source,
                "coin_symbol": currency,
            }
            last_tweet_id = status.id
            data = data.append(mined, ignore_index=True)
    print(currency, "outputing to tweets", len(data))
    data.to_csv(
        "Extracted_TWEETS.csv",
        mode="a",
        header=not os.path.exists("Extracted_TWEETS.csv"),
        index=False,
    )
    print("..... going to sleep 20s")
    time.sleep(20)
Note that tweet_ids is a list of 100 tweet IDs.
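Since statuses_lookup accepts at most 100 IDs per call, the full ID list from snscrape has to be fed in batches. A minimal sketch (the helper name is my own, not from the code above):

```python
def chunks(ids, size=100):
    """Yield successive batches of at most `size` tweet IDs."""
    for i in range(0, len(ids), size):
        yield ids[i:i + size]
```

Each batch can then be passed to get_tweets() in turn.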
This really works. Many thanks.
Just keep in mind that using snscrape may return too many results, so it is better to limit the number of tweet IDs using --max-results.
snscrape twitter-search "#XRP since:2019-12-31 until:2020-09-25" > XRP_Sept_tweets.txt
Hello, I am facing issues with snscrape. I do not have a command-line environment and I am not able to run the program. Can you please explain step by step how to run it from a Jupyter notebook? Getting the tweet IDs is enough, because I have tweepy to extract the tweets from the tweet IDs.
I am also getting the error: module 'functools' has no attribute 'cached_property'
I have [miniconda](https://docs.conda.io/en/latest/miniconda.html) with Python 3.8. It doesn't seem to work on lower Python versions. Then just install snscrape as follows:
pip3 install snscrape
from the miniconda terminal, you should be able to use snscrape directly:
snscrape twitter-search "#XRP since:2019-12-31 until:2020-09-25" > XRP_Sept_tweets.txt
Thank you very much! It worked!! Thank you once again and I feel grateful for your help! :-)
Any alternative solution for it? My masters thesis is on hold because of it.
What an excellent opportunity to write a chapter about politics of APIs in the context of research! 😅 Your supervisor will have references for literature I am sure (and depending on your field), but you can look at publications from the Digital Methods Initiative at the University of Amsterdam, including people like Anne Helmond.
Hey @rsafa! Is it possible to get a large number of tweets, like 10,000 to 100,000? Is there a way to scrape large numbers?
Hello everyone, is it possible to use snscrape or some other way to get the tweets for a specified Twitter handle within a given date range?
I basically want a working alternative to the GetOldTweets3 command below:
GetOldTweets3 --username "barackobama" --since 2015-09-10 --until 2015-09-12
With snscrape, this works:
snscrape --jsonl twitter-search "from:barackobama since:2015-09-10 until:2015-09-12" > baracktweets.json
or
snscrape twitter-search "from:barackobama since:2015-09-10 until:2015-09-12" > baracktweets.txt
Explanation from the developer: twitter-user is actually just a wrapper around twitter-search using the search term from:username (plus code to extract user information from the profile page)
You can use it as twitter-search, twitter-user, instagram, etc. Check the snscrape docs.
On Tue, 29 Sep 2020, Paul R. Pival wrote:
With snscrape, this should work, but appears to be timing out for me - your mileage may vary... I may be too impatient :-)
snscrape --jsonl twitter-user "barackobama since:2015-09-10 until:2015-09-12" > baracktweets.json
or
snscrape twitter-user "barackobama since:2015-09-10 until:2015-09-12" > baracktweets.txt
You can find more information here: https://github.com/JustAnotherArchivist/snscrape
Hi @ppival @shelu16, thanks for the snscrape reference. I tried it and the twitter-search module works, but it only gives me a list of tweet URLs, e.g.: https://twitter.com/irwanOyong/status/1309516653386842113
I tried --jsonl and --with-entity but they failed. Any insight on how to get the item (tweet) details?
Well I continue to have spotty success with snscrape, but I can confirm the following query worked:
snscrape --jsonl twitter-search 'musim-musim since:2020-01-01 until:2020-07-01' > musim-musum.json
That will output json for each tweet such as:
{"url": "https://twitter.com/bibIichor/status/1278110922947493888", "date": "2020-06-30T23:39:24+00:00", "content": "@atermoiends menyebut musim-musim begitu selama nungguin seseorang. \ud83d\ude14\ud83d\udc95", "id": 1278110922947493888, "username": "bibIichor", "outlinks": [], "outlinksss": "", "tcooutlinks": [], "tcooutlinksss": "", "retweetedTweet": null}
As noted in the snscrape installation note, you will require python 3.8 and the development version for --jsonl to work...
@irwanOyong I was having the same issue, the reason is I wasn't using the development version of snscrape. Be sure to install it with pip3 install git+https://github.com/JustAnotherArchivist/snscrape.git
Once I did that it worked like @ppival said it should.
How can I use GetOldTweets3 again?
Kind of a weird workaround for Tweepy... but I used snscrape to start obtaining tweets from 'big_ben_clock', which is a bot that tweets every hour (and is relatively consistent). I used the bot's tweets to be able to obtain tweet ids that correspond to specific dates/times. Then I used those tweet/time ids to be able to collect tweets from other users at specific times. I outlined the process I used in a Jupyter Notebook: https://github.com/mwaters166/Twitter_OM_Insight_Project/blob/master/1_Scrape_Tweets_Tweepy_Time_Ids.ipynb, and the time ids for 2020 can be found here (although most of January is missing): https://github.com/mwaters166/Twitter_OM_Insight_Project/blob/master/time_ids.csv. I also tried to automate the process, and there's a run.sh file in the main directory. Let me know if anyone finds a better solution (there's gotta be a better way lol)!
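As an aside, a time-to-ID mapping can also be computed directly, since tweet IDs are Twitter "snowflake" IDs that embed a millisecond timestamp. A sketch using Twitter's publicly documented snowflake epoch offset (this is a general property of tweet IDs, not part of the notebook above):

```python
from datetime import datetime, timezone

TWITTER_EPOCH_MS = 1288834974657  # snowflake epoch (2010-11-04), in milliseconds

def snowflake_to_datetime(tweet_id):
    """Recover the UTC creation time embedded in a tweet ID."""
    ms = (tweet_id >> 22) + TWITTER_EPOCH_MS
    return datetime.fromtimestamp(ms / 1000, tz=timezone.utc)

def datetime_to_snowflake(dt):
    """Smallest tweet ID that could have been created at time dt."""
    return (int(dt.timestamp() * 1000) - TWITTER_EPOCH_MS) << 22
```

datetime_to_snowflake() gives a lower-bound ID for a given instant, which can replace the big_ben_clock lookup table when only the timestamp matters.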
First of all, thank you very much for your help. I would like to know if it is possible to extract only part of the --jsonl output, such as "content", and maybe also the author of the post.
Hi, is it possible to use GetOldTweets3 again?
snscrape twitter-search "from:barackobama since:2015-09-10 until:2015-09-12" > baracktweets.txt
Can you please tell me how to get tweets with multiple keywords in search query like "Jobs AND (unemployment OR government)" @ppival
The same is happening to me. Did someone find a solution?
People here that have been using snscrape, can you post any code examples just doing a simple query search in script and not console? The lack of documentation is making this more trial and error as I learn the modules.
import snscrape.modules.twitter as sntwitter

keyword = 'example'  # your search term
maxTweets = 100

for i, tweet in enumerate(sntwitter.TwitterSearchScraper(keyword + ' since:2015-12-17 until:2020-09-25').get_items()):
    if i > maxTweets:
        break
    print(tweet.username)
    print(tweet.renderedContent)
Honestly, running this code in miniconda does not work for me (advice on other software is welcome). Sorry, I've only been using Python for a short time. That said, my problem is that when I want to download tweets over a long time period I hit the maximum number that can be downloaded with snscrape. I would like to get around this by adding, for example, a pause after a certain number of tweets, or something similar.
Yes, it is possible. The .json file is a JSON Lines file, and you can read each line with json.loads() from the json package. Sample code can be found here: https://github.com/JustAnotherArchivist/snscrape/issues/82#issue-708558238
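For example, a minimal sketch of pulling just "content" and the author out of one --jsonl line (field names taken from the sample output quoted earlier in the thread):

```python
import json

def extract_fields(jsonl_line):
    """Return (username, content) from one line of snscrape --jsonl output."""
    tweet = json.loads(jsonl_line)
    return tweet["username"], tweet["content"]
```

Applied line by line over the output file, this yields only the fields you care about.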
snscrape twitter-search "#XRP since:2019-12-31 until:2020-09-25" > XRP_Sept_tweets.txt
This doesn't do anything; I'm in a clean env with python=3.8.5.
I have to rework the following lines of code to get username-specific, time-bound tweets. Can someone help? As of now I am getting HTTP Error 404: Not Found.
USERNAME = "narendramodi"
START_DATE = "2019-11-09"
END_DATE = "2019-11-14"
tweetCriteria = GetOldTweets3.manager.TweetCriteria().setUsername(USERNAME).setSince(START_DATE).setUntil(END_DATE).setMaxTweets(100)
Yes, facing the same issue. Even updated the library. Still not working!
Me too
Same issue here.
It doesn't work for me either.
For those who are still struggling to download tweets as CSV from snscrape, this works absolutely fine for me. Configuration: Windows 7 SP1 (64-bit), Python 3.8.6.
pip3.8 install git+https://github.com/JustAnotherArchivist/snscrape.git
Write this code in a new Jupyter Notebook and make sure it is using the Python 3.8.6 kernel. Using code from the above comments.
import snscrape.modules.twitter as sntwitter
import csv
keyword = 'Covid'
maxTweets = 30000
#Open/create a file to append data to
csvFile = open('result.csv', 'a', newline='', encoding='utf8')
#Use csv writer
csvWriter = csv.writer(csvFile)
csvWriter.writerow(['id','date','tweet'])
for i,tweet in enumerate(sntwitter.TwitterSearchScraper(keyword + 'since:2020-06-01 until:2020-06-30 -filter:links -filter:replies').get_items()) :
if i > maxTweets :
break
csvWriter.writerow([tweet.id, tweet.date, tweet.renderedContent])
csvFile.close()
this is when you are trying to filter by providing two dates, but how do you get all tweets? just by removing the filter criteria?
Yes, you can add or remove filters as per your need.
May I ask: what if I want to filter the language of the tweets (e.g. only tweets in English)? How can I add a filter for that?
Add lang:en (without quotes) inside the query string.
Example:
for i, tweet in enumerate(sntwitter.TwitterSearchScraper(keyword + ' lang:en').get_items()):
Hello, it seems the time is not being read in the query (only the date is). I tried this earlier today with different time intervals within the same day, and it returns 0 results. Any idea how to solve it?
for i,tweet in enumerate(sntwitter.TwitterSearchScraper("lebanon since:2020-01-01 00:00:00 until:2020-01-01 06:00:00").get_items())
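If since:/until: really are date-granular, one workaround is to request the whole day and drop tweets outside the window client-side. A sketch, assuming tweet.date is the timezone-aware datetime that snscrape returns:

```python
from datetime import datetime, timezone

start = datetime(2020, 1, 1, 0, 0, tzinfo=timezone.utc)
end = datetime(2020, 1, 1, 6, 0, tzinfo=timezone.utc)

def in_window(tweet_date, start=start, end=end):
    # Keep only tweets whose timestamp falls inside the six-hour window.
    return start <= tweet_date < end
```

Inside the scraping loop, skip any tweet for which in_window(tweet.date) is False.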
For those using snscrape, please see this issue about installing from pip. To use the --jsonl flag you must do:
pip3 install --upgrade git+https://github.com/JustAnotherArchivist/snscrape@master
Ref: https://github.com/JustAnotherArchivist/snscrape/issues/77
I've tried to run this code with Python 3.8.6 on Windows 10 and it didn't give me any results. There are no errors, but I end up with an empty CSV (only the headers). Is there something I might be missing?
Not sure why, but I had the same problem. I replaced tweet.renderedContent with tweet.content and it works!
Unfortunately that wasn't my case, but I found the problem: it was the date filter. I got all the results by removing it, but now I can't filter a specific time, which is bad.
Edit: I forgot to mention this. Sometimes the application gives me a 400: Bad Request; I run it again and it produces the HTML as I said before.
This flakiness seems to be related to the random choice of user agent in TweetManager.py, where user_agent = random.choice(TweetManager.user_agents ...). I believe a loop scanning the user-agent list with exception handling would solve this problem.
@TamiresMonteiroCD @WelXingz @ahsanspark @Atoxal @SophieChowZZY
I think I solved the problem. I made a few changes to the lines. I collect tweets using a word and location filter. I'm using Python 3.8.6 on Windows 10 and it works fine right now.
import snscrape.modules.twitter as sntwitter
import csv
maxTweets = 3000
#keyword = 'deprem'
#place = '5e02a0f0d91c76d2' #This geo_place string corresponds to İstanbul, Turkey on twitter.
#keyword = 'covid'
#place = '01fbe706f872cb32' #This geo_place string corresponds to Washington DC on twitter.
#Open/create a file to append data to
csvFile = open('place_result.csv', 'a', newline='', encoding='utf8')
#Use csv writer
csvWriter = csv.writer(csvFile)
csvWriter.writerow(['id','date','tweet',])
for i,tweet in enumerate(sntwitter.TwitterSearchScraper('deprem + place:5e02a0f0d91c76d2 + since:2020-10-31 until:2020-11-03 -filter:links -filter:replies').get_items()):
if i > maxTweets :
break
csvWriter.writerow([tweet.id, tweet.date, tweet.content])
csvFile.close()
I'm having the exact same problem. When I remove the date filter it works, but when I have it (exactly how it is in the quoted code), I get no results. Anyone else having this issue or know how to solve it? @burakoglakci it's not clear to me how the changes you made in the code would solve this problem.
Edit: I think I figured it out. There was simply a small error in the quoted code: you have to put a space before the 'since'.
yeah, it should be keyword + ' since:2020-06-01 until:2020-06-30 -filter:links -filter:replies'
really simple, nice catch! :D
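A small habit that avoids this class of bug entirely is building the query with an f-string rather than concatenation, so the separating space is visible at a glance (a sketch using the same query as above):

```python
keyword = "Covid"
# The space after {keyword} is explicit, unlike with `keyword + 'since:...'`.
query = f"{keyword} since:2020-06-01 until:2020-06-30 -filter:links -filter:replies"
```

The resulting string can be passed straight to sntwitter.TwitterSearchScraper(query).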
@bensilver95 @Niehaus
Absolutely, our queries are working. The code I added in the previous post was not displayed correctly. If you want to add a location filter to your query:
keyword = 'covid'
for i, tweet in enumerate(sntwitter.TwitterSearchScraper(keyword + ' place:095534ad3107e0e6 + since:2020-10-20 until:2020-11-04 -filter:links -filter:replies').get_items()):
You can run this query; with it, you can collect tweets about covid shared from the state of Kentucky. Querying over shorter date ranges, as with GOT, can yield better results, because on queries that match too many tweets Twitter can stop responding.
@burakoglakci Can you please help me with the query to get the tweets of a specific user?
@Niehaus A query like this should work:
import snscrape.modules.twitter as sntwitter
import csv

maxTweets = 3000

csvFile = open('place_result.csv', 'a', newline='', encoding='utf8')
csvWriter = csv.writer(csvFile)
csvWriter.writerow(['id','date','tweet'])

for i, tweet in enumerate(sntwitter.TwitterSearchScraper('from:@burakoglakci + since:2015-12-02 until:2020-11-05 -filter:links -filter:replies').get_items()):
    if i > maxTweets:
        break
    csvWriter.writerow([tweet.id, tweet.date, tweet.content])
csvFile.close()
Thanks all for the useful comments and help in solving the scraping issue. Has anyone tried scraping the replies to tweets? I'd really appreciate your help.
Hi guys! I'm totally lost: how can I use snscrape to extract tweets from a user within a specific time span? I'm a beginner with Python and I have to do this for my thesis. I've been trying to extract this data for three weeks without success; I tried tweepy and then GetOldTweets3, and I've just discovered this new Twitter API limit... Can somebody help me please?
@sbif
Use this query with snscrape:
import snscrape.modules.twitter as sntwitter
import csv

maxTweets = 3000

csvFile = open('place_result.csv', 'a', newline='', encoding='utf8')
csvWriter = csv.writer(csvFile)
csvWriter.writerow(['id','date','tweet'])

for i, tweet in enumerate(sntwitter.TwitterSearchScraper('from:@BillGates + since:2015-12-02 until:2020-11-05 -filter:links -filter:replies').get_items()):
    if i > maxTweets:
        break
    csvWriter.writerow([tweet.id, tweet.date, tweet.content])
csvFile.close()
Hello! I am using the last snscrape query, but it is not working for me. I am querying @joebiden from 2020-01-01 and I am getting a weird output with just one tweet. I am a Mac user, if that matters. I really do not know what is going on. I literally copy-paste the code and change the handle, but it does not work. Any hints? Thank you so much!
Hi, I had a script running over the past weeks and earlier today it stopped working. I keep receiving HTTP Error 404, but the link provided in the errors still brings me to a valid page. The code is below (all mentioned variables are established, and the error specifically happens in the Manager when I check via debugging):
tweetCriteria = got.manager.TweetCriteria().setQuerySearch(term)\
    .setMaxTweets(max_count)\
    .setSince(begin_timeframe)\
    .setUntil(end_timeframe)
scraped_tweets = got.manager.TweetManager.getTweets(tweetCriteria)
The error message is the standard 404 error, "An error occured during an HTTP request: HTTP Error 404: Not Found. Try to open in browser:", followed by the valid link.
As I have changed nothing about the folder, I am wondering whether something has happened with my configuration more than anything else, but I'd like to know if others are experiencing this too.