JustAnotherArchivist / snscrape

A social networking service scraper in Python
GNU General Public License v3.0

sntwitter.TwitterSearchScraper stopped working #690

Closed edoardofalchi closed 1 year ago

edoardofalchi commented 1 year ago

Describe the bug

The code below used to work, but at some point in 2023 (I'm not sure on which day) it suddenly stopped working. Its purpose is to retrieve tweets matching a specific query (similar to the full_archive endpoint of Tweepy, to make the comparison clearer). The error arises at line 13 when trying to call TwitterSearchScraper. Any help is highly appreciated! Thanks

How to reproduce

import pandas as pd 
import numpy as np 
import snscrape.modules.twitter as sntwitter 
import datetime 
from tqdm.notebook import tqdm_notebook 
import seaborn as sns 

q ='@Shell AND (#climatechange OR #environment OR #sustainability OR #nature OR #globalwarming OR #savetheplanet OR #climate OR #climatecrisis OR #ecofriendly OR #climateaction) until:2023-01-01 since:2022-01-10 exclude:retweets exclude:replies'

#Using TwitterSearchScraper to scrape data and append tweets to list 
if count == -1:
    for i,tweet in enumerate(tqdm_notebook(sntwitter.TwitterSearchScraper(q).get_items())):
        tweets_list1.append([tweet.date, tweet.id, tweet.content, tweet.user.username,tweet.lang,
                             tweet.hashtags,tweet.replyCount,tweet.retweetCount, 
                             tweet.likeCount,tweet.quoteCount,tweet.media])
else:
    with tqdm_notebook(total=count) as pbar:
        for i,tweet in enumerate(sntwitter.TwitterSearchScraper(q).get_items()): #declare a username
            if i>=count: #number of tweets you want to scrape
                break
            tweets_list1.append([tweet.date, tweet.id, tweet.content, tweet.user.username,
                                 tweet.lang,tweet.hashtags,tweet.replyCount, 
                                 tweet.retweetCount,tweet.likeCount,tweet.quoteCount,tweet.media])
            pbar.update(1) 

# Creating a dataframe from the tweets list above  
tweets_df1 = pd.DataFrame(tweets_list1, columns=['DateTime', 'TweetId', 'Text', 'Username','Language',
                                                 'Hashtags','ReplyCount','RetweetCount','LikeCount',
                                                 'QuoteCount','Media']) 

Expected behavior

The scraper should retrieve the tweets matching the given query.

Screenshots and recordings

No response

OS / Distro

Windows 10

Output from snscrape --version

0.4.3.20220106

Scraper

sntwitter.TwitterSearchScraper

Backtrace

ScraperException: 4 requests to https://api.twitter.com/2/search/adaptive.json?include_profile_interstitial_type=1&include_blocking=1&include_blocked_by=1&include_followed_by=1&include_want_retweets=1&include_mute_edge=1&include_can_dm=1&include_can_media_tag=1&skip_status=1&cards_platform=Web-12&include_cards=1&include_ext_alt_text=true&include_quote_count=true&include_reply_count=1&tweet_mode=extended&include_entities=true&include_user_entities=true&include_ext_media_color=true&include_ext_med

Dump of locals

No response

How are you using snscrape?

CLI

Additional context

It seems to be a general issue with third-party apps. For reference, see this article: https://www.theverge.com/2023/1/13/23553161/third-party-twitter-clients-apps-outage-twitterific-tweetbot If so, can you confirm that this is also the cause here for snscrape?

JustAnotherArchivist commented 1 year ago

You're using an outdated version of snscrape. Please update, and it should all work fine again. (The issue is unrelated to what you linked.)
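For anyone who wants to confirm from within Python which version is installed, here is a minimal sketch using the standard library's importlib.metadata. The helper names installed_version and build_date are hypothetical, not part of snscrape, and the date parsing assumes that the last dot-separated field of the version string is a YYYYMMDD build date, as the value reported above (0.4.3.20220106) suggests.

```python
import re
from datetime import date
from importlib.metadata import PackageNotFoundError, version
from typing import Optional


def installed_version(package: str) -> Optional[str]:
    """Return the installed version string of `package`, or None if absent."""
    try:
        return version(package)
    except PackageNotFoundError:
        return None


def build_date(version_string: str) -> Optional[date]:
    """Extract a trailing YYYYMMDD field from a version string, if present.

    Assumption: the last dot-separated field is a build date, as in the
    version reported in this issue (0.4.3.20220106).
    """
    match = re.search(r"\.(\d{4})(\d{2})(\d{2})$", version_string)
    if match is None:
        return None
    year, month, day = (int(part) for part in match.groups())
    try:
        return date(year, month, day)
    except ValueError:
        # Eight digits that do not form a valid calendar date.
        return None


if __name__ == "__main__":
    found = installed_version("snscrape")
    if found is None:
        print("snscrape is not installed in this environment")
    else:
        print(f"snscrape {found}, build date: {build_date(found)}")
```

If the reported build date is well before the date on which the scraper stopped working, an outdated install is a plausible cause, and updating is the first thing to try.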