JustAnotherArchivist / snscrape

A social networking service scraper in Python
GNU General Public License v3.0
4.45k stars, 706 forks

ScraperException #765

Closed: PromkitaAsukoso closed this issue 1 year ago

PromkitaAsukoso commented 1 year ago

Describe the bug

Hi there! I used to run scripts in parallel to collect social media metrics, and now none of them work. It seems that Twitter is blocking API access again (as in January). I've tried upgrading and downgrading versions, but the error remains. I would just like to confirm that this error is caused by a Twitter block and not by a versioning or library issue. Thanks for reading; I hope to receive a quick response. Best wishes.

How to reproduce

import snscrape.modules.twitter as sntwitter

# Print the URLs of the first few matching tweets, then stop.
for i, tweet in enumerate(sntwitter.TwitterSearchScraper("#Arona since:'2023-02-01'").get_items()):
    if i > 10:
        break
    print(tweet.url)

Expected behaviour

ScraperException: 4 requests to https://api.twitter.com/2/search/adaptive.json?include_profile_interstitial_type=1&include_blocking=1&include_blocked_by=1&include_followed_by=1&include_want_retweets=1&include_mute_edge=1&include_can_dm=1&include_can_media_tag=1&include_ext_has_nft_avatar=1&include_ext_is_blue_verified=1&include_ext_verified_type=1&skip_status=1&cards_platform=Web-12&include_cards=1&include_ext_alt_text=true&include_ext_limited_action_results=false&include_quote_count=true&include_reply_count=1&tweet_mode=extended&include_ext_collab_control=true&include_ext_views=true&include_entities=true&include_user_entities=true&include_ext_media_color=true&include_ext_media_availability=true&include_ext_sensitive_media_warning=true&include_ext_trusted_friends_metadata=true&send_error_codes=true&simple_quoted_tweet=true&q=%23Arona+since%3A%272023-02-01%27&tweet_search_mode=live&count=20&query_source=spelling_expansion_revert_click&pc=1&spelling_corrections=1&include_ext_edit_control=true&ext=mediaStats%2ChighlightedLabel%2ChasNftAvatar%2CvoiceInfo%2Cenrichments%2CsuperFollowMetadata%2CunmentionInfo%2CeditControl%2Ccollab_control%2Cvibe failed, giving up.

Screenshots and recordings

No response

Operating system

Windows 11

Python version: output of python3 --version

python3

snscrape version: output of snscrape --version

0.6.1.20230314

Scraper

TwitterSearchScraper

Backtrace

No response

Dump of locals

No response

How are you using snscrape?

Module (import snscrape.modules.something in Python code)

Additional context

No response

nanimesh commented 1 year ago

I am facing a similar issue, except that I am getting a non-200 response (429).

JustAnotherArchivist commented 1 year ago

Please provide a complete log at the debug level using import logging; logging.basicConfig(level = logging.DEBUG).
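For anyone unsure how to capture such a log in a form that can be attached to the issue, here is a minimal sketch using only the standard logging module. The helper name enable_debug_log and the file name in the usage note are hypothetical, not part of snscrape. The configuration has to run before the scraper makes its requests, otherwise the INFO and DEBUG messages are not recorded.

```python
import logging


def enable_debug_log(path):
    """Send all log records at DEBUG level and above to the file at `path`."""
    # force=True (Python 3.8+) replaces handlers that were configured
    # earlier, so a previous basicConfig call cannot hide DEBUG output.
    logging.basicConfig(
        level=logging.DEBUG,
        filename=path,
        # Same layout as the default used by the logs in this thread.
        format="%(levelname)s:%(name)s:%(message)s",
        force=True,
    )
```

Calling enable_debug_log("snscrape-debug.log") at the top of a script, before any scraper code, should write every logger's output to that file.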

Write commented 1 year ago

Sorry for my previous unhelpful comment.

Debug log enabled:
DEBUG:__main__:Query is (from:afpfr) include:nativeretweets
ERROR:snscrape.base:Error retrieving https://api.twitter.com/2/search/adaptive.json?include_profile_interstitial_type=1&include_blocking=1&include_blocked_by=1&include_followed_by=1&include_want_retweets=1&include_mute_edge=1&include_can_dm=1&include_can_media_tag=1&include_ext_has_nft_avatar=1&include_ext_is_blue_verified=1&include_ext_verified_type=1&skip_status=1&cards_platform=Web-12&include_cards=1&include_ext_alt_text=true&include_ext_limited_action_results=false&include_quote_count=true&include_reply_count=1&tweet_mode=extended&include_ext_collab_control=true&include_ext_views=true&include_entities=true&include_user_entities=true&include_ext_media_color=true&include_ext_media_availability=true&include_ext_sensitive_media_warning=true&include_ext_trusted_friends_metadata=true&send_error_codes=true&simple_quoted_tweet=true&q=%28from%3Aafpfr%29+include%3Anativeretweets&tweet_search_mode=live&count=20&query_source=spelling_expansion_revert_click&pc=1&spelling_corrections=1&include_ext_edit_control=true&ext=mediaStats%2ChighlightedLabel%2ChasNftAvatar%2CvoiceInfo%2Cenrichments%2CsuperFollowMetadata%2CunmentionInfo%2CeditControl%2Ccollab_control%2Cvibe: non-200 status code (404)
CRITICAL:snscrape.base:4 requests to https://api.twitter.com/2/search/adaptive.json?include_profile_interstitial_type=1&include_blocking=1&include_blocked_by=1&include_followed_by=1&include_want_retweets=1&include_mute_edge=1&include_can_dm=1&include_can_media_tag=1&include_ext_has_nft_avatar=1&include_ext_is_blue_verified=1&include_ext_verified_type=1&skip_status=1&cards_platform=Web-12&include_cards=1&include_ext_alt_text=true&include_ext_limited_action_results=false&include_quote_count=true&include_reply_count=1&tweet_mode=extended&include_ext_collab_control=true&include_ext_views=true&include_entities=true&include_user_entities=true&include_ext_media_color=true&include_ext_media_availability=true&include_ext_sensitive_media_warning=true&include_ext_trusted_friends_metadata=true&send_error_codes=true&simple_quoted_tweet=true&q=%28from%3Aafpfr%29+include%3Anativeretweets&tweet_search_mode=live&count=20&query_source=spelling_expansion_revert_click&pc=1&spelling_corrections=1&include_ext_edit_control=true&ext=mediaStats%2ChighlightedLabel%2ChasNftAvatar%2CvoiceInfo%2Cenrichments%2CsuperFollowMetadata%2CunmentionInfo%2CeditControl%2Ccollab_control%2Cvibe failed, giving up.
CRITICAL:snscrape.base:Errors: non-200 status code (404), non-200 status code (404), non-200 status code (404), non-200 status code (404)
Traceback (most recent call last):
  File "/root/twitter/./tiny.py", line 969, in <module>
    fetchTweets(trackedAccounts)
  File "/root/twitter/./tiny.py", line 511, in fetchTweets
    for tweet in sntwitter.TwitterSearchScraper(query).get_items():
  File "/usr/local/lib/python3.9/dist-packages/snscrape/modules/twitter.py", line 1546, in get_items
    for obj in self._iter_api_data('https://api.twitter.com/2/search/adaptive.json', _TwitterAPIType.V2, params, paginationParams, cursor = self._cursor):
  File "/usr/local/lib/python3.9/dist-packages/snscrape/modules/twitter.py", line 759, in _iter_api_data
    obj = self._get_api_data(endpoint, apiType, reqParams)
  File "/usr/local/lib/python3.9/dist-packages/snscrape/modules/twitter.py", line 725, in _get_api_data
    r = self._get(endpoint, params = params, headers = self._apiHeaders, responseOkCallback = self._check_api_response)
  File "/usr/local/lib/python3.9/dist-packages/snscrape/base.py", line 225, in _get
    return self._request('GET', *args, **kwargs)
  File "/usr/local/lib/python3.9/dist-packages/snscrape/base.py", line 221, in _request
    raise ScraperException(msg)
snscrape.base.ScraperException: 4 requests to https://api.twitter.com/2/search/adaptive.json?include_profile_interstitial_type=1&include_blocking=1&include_blocked_by=1&include_followed_by=1&include_want_retweets=1&include_mute_edge=1&include_can_dm=1&include_can_media_tag=1&include_ext_has_nft_avatar=1&include_ext_is_blue_verified=1&include_ext_verified_type=1&skip_status=1&cards_platform=Web-12&include_cards=1&include_ext_alt_text=true&include_ext_limited_action_results=false&include_quote_count=true&include_reply_count=1&tweet_mode=extended&include_ext_collab_control=true&include_ext_views=true&include_entities=true&include_user_entities=true&include_ext_media_color=true&include_ext_media_availability=true&include_ext_sensitive_media_warning=true&include_ext_trusted_friends_metadata=true&send_error_codes=true&simple_quoted_tweet=true&q=%28from%3Aafpfr%29+include%3Anativeretweets&tweet_search_mode=live&count=20&query_source=spelling_expansion_revert_click&pc=1&spelling_corrections=1&include_ext_edit_control=true&ext=mediaStats%2ChighlightedLabel%2ChasNftAvatar%2CvoiceInfo%2Cenrichments%2CsuperFollowMetadata%2CunmentionInfo%2CeditControl%2Ccollab_control%2Cvibe failed, giving up.
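Because the request URLs make these messages hard to read, a small helper can pull out just the HTTP status codes. The sketch below is not part of snscrape; it assumes the "non-200 status code (NNN)" wording seen in the ERROR and CRITICAL lines in this thread, and the names STATUS_RE and status_codes are made up for illustration.

```python
import re
from collections import Counter

# Matches the "non-200 status code (NNN)" fragments that appear in the
# ERROR and CRITICAL lines of the logs in this thread.
STATUS_RE = re.compile(r"non-200 status code \((\d{3})\)")


def status_codes(log_text):
    """Return a Counter of the HTTP status codes mentioned in `log_text`."""
    return Counter(int(code) for code in STATUS_RE.findall(log_text))


if __name__ == "__main__":
    line = ("CRITICAL:snscrape.base:Errors: non-200 status code (404), "
            "non-200 status code (404), non-200 status code (404), "
            "non-200 status code (404)")
    print(status_codes(line))
```

This separates the two failure modes reported so far: 404 (Not Found) in this log versus 429 (Too Many Requests) in the earlier comment.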

JustAnotherArchivist commented 1 year ago

I'm not sure how you produced that log, but it doesn't contain a single info or debug message from snscrape.

Try this:

import logging
import snscrape.modules.twitter

logging.basicConfig(level = logging.DEBUG)
next(snscrape.modules.twitter.TwitterUserScraper('textfiles', retries = 0).get_items())

lucaslpass commented 1 year ago

This code gives me the following error:

lucsa@lucsa-HP-Laptop-14-dq2xxx:~/Documentos/Nao.py(data_analyst)/captura$  /usr/bin/env /bin/python3 /home/lucsa/.vscode/extensions/ms-python.python-2023.4.1/pythonFiles/lib/python/debugpy/adapter/../../debugpy/launcher 50945 -- /home/lucsa/Documentos/Nao.py\(data_analyst\)/captura/snscrape.modules.py 
INFO:snscrape.modules.twitter:Retrieving scroll page None
INFO:snscrape.modules.twitter:Retrieving guest token
INFO:snscrape.base:Retrieving https://twitter.com/textfiles
DEBUG:snscrape.base:... with headers: {'User-Agent': 'Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/79.0.3945.6391 Safari/537.34'}
DEBUG:snscrape.base:... with environmentSettings: {'proxies': OrderedDict(), 'stream': False, 'verify': True, 'cert': None}
DEBUG:urllib3.connectionpool:Starting new HTTPS connection (1): twitter.com:443
DEBUG:snscrape.base:Connected to: ('104.244.42.1', 443)
DEBUG:snscrape.base:Connection cipher: ('TLS_AES_256_GCM_SHA384', 'TLSv1.3', 256)
DEBUG:urllib3.connectionpool:https://twitter.com:443 "GET /textfiles HTTP/1.1" 200 None
INFO:snscrape.base:Retrieved https://twitter.com/textfiles: 200
DEBUG:snscrape.base:... with response headers: {'date': 'Tue, 14 Mar 2023 15:01:30 GMT', 'perf': '7626143928', 'expiry': 'Tue, 31 Mar 1981 05:00:00 GMT', 'pragma': 'no-cache', 'server': 'tsa_b', 'set-cookie': 'guest_id_ads=v1%3A167880609033204863; Max-Age=63072000; Expires=Thu, 13 Mar 2025 15:01:30 GMT; Path=/; Domain=.twitter.com; Secure; SameSite=None, guest_id_marketing=v1%3A167880609033204863; Max-Age=63072000; Expires=Thu, 13 Mar 2025 15:01:30 GMT; Path=/; Domain=.twitter.com; Secure; SameSite=None, personalization_id="v1_B/f0dzqCg2PUZZrPC7no8Q=="; Max-Age=63072000; Expires=Thu, 13 Mar 2025 15:01:30 GMT; Path=/; Domain=.twitter.com; Secure; SameSite=None, guest_id=v1%3A167880609033204863; Max-Age=63072000; Expires=Thu, 13 Mar 2025 15:01:30 GMT; Path=/; Domain=.twitter.com; Secure; SameSite=None, ct0=; Max-Age=-1678806089; Expires=Thu, 01 Jan 1970 00:00:01 GMT; Path=/; Domain=.twitter.com; Secure; SameSite=Lax', 'content-type': 'text/html; charset=utf-8', 'x-powered-by': 'Express', 'cache-control': 'no-cache, no-store, must-revalidate, pre-check=0, post-check=0', 'last-modified': 'Tue, 14 Mar 2023 15:01:30 GMT', 'x-frame-options': 'DENY', 'x-transaction-id': 'a15029b82c47a3df', 'x-xss-protection': '0', 'x-content-type-options': 'nosniff', 'content-security-policy': "connect-src 'self' blob: https://*.pscp.tv https://*.video.pscp.tv https://*.twimg.com https://api.twitter.com https://api-stream.twitter.com https://ads-api.twitter.com https://aa.twitter.com https://caps.twitter.com https://pay.twitter.com https://sentry.io https://ton.twitter.com https://twitter.com https://upload.twitter.com https://www.google-analytics.com https://accounts.google.com/gsi/status https://accounts.google.com/gsi/log https://app.link https://api2.branch.io https://bnc.lt wss://*.pscp.tv https://vmap.snappytv.com https://vmapstage.snappytv.com https://vmaprel.snappytv.com https://vmap.grabyo.com https://dhdsnappytv-vh.akamaihd.net https://pdhdsnappytv-vh.akamaihd.net 
https://mdhdsnappytv-vh.akamaihd.net https://mdhdsnappytv-vh.akamaihd.net https://mpdhdsnappytv-vh.akamaihd.net https://mmdhdsnappytv-vh.akamaihd.net https://mdhdsnappytv-vh.akamaihd.net https://mpdhdsnappytv-vh.akamaihd.net https://mmdhdsnappytv-vh.akamaihd.net https://dwo3ckksxlb0v.cloudfront.net https://media.riffsy.com https://*.giphy.com https://media.tenor.com https://c.tenor.com ; default-src 'self'; form-action 'self' https://twitter.com https://*.twitter.com; font-src 'self' https://*.twimg.com; frame-src 'self' https://twitter.com https://mobile.twitter.com https://pay.twitter.com https://cards-frame.twitter.com https://accounts.google.com/ https://client-api.arkoselabs.com/ https://iframe.arkoselabs.com/  https://recaptcha.net/recaptcha/ https://www.google.com/recaptcha/ https://www.gstatic.com/recaptcha/; img-src 'self' blob: data: https://*.cdn.twitter.com https://ton.twitter.com https://*.twimg.com https://analytics.twitter.com https://cm.g.doubleclick.net https://www.google-analytics.com https://maps.googleapis.com https://www.periscope.tv https://www.pscp.tv https://media.riffsy.com https://*.giphy.com https://media.tenor.com https://c.tenor.com https://*.pscp.tv https://*.periscope.tv https://prod-periscope-profile.s3-us-west-2.amazonaws.com https://platform-lookaside.fbsbx.com https://scontent.xx.fbcdn.net https://scontent-sea1-1.xx.fbcdn.net https://*.googleusercontent.com; manifest-src 'self'; media-src 'self' blob: https://twitter.com https://*.twimg.com https://*.vine.co https://*.pscp.tv https://*.video.pscp.tv https://dhdsnappytv-vh.akamaihd.net https://pdhdsnappytv-vh.akamaihd.net https://mdhdsnappytv-vh.akamaihd.net https://mdhdsnappytv-vh.akamaihd.net https://mpdhdsnappytv-vh.akamaihd.net https://mmdhdsnappytv-vh.akamaihd.net https://mdhdsnappytv-vh.akamaihd.net https://mpdhdsnappytv-vh.akamaihd.net https://mmdhdsnappytv-vh.akamaihd.net https://dwo3ckksxlb0v.cloudfront.net; object-src 'none'; script-src 'self' 'unsafe-inline' 
https://*.twimg.com https://recaptcha.net/recaptcha/ https://www.google.com/recaptcha/ https://www.gstatic.com/recaptcha/ https://client-api.arkoselabs.com/ https://www.google-analytics.com https://twitter.com https://app.link https://accounts.google.com/gsi/client https://appleid.cdn-apple.com/appleauth/static/jsapi/appleid/1/en_US/appleid.auth.js  'nonce-ZDUwM2EwNTUtYzlmMi00MDg4LWIyMjUtNjkxYjZmNDkwYTc0'; style-src 'self' 'unsafe-inline' https://accounts.google.com/gsi/style https://*.twimg.com; worker-src 'self' blob:; report-uri https://twitter.com/i/csp_report?a=O5RXE%3D%3D%3D&ro=false", 'strict-transport-security': 'max-age=631138519', 'cross-origin-opener-policy': 'same-origin-allow-popups', 'cross-origin-embedder-policy': 'unsafe-none', 'content-encoding': 'gzip', 'x-response-time': '40', 'x-connection-hash': 'eb30c6d5761dbaf636ec92ba2d467096f7e6b4b3fb9eb94b380b4b52ea051f6a', 'transfer-encoding': 'chunked'}
DEBUG:snscrape.base:https://twitter.com/textfiles retrieved successfully
DEBUG:snscrape.modules.twitter:No guest token in response
INFO:snscrape.modules.twitter:Retrieving guest token via API
INFO:snscrape.base:Retrieving https://api.twitter.com/1.1/guest/activate.json
DEBUG:snscrape.base:... with headers: {'User-Agent': 'Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/79.0.3945.6391 Safari/537.34', 'Authorization': 'Bearer AAAAAAAAAAAAAAAAAAAAANRILgAAAAAAnNwIzUejRCOuH5E6I8xnZz4puTs=1Zv7ttfk8LF81IUq16cHjhLTvJu4FA33AGWWjCpTnA', 'Referer': 'https://twitter.com/search?f=live&lang=en&q=from%3Atextfiles&src=spelling_expansion_revert_click', 'Accept-Language': 'en-US,en;q=0.5'}
DEBUG:snscrape.base:... with environmentSettings: {'proxies': OrderedDict(), 'stream': False, 'verify': True, 'cert': None}
DEBUG:urllib3.connectionpool:Starting new HTTPS connection (1): api.twitter.com:443
DEBUG:snscrape.base:Connected to: ('104.244.42.194', 443)
DEBUG:snscrape.base:Connection cipher: ('TLS_AES_256_GCM_SHA384', 'TLSv1.3', 256)
DEBUG:urllib3.connectionpool:https://api.twitter.com:443 "POST /1.1/guest/activate.json HTTP/1.1" 200 62
INFO:snscrape.base:Retrieved https://api.twitter.com/1.1/guest/activate.json: 200
DEBUG:snscrape.base:... with response headers: {'date': 'Tue, 14 Mar 2023 15:01:31 GMT', 'perf': '7626143928', 'pragma': 'no-cache', 'server': 'tsa_b', 'expires': 'Tue, 31 Mar 1981 05:00:00 GMT', 'content-type': 'application/json; charset=utf-8', 'cache-control': 'no-cache, no-store, must-revalidate, pre-check=0, post-check=0', 'last-modified': 'Tue, 14 Mar 2023 15:01:32 GMT', 'content-length': '62', 'x-frame-options': 'SAMEORIGIN', 'content-encoding': 'gzip', 'x-transaction-id': '7330659cfb4f0222', 'x-xss-protection': '0', 'content-disposition': 'attachment; filename=json.json', 'x-content-type-options': 'nosniff', 'x-twitter-response-tags': 'BouncerCompliant', 'strict-transport-security': 'max-age=631138519', 'x-response-time': '12', 'x-connection-hash': '4a9d7fd9ae39d07b2ab8447fbef44fe0f8729f29e1eab77f66934820a23dab5a'}
DEBUG:snscrape.base:https://api.twitter.com/1.1/guest/activate.json retrieved successfully
DEBUG:snscrape.modules.twitter:Using guest token 1635657418536353797
INFO:snscrape.base:Retrieving https://api.twitter.com/2/search/adaptive.json?include_profile_interstitial_type=1&include_blocking=1&include_blocked_by=1&include_followed_by=1&include_want_retweets=1&include_mute_edge=1&include_can_dm=1&include_can_media_tag=1&include_ext_has_nft_avatar=1&include_ext_is_blue_verified=1&include_ext_verified_type=1&skip_status=1&cards_platform=Web-12&include_cards=1&include_ext_alt_text=true&include_ext_limited_action_results=false&include_quote_count=true&include_reply_count=1&tweet_mode=extended&include_ext_collab_control=true&include_ext_views=true&include_entities=true&include_user_entities=true&include_ext_media_color=true&include_ext_media_availability=true&include_ext_sensitive_media_warning=true&include_ext_trusted_friends_metadata=true&send_error_codes=true&simple_quoted_tweet=true&q=from%3Atextfiles&tweet_search_mode=live&count=20&query_source=spelling_expansion_revert_click&pc=1&spelling_corrections=1&include_ext_edit_control=true&ext=mediaStats%2ChighlightedLabel%2ChasNftAvatar%2CvoiceInfo%2Cenrichments%2CsuperFollowMetadata%2CunmentionInfo%2CeditControl%2Ccollab_control%2Cvibe
DEBUG:snscrape.base:... with headers: {'User-Agent': 'Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/79.0.3945.6391 Safari/537.34', 'Authorization': 'Bearer AAAAAAAAAAAAAAAAAAAAANRILgAAAAAAnNwIzUejRCOuH5E6I8xnZz4puTs=1Zv7ttfk8LF81IUq16cHjhLTvJu4FA33AGWWjCpTnA', 'Referer': 'https://twitter.com/search?f=live&lang=en&q=from%3Atextfiles&src=spelling_expansion_revert_click', 'Accept-Language': 'en-US,en;q=0.5', 'x-guest-token': '1635657418536353797'}
DEBUG:snscrape.base:... with environmentSettings: {'proxies': OrderedDict(), 'stream': False, 'verify': True, 'cert': None}
DEBUG:urllib3.connectionpool:https://api.twitter.com:443 "GET /2/search/adaptive.json?include_profile_interstitial_type=1&include_blocking=1&include_blocked_by=1&include_followed_by=1&include_want_retweets=1&include_mute_edge=1&include_can_dm=1&include_can_media_tag=1&include_ext_has_nft_avatar=1&include_ext_is_blue_verified=1&include_ext_verified_type=1&skip_status=1&cards_platform=Web-12&include_cards=1&include_ext_alt_text=true&include_ext_limited_action_results=false&include_quote_count=true&include_reply_count=1&tweet_mode=extended&include_ext_collab_control=true&include_ext_views=true&include_entities=true&include_user_entities=true&include_ext_media_color=true&include_ext_media_availability=true&include_ext_sensitive_media_warning=true&include_ext_trusted_friends_metadata=true&send_error_codes=true&simple_quoted_tweet=true&q=from%3Atextfiles&tweet_search_mode=live&count=20&query_source=spelling_expansion_revert_click&pc=1&spelling_corrections=1&include_ext_edit_control=true&ext=mediaStats%2ChighlightedLabel%2ChasNftAvatar%2CvoiceInfo%2Cenrichments%2CsuperFollowMetadata%2CunmentionInfo%2CeditControl%2Ccollab_control%2Cvibe HTTP/1.1" 404 92
INFO:snscrape.base:Retrieved https://api.twitter.com/2/search/adaptive.json?include_profile_interstitial_type=1&include_blocking=1&include_blocked_by=1&include_followed_by=1&include_want_retweets=1&include_mute_edge=1&include_can_dm=1&include_can_media_tag=1&include_ext_has_nft_avatar=1&include_ext_is_blue_verified=1&include_ext_verified_type=1&skip_status=1&cards_platform=Web-12&include_cards=1&include_ext_alt_text=true&include_ext_limited_action_results=false&include_quote_count=true&include_reply_count=1&tweet_mode=extended&include_ext_collab_control=true&include_ext_views=true&include_entities=true&include_user_entities=true&include_ext_media_color=true&include_ext_media_availability=true&include_ext_sensitive_media_warning=true&include_ext_trusted_friends_metadata=true&send_error_codes=true&simple_quoted_tweet=true&q=from%3Atextfiles&tweet_search_mode=live&count=20&query_source=spelling_expansion_revert_click&pc=1&spelling_corrections=1&include_ext_edit_control=true&ext=mediaStats%2ChighlightedLabel%2ChasNftAvatar%2CvoiceInfo%2Cenrichments%2CsuperFollowMetadata%2CunmentionInfo%2CeditControl%2Ccollab_control%2Cvibe: 404
DEBUG:snscrape.base:... with response headers: {'date': 'Tue, 14 Mar 2023 15:01:32 UTC', 'perf': '7626143928', 'server': 'tsa_b', 'content-type': 'application/json;charset=utf-8', 'cache-control': 'no-cache, no-store, max-age=0', 'x-transaction-id': '0c5317e940e02494', 'x-rate-limit-limit': '450', 'x-rate-limit-reset': '1678806992', 'x-rate-limit-remaining': '449', 'strict-transport-security': 'max-age=631138519', 'content-encoding': 'gzip', 'content-length': '92', 'x-response-time': '7', 'x-connection-hash': '4a9d7fd9ae39d07b2ab8447fbef44fe0f8729f29e1eab77f66934820a23dab5a'}
ERROR:snscrape.base:Error retrieving https://api.twitter.com/2/search/adaptive.json?include_profile_interstitial_type=1&include_blocking=1&include_blocked_by=1&include_followed_by=1&include_want_retweets=1&include_mute_edge=1&include_can_dm=1&include_can_media_tag=1&include_ext_has_nft_avatar=1&include_ext_is_blue_verified=1&include_ext_verified_type=1&skip_status=1&cards_platform=Web-12&include_cards=1&include_ext_alt_text=true&include_ext_limited_action_results=false&include_quote_count=true&include_reply_count=1&tweet_mode=extended&include_ext_collab_control=true&include_ext_views=true&include_entities=true&include_user_entities=true&include_ext_media_color=true&include_ext_media_availability=true&include_ext_sensitive_media_warning=true&include_ext_trusted_friends_metadata=true&send_error_codes=true&simple_quoted_tweet=true&q=from%3Atextfiles&tweet_search_mode=live&count=20&query_source=spelling_expansion_revert_click&pc=1&spelling_corrections=1&include_ext_edit_control=true&ext=mediaStats%2ChighlightedLabel%2ChasNftAvatar%2CvoiceInfo%2Cenrichments%2CsuperFollowMetadata%2CunmentionInfo%2CeditControl%2Ccollab_control%2Cvibe: non-200 status code (404)
CRITICAL:snscrape.base:1 requests to https://api.twitter.com/2/search/adaptive.json?include_profile_interstitial_type=1&include_blocking=1&include_blocked_by=1&include_followed_by=1&include_want_retweets=1&include_mute_edge=1&include_can_dm=1&include_can_media_tag=1&include_ext_has_nft_avatar=1&include_ext_is_blue_verified=1&include_ext_verified_type=1&skip_status=1&cards_platform=Web-12&include_cards=1&include_ext_alt_text=true&include_ext_limited_action_results=false&include_quote_count=true&include_reply_count=1&tweet_mode=extended&include_ext_collab_control=true&include_ext_views=true&include_entities=true&include_user_entities=true&include_ext_media_color=true&include_ext_media_availability=true&include_ext_sensitive_media_warning=true&include_ext_trusted_friends_metadata=true&send_error_codes=true&simple_quoted_tweet=true&q=from%3Atextfiles&tweet_search_mode=live&count=20&query_source=spelling_expansion_revert_click&pc=1&spelling_corrections=1&include_ext_edit_control=true&ext=mediaStats%2ChighlightedLabel%2ChasNftAvatar%2CvoiceInfo%2Cenrichments%2CsuperFollowMetadata%2CunmentionInfo%2CeditControl%2Ccollab_control%2Cvibe failed, giving up.
CRITICAL:snscrape.base:Errors: non-200 status code (404)
^CTraceback (most recent call last):
  File "_pydevd_bundle/pydevd_cython.pyx", line 577, in _pydevd_bundle.pydevd_cython.PyDBFrame._handle_exception
  File "_pydevd_bundle/pydevd_cython.pyx", line 312, in _pydevd_bundle.pydevd_cython.PyDBFrame.do_wait_suspend
  File "/home/lucsa/.vscode/extensions/ms-python.python-2023.4.1/pythonFiles/lib/python/debugpy/_vendored/pydevd/pydevd.py", line 2070, in do_wait_suspend
    keep_suspended = self._do_wait_suspend(thread, frame, event, arg, suspend_type, from_this_thread, frames_tracker)
  File "/home/lucsa/.vscode/extensions/ms-python.python-2023.4.1/pythonFiles/lib/python/debugpy/_vendored/pydevd/pydevd.py", line 2106, in _do_wait_suspend
    time.sleep(0.01)
KeyboardInterrupt
^CTraceback (most recent call last):
  File "/usr/lib/python3.10/runpy.py", line 196, in _run_module_as_main
    return _run_code(code, main_globals, None,
  File "/usr/lib/python3.10/runpy.py", line 86, in _run_code
    exec(code, run_globals)
  File "/home/lucsa/.vscode/extensions/ms-python.python-2023.4.1/pythonFiles/lib/python/debugpy/adapter/../../debugpy/launcher/../../debugpy/__main__.py", line 39, in <module>
    cli.main()
  File "/home/lucsa/.vscode/extensions/ms-python.python-2023.4.1/pythonFiles/lib/python/debugpy/adapter/../../debugpy/launcher/../../debugpy/../debugpy/server/cli.py", line 430, in main
    run()
  File "/home/lucsa/.vscode/extensions/ms-python.python-2023.4.1/pythonFiles/lib/python/debugpy/adapter/../../debugpy/launcher/../../debugpy/../debugpy/server/cli.py", line 284, in run_file
    runpy.run_path(target, run_name="__main__")
  File "/home/lucsa/.vscode/extensions/ms-python.python-2023.4.1/pythonFiles/lib/python/debugpy/_vendored/pydevd/_pydevd_bundle/pydevd_runpy.py", line 321, in run_path
    return _run_module_code(code, init_globals, run_name,
  File "/home/lucsa/.vscode/extensions/ms-python.python-2023.4.1/pythonFiles/lib/python/debugpy/_vendored/pydevd/_pydevd_bundle/pydevd_runpy.py", line 135, in _run_module_code
    _run_code(code, mod_globals, init_globals,
  File "/home/lucsa/.vscode/extensions/ms-python.python-2023.4.1/pythonFiles/lib/python/debugpy/_vendored/pydevd/_pydevd_bundle/pydevd_runpy.py", line 124, in _run_code
    exec(code, run_globals)
  File "/home/lucsa/Documentos/Nao.py(data_analyst)/captura/snscrape.modules.py", line 15, in <module>
    next(snscrape.modules.twitter.TwitterUserScraper('textfiles', retries = 0).get_items())
  File "_pydevd_bundle/pydevd_cython.pyx", line 1457, in _pydevd_bundle.pydevd_cython.SafeCallWrapper.__call__
  File "_pydevd_bundle/pydevd_cython.pyx", line 701, in _pydevd_bundle.pydevd_cython.PyDBFrame.trace_dispatch
  File "_pydevd_bundle/pydevd_cython.pyx", line 1152, in _pydevd_bundle.pydevd_cython.PyDBFrame.trace_dispatch
  File "_pydevd_bundle/pydevd_cython.pyx", line 1135, in _pydevd_bundle.pydevd_cython.PyDBFrame.trace_dispatch
  File "_pydevd_bundle/pydevd_cython.pyx", line 312, in _pydevd_bundle.pydevd_cython.PyDBFrame.do_wait_suspend
  File "/home/lucsa/.vscode/extensions/ms-python.python-2023.4.1/pythonFiles/lib/python/debugpy/_vendored/pydevd/pydevd.py", line 2070, in do_wait_suspend
    keep_suspended = self._do_wait_suspend(thread, frame, event, arg, suspend_type, from_this_thread, frames_tracker)
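The response headers of the 404 in the log above include x-rate-limit-limit, x-rate-limit-remaining and x-rate-limit-reset. A rough sketch for reading them follows; it assumes the reset value is a Unix timestamp in seconds (it looks like one, but the log does not say so), and the function name rate_limit_summary is hypothetical.

```python
from datetime import datetime, timezone


def rate_limit_summary(headers):
    """Summarise the x-rate-limit-* headers of a response, if present."""
    try:
        limit = int(headers["x-rate-limit-limit"])
        remaining = int(headers["x-rate-limit-remaining"])
        reset_epoch = int(headers["x-rate-limit-reset"])
    except (KeyError, ValueError):
        return None  # headers missing or not numeric
    # Assumption: the reset value is a Unix timestamp in seconds.
    reset_at = datetime.fromtimestamp(reset_epoch, tz=timezone.utc)
    return {"limit": limit, "remaining": remaining, "reset_at": reset_at}


if __name__ == "__main__":
    # Values copied from the 404 response headers in the log above.
    headers = {
        "x-rate-limit-limit": "450",
        "x-rate-limit-reset": "1678806992",
        "x-rate-limit-remaining": "449",
    }
    print(rate_limit_summary(headers))
```

With 449 of 450 requests remaining, the 404 here does not look like a rate-limit rejection, unlike the 429 mentioned earlier in the thread.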

MykhailoYampolskyi commented 1 year ago

Yep, same here. It worked fine yesterday.

JustAnotherArchivist commented 1 year ago

@lucaslpass You are not using the latest snscrape version.
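A quick way to see which version a given interpreter actually imports is to ask the package metadata. This is a sketch only: the helper names are hypothetical, the version parsing is deliberately simplified, and the minimum version to compare against is whatever the caller supplies (the example uses the version reported at the top of this issue).

```python
from importlib import metadata


def numeric_version(version):
    """Turn '0.6.1.20230314' into (0, 6, 1, 20230314).

    Parsing stops at the first component that is not purely numeric,
    so suffixes such as '.dev0' are ignored (a simplification).
    """
    parts = []
    for piece in version.split("."):
        if not piece.isdecimal():
            break
        parts.append(int(piece))
    return tuple(parts)


def installed_version(package):
    """Return the installed version string of `package`, or None."""
    try:
        return metadata.version(package)
    except metadata.PackageNotFoundError:
        return None


def is_at_least(package, minimum):
    """True if `package` is installed at version `minimum` or newer."""
    current = installed_version(package)
    return current is not None and numeric_version(current) >= numeric_version(minimum)


if __name__ == "__main__":
    print("snscrape:", installed_version("snscrape"))
    # 0.6.1.20230314 is the version reported at the top of this issue.
    print(is_at_least("snscrape", "0.6.1.20230314"))
```

Running it with the same interpreter that runs the failing script matters, since different environments can have different versions installed.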

lucaslpass commented 1 year ago

Okay.

kareemrasheed89 commented 1 year ago

ScraperException: 4 requests to https://api.twitter.com/2/search/adaptive.json?include_profile_interstitial_type=1&include_blocking=1&include_blocked_by=1&include_followed_by=1&include_want_retweets=1&include_mute_edge=1&include_can_dm=1&include_can_media_tag=1&include_ext_has_nft_avatar=1&include_ext_is_blue_verified=1&include_ext_verified_type=1&skip_status=1&cards_platform=Web-12&include_cards=1&include_ext_alt_text=true&include_ext_limited_action_results=false&include_quote_count=true&include_reply_count=1&tweet_mode=extended&include_ext_collab_control=true&include_ext_views=true&include_entities=true&include_user_entities=true&include_ext_media_color=true&include_ext_media_availability=true&include_ext_sensitive_media_warning=true&include_ext_trusted_friends_metadata=true&send_error_codes=true&simple_quoted_tweet=true&q=%23NigeriaDecides2023+since%3A2023-02-12+until%3A2023-03-14&tweet_search_mode=live&count=20&query_source=spelling_expansion_revert_click&pc=1&spelling_corrections=1&include_ext_edit_control=true&ext=mediaStats%2ChighlightedLabel%2ChasNftAvatar%2CvoiceInfo%2Cenrichments%2CsuperFollowMetadata%2CunmentionInfo%2CeditControl%2Ccollab_control%2Cvibe failed, giving up. 
Traceback:
  File "/workspace/.heroku/python/lib/python3.10/site-packages/streamlit/runtime/scriptrunner/script_runner.py", line 565, in _run_script
    exec(code, module.__dict__)
  File "/workspace/app.py", line 126, in <module>
    for i,tweet in enumerate(sntwitter.TwitterSearchScraper(f"{hashtag} since:{last_30} until:{today}").get_items()):
  File "/workspace/.heroku/python/lib/python3.10/site-packages/snscrape/modules/twitter.py", line 1556, in get_items
    for obj in self._iter_api_data('https://api.twitter.com/2/search/adaptive.json', _TwitterAPIType.V2, params, paginationParams, cursor = self._cursor):
  File "/workspace/.heroku/python/lib/python3.10/site-packages/snscrape/modules/twitter.py", line 737, in _iter_api_data
    obj = self._get_api_data(endpoint, apiType, reqParams)
  File "/workspace/.heroku/python/lib/python3.10/site-packages/snscrape/modules/twitter.py", line 703, in _get_api_data
    r = self._get(endpoint, params = params, headers = self._apiHeaders, responseOkCallback = self._check_api_response)
  File "/workspace/.heroku/python/lib/python3.10/site-packages/snscrape/base.py", line 257, in _get
    return self._request('GET', *args, **kwargs)
  File "/workspace/.heroku/python/lib/python3.10/site-packages/snscrape/base.py", line 253, in _request
    raise ScraperException(msg)

kareemrasheed89 commented 1 year ago

This is the bug I keep getting.

Write commented 1 year ago

Guys, we are dumb. Just upgrading snscrape fixes the issue...

pip install -U snscrape

@JustAnotherArchivist, I offer you my sincere apologies. And thank you again for your massive work. It must be annoying to deal with dummies like us.

MykhailoYampolskyi commented 1 year ago

My version is 0.6.1.20230314, the same as Promkita's. Still not fixed. The problem must be on Twitter's end.

Write commented 1 year ago

my version is 0.6.1.20230314, so as Promkita's. still not fixed. The problem must be on Twitter's end.

Send a proper debug log then, as seen here:

I'm not sure how you produced that log, but it doesn't contain a single info or debug message from snscrape.

Try this:

import logging
import snscrape.modules.twitter

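# Enable debug-level logging so snscrape's requests and responses are shown
logging.basicConfig(level = logging.DEBUG)
# Fetch a single item without retries to keep the log short
next(snscrape.modules.twitter.TwitterUserScraper('textfiles', retries = 0).get_items())

MykhailoYampolskyi commented 1 year ago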

Here is the log:

INFO:snscrape.modules.twitter:Retrieving scroll page None
DEBUG:snscrape.modules.twitter:Using guest token 1635654757783150593
INFO:snscrape.base:Retrieving https://api.twitter.com/2/search/adaptive.json?include_profile_interstitial_type=1&include_blocking=1&include_blocked_by=1&include_followed_by=1&include_want_retweets=1&include_mute_edge=1&include_can_dm=1&include_can_media_tag=1&include_ext_has_nft_avatar=1&include_ext_is_blue_verified=1&include_ext_verified_type=1&skip_status=1&cards_platform=Web-12&include_cards=1&include_ext_alt_text=true&include_ext_limited_action_results=false&include_quote_count=true&include_reply_count=1&tweet_mode=extended&include_ext_collab_control=true&include_ext_views=true&include_entities=true&include_user_entities=true&include_ext_media_color=true&include_ext_media_availability=true&include_ext_sensitive_media_warning=true&include_ext_trusted_friends_metadata=true&send_error_codes=true&simple_quoted_tweet=true&q=from%3Atextfiles&tweet_search_mode=live&count=20&query_source=spelling_expansion_revert_click&pc=1&spelling_corrections=1&include_ext_edit_control=true&ext=mediaStats%2ChighlightedLabel%2ChasNftAvatar%2CvoiceInfo%2Cenrichments%2CsuperFollowMetadata%2CunmentionInfo%2CeditControl%2Ccollab_control%2Cvibe
DEBUG:snscrape.base:... with headers: {'User-Agent': 'Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/79.0.3945.9269 Safari/537.44', 'Authorization': 'Bearer AAAAAAAAAAAAAAAAAAAAANRILgAAAAAAnNwIzUejRCOuH5E6I8xnZz4puTs=1Zv7ttfk8LF81IUq16cHjhLTvJu4FA33AGWWjCpTnA', 'Referer': 'https://twitter.com/search?f=live&lang=en&q=from%3Atextfiles&src=spelling_expansion_revert_click', 'Accept-Language': 'en-US,en;q=0.5', 'x-guest-token': '1635654757783150593'}
DEBUG:snscrape.base:... with environmentSettings: {'proxies': OrderedDict(), 'stream': False, 'verify': True, 'cert': None}
DEBUG:urllib3.connectionpool:Starting new HTTPS connection (1): api.twitter.com:443
DEBUG:snscrape.base:Connected to: ('104.244.42.194', 443)
DEBUG:snscrape.base:Connection cipher: ('TLS_AES_256_GCM_SHA384', 'TLSv1.3', 256)
DEBUG:urllib3.connectionpool:https://api.twitter.com:443 "GET /2/search/adaptive.json?include_profile_interstitial_type=1&include_blocking=1&include_blocked_by=1&include_followed_by=1&include_want_retweets=1&include_mute_edge=1&include_can_dm=1&include_can_media_tag=1&include_ext_has_nft_avatar=1&include_ext_is_blue_verified=1&include_ext_verified_type=1&skip_status=1&cards_platform=Web-12&include_cards=1&include_ext_alt_text=true&include_ext_limited_action_results=false&include_quote_count=true&include_reply_count=1&tweet_mode=extended&include_ext_collab_control=true&include_ext_views=true&include_entities=true&include_user_entities=true&include_ext_media_color=true&include_ext_media_availability=true&include_ext_sensitive_media_warning=true&include_ext_trusted_friends_metadata=true&send_error_codes=true&simple_quoted_tweet=true&q=from%3Atextfiles&tweet_search_mode=live&count=20&query_source=spelling_expansion_revert_click&pc=1&spelling_corrections=1&include_ext_edit_control=true&ext=mediaStats%2ChighlightedLabel%2ChasNftAvatar%2CvoiceInfo%2Cenrichments%2CsuperFollowMetadata%2CunmentionInfo%2CeditControl%2Ccollab_control%2Cvibe HTTP/1.1" 404 92
INFO:snscrape.base:Retrieved https://api.twitter.com/2/search/adaptive.json?include_profile_interstitial_type=1&include_blocking=1&include_blocked_by=1&include_followed_by=1&include_want_retweets=1&include_mute_edge=1&include_can_dm=1&include_can_media_tag=1&include_ext_has_nft_avatar=1&include_ext_is_blue_verified=1&include_ext_verified_type=1&skip_status=1&cards_platform=Web-12&include_cards=1&include_ext_alt_text=true&include_ext_limited_action_results=false&include_quote_count=true&include_reply_count=1&tweet_mode=extended&include_ext_collab_control=true&include_ext_views=true&include_entities=true&include_user_entities=true&include_ext_media_color=true&include_ext_media_availability=true&include_ext_sensitive_media_warning=true&include_ext_trusted_friends_metadata=true&send_error_codes=true&simple_quoted_tweet=true&q=from%3Atextfiles&tweet_search_mode=live&count=20&query_source=spelling_expansion_revert_click&pc=1&spelling_corrections=1&include_ext_edit_control=true&ext=mediaStats%2ChighlightedLabel%2ChasNftAvatar%2CvoiceInfo%2Cenrichments%2CsuperFollowMetadata%2CunmentionInfo%2CeditControl%2Ccollab_control%2Cvibe: 404
DEBUG:snscrape.base:... with response headers: {'date': 'Tue, 14 Mar 2023 15:56:03 UTC', 'perf': '7626143928', 'server': 'tsa_o', 'set-cookie': 'guest_id_marketing=v1%3A167880936363182687; Max-Age=63072000; Expires=Thu, 13 Mar 2025 15:56:03 GMT; Path=/; Domain=.twitter.com; Secure; SameSite=None, guest_id_ads=v1%3A167880936363182687; Max-Age=63072000; Expires=Thu, 13 Mar 2025 15:56:03 GMT; Path=/; Domain=.twitter.com; Secure; SameSite=None, personalization_id="v1_GHXMS3K3zz0NIqYWMRPsig=="; Max-Age=63072000; Expires=Thu, 13 Mar 2025 15:56:03 GMT; Path=/; Domain=.twitter.com; Secure; SameSite=None, guest_id=v1%3A167880936363182687; Max-Age=63072000; Expires=Thu, 13 Mar 2025 15:56:03 GMT; Path=/; Domain=.twitter.com; Secure; SameSite=None', 'content-type': 'application/json;charset=utf-8', 'cache-control': 'no-cache, no-store, max-age=0', 'x-transaction-id': '6fe28151cd5b3bd2', 'x-rate-limit-limit': '450', 'x-rate-limit-reset': '1678809603', 'x-rate-limit-remaining': '428', 'strict-transport-security': 'max-age=631138519', 'content-encoding': 'gzip', 'content-length': '92', 'x-response-time': '111', 'x-connection-hash': '1abdcd9f837fc59cf9dff2496335f68c8c7d8b6d3373173683c1523de1b1ff0e'}
ERROR:snscrape.base:Error retrieving https://api.twitter.com/2/search/adaptive.json?include_profile_interstitial_type=1&include_blocking=1&include_blocked_by=1&include_followed_by=1&include_want_retweets=1&include_mute_edge=1&include_can_dm=1&include_can_media_tag=1&include_ext_has_nft_avatar=1&include_ext_is_blue_verified=1&include_ext_verified_type=1&skip_status=1&cards_platform=Web-12&include_cards=1&include_ext_alt_text=true&include_ext_limited_action_results=false&include_quote_count=true&include_reply_count=1&tweet_mode=extended&include_ext_collab_control=true&include_ext_views=true&include_entities=true&include_user_entities=true&include_ext_media_color=true&include_ext_media_availability=true&include_ext_sensitive_media_warning=true&include_ext_trusted_friends_metadata=true&send_error_codes=true&simple_quoted_tweet=true&q=from%3Atextfiles&tweet_search_mode=live&count=20&query_source=spelling_expansion_revert_click&pc=1&spelling_corrections=1&include_ext_edit_control=true&ext=mediaStats%2ChighlightedLabel%2ChasNftAvatar%2CvoiceInfo%2Cenrichments%2CsuperFollowMetadata%2CunmentionInfo%2CeditControl%2Ccollab_control%2Cvibe: non-200 status code (404)
CRITICAL:snscrape.base:1 requests to https://api.twitter.com/2/search/adaptive.json?include_profile_interstitial_type=1&include_blocking=1&include_blocked_by=1&include_followed_by=1&include_want_retweets=1&include_mute_edge=1&include_can_dm=1&include_can_media_tag=1&include_ext_has_nft_avatar=1&include_ext_is_blue_verified=1&include_ext_verified_type=1&skip_status=1&cards_platform=Web-12&include_cards=1&include_ext_alt_text=true&include_ext_limited_action_results=false&include_quote_count=true&include_reply_count=1&tweet_mode=extended&include_ext_collab_control=true&include_ext_views=true&include_entities=true&include_user_entities=true&include_ext_media_color=true&include_ext_media_availability=true&include_ext_sensitive_media_warning=true&include_ext_trusted_friends_metadata=true&send_error_codes=true&simple_quoted_tweet=true&q=from%3Atextfiles&tweet_search_mode=live&count=20&query_source=spelling_expansion_revert_click&pc=1&spelling_corrections=1&include_ext_edit_control=true&ext=mediaStats%2ChighlightedLabel%2ChasNftAvatar%2CvoiceInfo%2Cenrichments%2CsuperFollowMetadata%2CunmentionInfo%2CeditControl%2Ccollab_control%2Cvibe failed, giving up.
CRITICAL:snscrape.base:Errors: non-200 status code (404)

JustAnotherArchivist commented 1 year ago

@MykhailoYampolskyi You are not using the latest version of snscrape, 0.6.1. You are using 0.6.0 or older.

MykhailoYampolskyi commented 1 year ago

But it shows 0.6.1.20230314? Is it the latest?

JustAnotherArchivist commented 1 year ago

That is the latest, but the debug log was not created with that version. Perhaps you have two different versions of snscrape installed, and your python3 foo.py imports a different one than snscrape --version executes.
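
One way to check for such a mismatch is to ask the interpreter that runs the script which snscrape it actually sees. The following is a minimal sketch, assuming Python 3.8 or newer (for `importlib.metadata`); `describe_install` is a hypothetical helper name used for illustration, not part of snscrape.

```python
import importlib.metadata
import importlib.util
import sys


def describe_install(dist_name, module_name=None):
    """Report how this interpreter sees a package.

    Returns a tuple (interpreter path, installed version or None,
    file the module would be imported from or None).
    """
    module_name = module_name or dist_name
    try:
        # Version recorded in the package metadata visible to this interpreter
        version = importlib.metadata.version(dist_name)
    except importlib.metadata.PackageNotFoundError:
        version = None
    # Locate the module without importing it
    spec = importlib.util.find_spec(module_name)
    location = spec.origin if spec is not None else None
    return sys.executable, version, location


if __name__ == '__main__':
    exe, version, location = describe_install('snscrape')
    print('interpreter:', exe)
    print('snscrape version:', version)
    print('imported from:', location)
```

Run this with the same command that runs the failing script (for example `python3`) and compare the version with the output of `snscrape --version`. If they differ, upgrading with `python3 -m pip install -U snscrape` targets the interpreter that the script uses.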

gchartung commented 1 year ago

Upgrading fixed the error for me as well. :embarrassed:

MykhailoYampolskyi commented 1 year ago

OK, I'll try to sort it out. Thank you, JustAnotherArchivist.

Edit: Fixed

kareemrasheed89 commented 1 year ago

@JustAnotherArchivist I just retried the dev version of snscrape and it's working now. Thanks for the support.

JustAnotherArchivist commented 1 year ago

@kareemrasheed89 It would work with the release as well. There's no difference between the two currently.

JustAnotherArchivist commented 1 year ago

Until someone produces a debug log with version 0.6.1 showing the issue, I will consider this invalid.