Closed. ghost closed this issue 3 years ago.
Seems to work fine for me. Please try to reproduce this with the CLI and upload the dump file it produces on the error.
It worked fine for several hours. I also tried the following command, which still worked fine at the beginning but reported the same error after several hours.
snscrape --jsonl --progress --max-results 50000000000 --since 2021-01-01 twitter-search "vaccine until:2021-05-31" > text-query-vaccine_tweets.json
Can we troubleshoot without the dump file? Sorry, but I'm not sure how to get the dump file.
With the CLI, you should see a FATAL-level log message like "Dumped stack and locals to $FILENAME" on the crash.
The following is all I got, but I still did not see the dump file.
2021-08-30 00:42:39.215 ERROR snscrape.base Error retrieving https://api.twitter.com/2/search/adaptive.json?include_profile_interstitial_type=1&include_blocking=1&include_blocked_by=1&include_followed_by=1&include_want_retweets=1&include_mute_edge=1&include_can_dm=1&include_can_media_tag=1&skip_status=1&cards_platform=Web-12&include_cards=1&include_ext_alt_text=true&include_quote_count=true&include_reply_count=1&tweet_mode=extended&include_entities=true&include_user_entities=true&include_ext_media_color=true&include_ext_media_availability=true&send_error_codes=true&simple_quoted_tweets=true&q=vaccine+until%3A2021-05-31&tweet_search_mode=live&count=100&query_source=spelling_expansion_revert_click&cursor=scroll%3AthGAVUV0VFVBaCwKzF_t-g5SYWgoC58fy05eomEnEVu-SXBBWAiXoYBE5FV1M1ARX8tAMVAAA%3D&pc=1&spelling_corrections=1&ext=mediaStats%2ChighlightedLabel: non-200 status code
2021-08-30 00:42:39.215 CRITICAL snscrape.base 4 requests to https://api.twitter.com/2/search/adaptive.json?include_profile_interstitial_type=1&include_blocking=1&include_blocked_by=1&include_followed_by=1&include_want_retweets=1&include_mute_edge=1&include_can_dm=1&include_can_media_tag=1&skip_status=1&cards_platform=Web-12&include_cards=1&include_ext_alt_text=true&include_quote_count=true&include_reply_count=1&tweet_mode=extended&include_entities=true&include_user_entities=true&include_ext_media_color=true&include_ext_media_availability=true&send_error_codes=true&simple_quoted_tweets=true&q=vaccine+until%3A2021-05-31&tweet_search_mode=live&count=100&query_source=spelling_expansion_revert_click&cursor=scroll%3AthGAVUV0VFVBaCwKzF_t-g5SYWgoC58fy05eomEnEVu-SXBBWAiXoYBE5FV1M1ARX8tAMVAAA%3D&pc=1&spelling_corrections=1&ext=mediaStats%2ChighlightedLabel failed, giving up.
Traceback (most recent call last):
File "c:\users\zw60\anaconda3\lib\site-packages\snscrape\_cli.py", line 115, in _dump_locals_on_exception
yield
File "c:\users\zw60\anaconda3\lib\site-packages\snscrape\_cli.py", line 280, in main
for i, item in enumerate(scraper.get_items(), start = 1):
File "c:\users\zw60\anaconda3\lib\site-packages\snscrape\modules\twitter.py", line 521, in get_items
for obj in self._iter_api_data('https://api.twitter.com/2/search/adaptive.json', params, paginationParams, cursor = self._cursor):
File "c:\users\zw60\anaconda3\lib\site-packages\snscrape\modules\twitter.py", line 232, in _iter_api_data
obj = self._get_api_data(endpoint, reqParams)
File "c:\users\zw60\anaconda3\lib\site-packages\snscrape\modules\twitter.py", line 202, in _get_api_data
r = self._get(endpoint, params = params, headers = self._apiHeaders, responseOkCallback = self._check_api_response)
File "c:\users\zw60\anaconda3\lib\site-packages\snscrape\base.py", line 199, in _get
return self._request('GET', *args, **kwargs)
File "c:\users\zw60\anaconda3\lib\site-packages\snscrape\base.py", line 195, in _request
raise ScraperException(msg)
snscrape.base.ScraperException: 4 requests to https://api.twitter.com/2/search/adaptive.json?include_profile_interstitial_type=1&include_blocking=1&include_blocked_by=1&include_followed_by=1&include_want_retweets=1&include_mute_edge=1&include_can_dm=1&include_can_media_tag=1&skip_status=1&cards_platform=Web-12&include_cards=1&include_ext_alt_text=true&include_quote_count=true&include_reply_count=1&tweet_mode=extended&include_entities=true&include_user_entities=true&include_ext_media_color=true&include_ext_media_availability=true&send_error_codes=true&simple_quoted_tweets=true&q=vaccine+until%3A2021-05-31&tweet_search_mode=live&count=100&query_source=spelling_expansion_revert_click&cursor=scroll%3AthGAVUV0VFVBaCwKzF_t-g5SYWgoC58fy05eomEnEVu-SXBBWAiXoYBE5FV1M1ARX8tAMVAAA%3D&pc=1&spelling_corrections=1&ext=mediaStats%2ChighlightedLabel failed, giving up.
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
File "c:\users\zw60\anaconda3\lib\runpy.py", line 194, in _run_module_as_main
return _run_code(code, main_globals, None,
File "c:\users\zw60\anaconda3\lib\runpy.py", line 87, in _run_code
exec(code, run_globals)
File "C:\Users\zw60\Anaconda3\Scripts\snscrape.exe\__main__.py", line 7, in <module>
File "c:\users\zw60\anaconda3\lib\site-packages\snscrape\_cli.py", line 300, in main
print(f'Finished, {i} results', file = sys.stderr)
File "c:\users\zw60\anaconda3\lib\contextlib.py", line 131, in __exit__
self.gen.throw(type, value, traceback)
File "c:\users\zw60\anaconda3\lib\site-packages\snscrape\_cli.py", line 119, in _dump_locals_on_exception
name = _dump_stack_and_locals(trace[1:], exc = e)
File "c:\users\zw60\anaconda3\lib\site-packages\snscrape\_cli.py", line 149, in _dump_stack_and_locals
fp.write(varRepr.replace('\n', '\n '))
File "c:\users\zw60\anaconda3\lib\tempfile.py", line 473, in func_wrapper
return func(*args, **kwargs)
File "c:\users\zw60\anaconda3\lib\encodings\cp1252.py", line 19, in encode
return codecs.charmap_encode(input,self.errors,encoding_table)[0]
UnicodeEncodeError: 'charmap' codec can't encode characters in position 10420-10425: character maps to <undefined>
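The UnicodeEncodeError above can be reproduced in isolation: Windows consoles often default to the cp1252 codec, which covers only a small Latin repertoire, so characters common in tweet text (emoji, many scripts) cannot be mapped. A minimal sketch (the sample string is hypothetical, not taken from the actual dump):

```python
# Minimal repro of the cp1252 failure mode seen in the traceback above.
# Emoji and most non-Latin characters have no cp1252 code point, so
# encoding them raises UnicodeEncodeError ("character maps to <undefined>").
sample = "vaccine \U0001F489"  # hypothetical tweet fragment with a syringe emoji

try:
    sample.encode("cp1252")
except UnicodeEncodeError as e:
    print(type(e).__name__)  # UnicodeEncodeError
```

Switching the console (or Python's I/O) to UTF-8 avoids the error, since UTF-8 can encode any character.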
Ah yes, Windows's madness strikes again. That's #122. Please fix your terminal and retry.
I changed the character encoding in Windows to UTF-8 but still received the following error:
Scraping, 461300 results so far
2021-08-30 14:03:31.588 ERROR snscrape.base Error retrieving https://api.twitter.com/2/search/adaptive.json?include_profile_interstitial_type=1&include_blocking=1&include_blocked_by=1&include_followed_by=1&include_want_retweets=1&include_mute_edge=1&include_can_dm=1&include_can_media_tag=1&skip_status=1&cards_platform=Web-12&include_cards=1&include_ext_alt_text=true&include_quote_count=true&include_reply_count=1&tweet_mode=extended&include_entities=true&include_user_entities=true&include_ext_media_color=true&include_ext_media_availability=true&send_error_codes=true&simple_quoted_tweets=true&q=vaccine+until%3A2021-05-31&tweet_search_mode=live&count=100&query_source=spelling_expansion_revert_click&cursor=scroll%3AthGAVUV0VFVBaEwKzVwM6B5iYWgoC58fy05eomEnEV-465AxWAiXoYBE5FV1M1ARWi9gIVAAA%3D&pc=1&spelling_corrections=1&ext=mediaStats%2ChighlightedLabel: non-200 status code
2021-08-30 14:03:31.588 CRITICAL snscrape.base 4 requests to https://api.twitter.com/2/search/adaptive.json?include_profile_interstitial_type=1&include_blocking=1&include_blocked_by=1&include_followed_by=1&include_want_retweets=1&include_mute_edge=1&include_can_dm=1&include_can_media_tag=1&skip_status=1&cards_platform=Web-12&include_cards=1&include_ext_alt_text=true&include_quote_count=true&include_reply_count=1&tweet_mode=extended&include_entities=true&include_user_entities=true&include_ext_media_color=true&include_ext_media_availability=true&send_error_codes=true&simple_quoted_tweets=true&q=vaccine+until%3A2021-05-31&tweet_search_mode=live&count=100&query_source=spelling_expansion_revert_click&cursor=scroll%3AthGAVUV0VFVBaEwKzVwM6B5iYWgoC58fy05eomEnEV-465AxWAiXoYBE5FV1M1ARWi9gIVAAA%3D&pc=1&spelling_corrections=1&ext=mediaStats%2ChighlightedLabel failed, giving up.
2021-08-30 14:03:31.696 CRITICAL snscrape._cli Dumped stack and locals to C:\Users\zw60\AppData\Local\Temp\snscrape_locals_r5sdwzso
Traceback (most recent call last):
File "c:\users\zw60\anaconda3\lib\runpy.py", line 194, in _run_module_as_main
return _run_code(code, main_globals, None,
File "c:\users\zw60\anaconda3\lib\runpy.py", line 87, in _run_code
exec(code, run_globals)
File "C:\Users\zw60\Anaconda3\Scripts\snscrape.exe\__main__.py", line 7, in <module>
File "c:\users\zw60\anaconda3\lib\site-packages\snscrape\_cli.py", line 280, in main
for i, item in enumerate(scraper.get_items(), start = 1):
File "c:\users\zw60\anaconda3\lib\site-packages\snscrape\modules\twitter.py", line 521, in get_items
for obj in self._iter_api_data('https://api.twitter.com/2/search/adaptive.json', params, paginationParams, cursor = self._cursor):
File "c:\users\zw60\anaconda3\lib\site-packages\snscrape\modules\twitter.py", line 232, in _iter_api_data
obj = self._get_api_data(endpoint, reqParams)
File "c:\users\zw60\anaconda3\lib\site-packages\snscrape\modules\twitter.py", line 202, in _get_api_data
r = self._get(endpoint, params = params, headers = self._apiHeaders, responseOkCallback = self._check_api_response)
File "c:\users\zw60\anaconda3\lib\site-packages\snscrape\base.py", line 199, in _get
return self._request('GET', *args, **kwargs)
File "c:\users\zw60\anaconda3\lib\site-packages\snscrape\base.py", line 195, in _request
raise ScraperException(msg)
snscrape.base.ScraperException: 4 requests to https://api.twitter.com/2/search/adaptive.json?include_profile_interstitial_type=1&include_blocking=1&include_blocked_by=1&include_followed_by=1&include_want_retweets=1&include_mute_edge=1&include_can_dm=1&include_can_media_tag=1&skip_status=1&cards_platform=Web-12&include_cards=1&include_ext_alt_text=true&include_quote_count=true&include_reply_count=1&tweet_mode=extended&include_entities=true&include_user_entities=true&include_ext_media_color=true&include_ext_media_availability=true&send_error_codes=true&simple_quoted_tweets=true&q=vaccine+until%3A2021-05-31&tweet_search_mode=live&count=100&query_source=spelling_expansion_revert_click&cursor=scroll%3AthGAVUV0VFVBaEwKzVwM6B5iYWgoC58fy05eomEnEV-465AxWAiXoYBE5FV1M1ARWi9gIVAAA%3D&pc=1&spelling_corrections=1&ext=mediaStats%2ChighlightedLabel failed, giving up.
That's expected. The fourth line is that message with the dump file path. (Sorry, it's CRITICAL, not FATAL as in the code. Apparently that's an alias in logging.)
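The alias can be verified directly in Python's standard logging module:

```python
import logging

# In the stdlib logging module, FATAL is an alias for CRITICAL:
# both name the same numeric level (50), and Logger.fatal() is
# equivalent to Logger.critical().
assert logging.FATAL == logging.CRITICAL == 50
assert logging.getLevelName(logging.FATAL) == "CRITICAL"
```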
Can I share the dump file with you through Google Drive? If so, could you please provide your email address?
I've sent you an email.
Thank you! The dump file was attached to my reply.
Thank you. I can't tell what the underlying cause is, but the issue leading to the crash appears to be the same as what @eyaler reported in #247. I'll try again to reproduce this.
I managed to reproduce it. It happens after exactly 3 hours. Fix coming shortly.
Hello, just wondering if this issue has been fixed. I still have the problem.
Yes, the commit referenced above fixed this, so the current development version should work.
Hi there. Nice work with the package, by the way. I'm using the current development version and I still run into the problem sometimes.
I am still getting this issue.
Please provide the output of snscrape --version?
If you would kindly assist: it was working earlier in the day, then this started.
ScraperException Traceback (most recent call last)
4 frames
/usr/local/lib/python3.8/dist-packages/snscrape/base.py in _request(self, method, url, params, data, headers, timeout, responseOkCallback, allowRedirects, proxies)
    215     msg = f'{self._retries + 1} requests to {req.url} failed, giving up.'
    216     logger.fatal(msg)
--> 217     raise ScraperException(msg)
    218     raise RuntimeError('Reached unreachable code')
    219
Same issue
ERROR:snscrape.base:Error retrieving https://api.twitter.com/2/search/adaptive.json?include_profile_interstitial_type=1&include_blocking=1&include_blocked_by=1&include_followed_by=1&include_want_retweets=1&include_mute_edge=1&include_can_dm=1&include_can_media_tag=1&include_ext_has_nft_avatar=1&skip_status=1&cards_platform=Web-12&include_cards=1&include_ext_alt_text=true&include_quote_count=true&include_reply_count=1&tweet_mode=extended&include_entities=true&include_user_entities=true&include_ext_media_color=true&include_ext_media_availability=true&include_ext_sensitive_media_warning=true&include_ext_trusted_friends_metadata=true&send_error_codes=true&simple_quoted_tweet=true&q=%23CRM+OR+%23Salesforce%2C+since%3A2022-01-01+until%3A2023-01-10&tweet_search_mode=live&count=20&query_source=spelling_expansion_revert_click&pc=1&spelling_corrections=1&ext=mediaStats%2ChighlightedLabel%2ChasNftAvatar%2CvoiceInfo%2Cenrichments%2CsuperFollowMetadata%2CunmentionInfo: non-200 status code
CRITICAL:snscrape.base:4 requests to https://api.twitter.com/2/search/adaptive.json?include_profile_interstitial_type=1&include_blocking=1&include_blocked_by=1&include_followed_by=1&include_want_retweets=1&include_mute_edge=1&include_can_dm=1&include_can_media_tag=1&include_ext_has_nft_avatar=1&skip_status=1&cards_platform=Web-12&include_cards=1&include_ext_alt_text=true&include_quote_count=true&include_reply_count=1&tweet_mode=extended&include_entities=true&include_user_entities=true&include_ext_media_color=true&include_ext_media_availability=true&include_ext_sensitive_media_warning=true&include_ext_trusted_friends_metadata=true&send_error_codes=true&simple_quoted_tweet=true&q=%23CRM+OR+%23Salesforce%2C+since%3A2022-01-01+until%3A2023-01-10&tweet_search_mode=live&count=20&query_source=spelling_expansion_revert_click&pc=1&spelling_corrections=1&ext=mediaStats%2ChighlightedLabel%2ChasNftAvatar%2CvoiceInfo%2Cenrichments%2CsuperFollowMetadata%2CunmentionInfo failed, giving up.
ScraperException Traceback (most recent call last)
in
      5
      6 #Using TwitterSearchScraper to scrape data and append tweets to list
----> 7 for i,tweet in enumerate(sntwitter.TwitterSearchScraper('#CRM OR #Salesforce, since:2022-01-01 until:2023-01-10').get_items()):
      8     if i>250000:
      9         break
4 frames
/usr/local/lib/python3.8/dist-packages/snscrape/base.py in _request(self, method, url, params, data, headers, timeout, responseOkCallback, allowRedirects, proxies)
    215     msg = f'{self._retries + 1} requests to {req.url} failed, giving up.'
    216     logger.fatal(msg)
--> 217     raise ScraperException(msg)
    218     raise RuntimeError('Reached unreachable code')
    219
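The base.py frame shown in these tracebacks reflects a simple retry loop: the request is attempted a fixed number of times, and once every attempt fails a ScraperException is raised with the "N requests ... failed, giving up" message. A hedged sketch of that pattern (illustrative only, not snscrape's actual implementation; the names are hypothetical):

```python
class ScraperException(Exception):
    pass

def request_with_retries(fetch, retries=3):
    """Try fetch() up to retries + 1 times, then give up with ScraperException."""
    for _ in range(retries + 1):
        try:
            return fetch()
        except OSError:
            # Treat the attempt as failed (e.g. connection error or a
            # response the caller deemed not OK) and retry.
            continue
    raise ScraperException(f'{retries + 1} requests failed, giving up.')
```

With the default of three retries this produces exactly the "4 requests ... failed, giving up." wording seen in the logs above.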
Locking this because people keep using it for things that aren't what the original issue was about.
Python 3.8, Windows 10 Pro
I got the following error:
Error retrieving https://api.twitter.com/2/search/adaptive.json?include_profile_interstitial_type=1&include_blocking=1&include_blocked_by=1&include_followed_by=1&include_want_retweets=1&include_mute_edge=1&include_can_dm=1&include_can_media_tag=1&skip_status=1&cards_platform=Web-12&include_cards=1&include_ext_alt_text=true&include_quote_count=true&include_reply_count=1&tweet_mode=extended&include_entities=true&include_user_entities=true&include_ext_media_color=true&include_ext_media_availability=true&send_error_codes=true&simple_quoted_tweets=true&q=vaccine+since%3A2021-01-01+until%3A2021-05-31&tweet_search_mode=live&count=100&query_source=spelling_expansion_revert_click&cursor=scroll%3AthGAVUV0VFVBaAgKmRi82l5SYWgoC58fy05eomEnEV_5iSBBWAiXoYBE5FV1M1ARWasQMVAAA%3D&pc=1&spelling_corrections=1&ext=mediaStats%2ChighlightedLabel: non-200 status code
4 requests to https://api.twitter.com/2/search/adaptive.json?include_profile_interstitial_type=1&include_blocking=1&include_blocked_by=1&include_followed_by=1&include_want_retweets=1&include_mute_edge=1&include_can_dm=1&include_can_media_tag=1&skip_status=1&cards_platform=Web-12&include_cards=1&include_ext_alt_text=true&include_quote_count=true&include_reply_count=1&tweet_mode=extended&include_entities=true&include_user_entities=true&include_ext_media_color=true&include_ext_media_availability=true&send_error_codes=true&simple_quoted_tweets=true&q=vaccine+since%3A2021-01-01+until%3A2021-05-31&tweet_search_mode=live&count=100&query_source=spelling_expansion_revert_click&cursor=scroll%3AthGAVUV0VFVBaAgKmRi82l5SYWgoC58fy05eomEnEV_5iSBBWAiXoYBE5FV1M1ARWasQMVAAA%3D&pc=1&spelling_corrections=1&ext=mediaStats%2ChighlightedLabel failed, giving up.
ScraperException: 4 requests to https://api.twitter.com/2/search/adaptive.json?include_profile_interstitial_type=1&include_blocking=1&include_blocked_by=1&include_followed_by=1&include_want_retweets=1&include_mute_edge=1&include_can_dm=1&include_can_media_tag=1&skip_status=1&cards_platform=Web-12&include_cards=1&include_ext_alt_text=true&include_quote_count=true&include_reply_count=1&tweet_mode=extended&include_entities=true&include_user_entities=true&include_ext_media_color=true&include_ext_media_availability=true&send_error_codes=true&simple_quoted_tweets=true&q=vaccine+since%3A2021-01-01+until%3A2021-05-31&tweet_search_mode=live&count=100&query_source=spelling_expansion_revert_click&cursor=scroll%3AthGAVUV0VFVBaAgKmRi82l5SYWgoC58fy05eomEnEV_5iSBBWAiXoYBE5FV1M1ARWasQMVAAA%3D&pc=1&spelling_corrections=1&ext=mediaStats%2ChighlightedLabel failed, giving up.
I used the following code: