JoMingyu / google-play-scraper

Google play scraper for Python inspired by <facundoolano/google-play-scraper>
MIT License
757 stars, 206 forks

Connection reset by peer - URLError #107

Closed: anastasiaisakov closed this issue 2 years ago

anastasiaisakov commented 3 years ago

google_play_scraper.1.0.2 Write result of google_play_scraper.1.0.2

Traceback (most recent call last):
  File "/Library/Frameworks/Python.framework/Versions/3.9/lib/python3.9/urllib/request.py", line 1346, in do_open
    h.request(req.get_method(), req.selector, req.data, headers,
  File "/Library/Frameworks/Python.framework/Versions/3.9/lib/python3.9/http/client.py", line 1255, in request
    self._send_request(method, url, body, headers, encode_chunked)
  File "/Library/Frameworks/Python.framework/Versions/3.9/lib/python3.9/http/client.py", line 1301, in _send_request
    self.endheaders(body, encode_chunked=encode_chunked)
  File "/Library/Frameworks/Python.framework/Versions/3.9/lib/python3.9/http/client.py", line 1250, in endheaders
    self._send_output(message_body, encode_chunked=encode_chunked)
  File "/Library/Frameworks/Python.framework/Versions/3.9/lib/python3.9/http/client.py", line 1010, in _send_output
    self.send(msg)
  File "/Library/Frameworks/Python.framework/Versions/3.9/lib/python3.9/http/client.py", line 950, in send
    self.connect()
  File "/Library/Frameworks/Python.framework/Versions/3.9/lib/python3.9/http/client.py", line 1424, in connect
    self.sock = self._context.wrap_socket(self.sock,
  File "/Library/Frameworks/Python.framework/Versions/3.9/lib/python3.9/ssl.py", line 500, in wrap_socket
    return self.sslsocket_class._create(
  File "/Library/Frameworks/Python.framework/Versions/3.9/lib/python3.9/ssl.py", line 1040, in _create
    self.do_handshake()
  File "/Library/Frameworks/Python.framework/Versions/3.9/lib/python3.9/ssl.py", line 1309, in do_handshake
    self._sslobj.do_handshake()
ConnectionResetError: [Errno 54] Connection reset by peer

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/Users/consulting13/PycharmProjects/GPS/test.py", line 100, in <module>
    process.crawl(request_details('final_1.txt'))
  File "/Users/consulting13/PycharmProjects/GPS/test.py", line 60, in request_details
    result_reviews, continuation_token = reviews(
  File "/Users/consulting13/PycharmProjects/GPS/venv/lib/python3.9/site-packages/google_play_scraper/features/reviews.py", line 93, in reviews
    review_items, token = _fetch_review_items(
  File "/Users/consulting13/PycharmProjects/GPS/venv/lib/python3.9/site-packages/google_play_scraper/features/reviews.py", line 36, in _fetch_review_items
    dom = post(
  File "/Users/consulting13/PycharmProjects/GPS/venv/lib/python3.9/site-packages/google_play_scraper/utils/request.py", line 24, in post
    return _urlopen(Request(url, data=data, headers=headers))
  File "/Users/consulting13/PycharmProjects/GPS/venv/lib/python3.9/site-packages/google_play_scraper/utils/request.py", line 11, in _urlopen
    resp = urlopen(obj)
  File "/Library/Frameworks/Python.framework/Versions/3.9/lib/python3.9/urllib/request.py", line 214, in urlopen
    return opener.open(url, data, timeout)
  File "/Library/Frameworks/Python.framework/Versions/3.9/lib/python3.9/urllib/request.py", line 517, in open
    response = self._open(req, data)
  File "/Library/Frameworks/Python.framework/Versions/3.9/lib/python3.9/urllib/request.py", line 534, in _open
    result = self._call_chain(self.handle_open, protocol, protocol +
  File "/Library/Frameworks/Python.framework/Versions/3.9/lib/python3.9/urllib/request.py", line 494, in _call_chain
    result = func(*args)
  File "/Library/Frameworks/Python.framework/Versions/3.9/lib/python3.9/urllib/request.py", line 1389, in https_open
    return self.do_open(http.client.HTTPSConnection, req,
  File "/Library/Frameworks/Python.framework/Versions/3.9/lib/python3.9/urllib/request.py", line 1349, in do_open
    raise URLError(err)
urllib.error.URLError: <urlopen error [Errno 54] Connection reset by peer>

Describe the bug

I receive this error after 500 requests, and then later after 150 requests.

Code

Copy and paste the code that has the issue.

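from google_play_scraper import app


# The original snippet omitted the enclosing function; this wrapper name is illustrative.
def fetch_app_details(app_id):
    # Return the app details, or None if the request fails.
    try:
        result = app(
            app_id.strip(),
            lang='en',  # defaults to 'en'
            country='us'  # defaults to 'us'
        )
    except Exception:  # avoid a bare except, which also swallows KeyboardInterrupt
        result = None
    return result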

Expected behavior

I would expect the scraper not to fail with the error above.

kluhan commented 2 years ago

From my observations, the problem usually occurs just before, during, or after a Google Play Store update when you send too many requests (more than 10 per second). I don't think there's much you can do about it other than using proxies or slowing down your crawler.
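As a rough illustration of "slowing down your crawler", here is a minimal sketch that spaces requests out and retries with exponential backoff when the connection is reset. The names `call_with_backoff` and `Throttle`, and the delay values, are assumptions made for this example; they are not part of google_play_scraper.

```python
import time
from urllib.error import URLError


def call_with_backoff(func, *args, retries=4, base_delay=1.0,
                      sleep=time.sleep, **kwargs):
    """Call func, retrying on connection errors with exponential backoff.

    Hypothetical helper, not part of google_play_scraper.
    """
    for attempt in range(retries + 1):
        try:
            return func(*args, **kwargs)
        except (URLError, ConnectionResetError):
            if attempt == retries:
                raise  # out of retries: surface the original error
            sleep(base_delay * 2 ** attempt)  # wait 1s, 2s, 4s, ...


class Throttle:
    """Keep successive wait() calls at least min_interval seconds apart."""

    def __init__(self, min_interval, clock=time.monotonic, sleep=time.sleep):
        self.min_interval = min_interval
        self._clock = clock
        self._sleep = sleep
        self._last = None  # time of the previous call, if any

    def wait(self):
        now = self._clock()
        if self._last is not None:
            remaining = self.min_interval - (now - self._last)
            if remaining > 0:
                self._sleep(remaining)
                now = self._clock()
        self._last = now


# Usage sketch (assumes google_play_scraper is installed and app_ids is
# your own list of package names):
#
#   from google_play_scraper import app
#   throttle = Throttle(0.2)  # roughly 5 requests per second at most
#   for app_id in app_ids:
#       throttle.wait()
#       result = call_with_backoff(app, app_id, lang='en', country='us')
```

The 0.2-second interval is only a guess that stays below the roughly 10 requests per second mentioned above. If the resets continue, a longer interval or proxies may still be needed.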

JoMingyu commented 2 years ago

I agree with @kluhan.