box / box-python-sdk

Box SDK for Python
http://opensource.box.com/box-python-sdk/
Apache License 2.0

chunk download status fails with ConnectionResetError: [Errno 104] Connection reset by peer error #770

Closed asmirazali closed 1 year ago

asmirazali commented 1 year ago

Description of the Issue

We use the Python SDK's client.download_zip to download large folders from Box before deleting them. However, if the download takes longer than a certain time (I believe it's 15 minutes), the status request fails with "ConnectionResetError: [Errno 104] Connection reset by peer". The download itself completes, but we still need the status for confirmation.

Before we delete the folder, we want to make sure that the status state is "succeeded".

A snippet of our code.

        # Stream the zip straight into object storage.
        with s3.open(cos_archive_folder + "/<redacted>/" + zip_name_w_ext, 'wb') as ff:
            status = client.download_zip(zip_name, [client.folder(box_id)], ff)
        # Delete the Box folder only once the zip download is confirmed.
        if status["state"] == "succeeded":
            status_reply = status
            client.folder(folder_id=box_id).delete(recursive=True)
        else:
            status_reply = "Archival Error"

This works well for small folders, where the files are under about 2-4 GB. However, for anything above 10 GB, we get a connection reset when we try to get the status.
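The "delete only after a confirmed success" rule described above can be sketched as a small helper. This is only an illustration: `archive_then_delete` and its two callable parameters are hypothetical names, not part of the Box SDK. It assumes the status is a dict with a "state" key, as in the snippet above, and it treats a failed status request as an archival error so that the folder is never deleted without confirmation.

```python
from typing import Any, Callable, Dict, Union


def archive_then_delete(
    download: Callable[[], Dict[str, Any]],
    delete: Callable[[], None],
) -> Union[Dict[str, Any], str]:
    """Run `download`; call `delete` only if the status says 'succeeded'."""
    try:
        status = download()
    except OSError:
        # requests.exceptions.ConnectionError and the built-in
        # ConnectionResetError both derive from OSError, so a failed
        # status request ends up here and nothing is deleted.
        return "Archival Error"
    if status.get("state") == "succeeded":
        delete()
        return status
    return "Archival Error"
```

With the names from the snippet above, the two callables would be `lambda: client.download_zip(zip_name, [client.folder(box_id)], ff)` and `lambda: client.folder(folder_id=box_id).delete(recursive=True)`.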

Error Message, Including Stack Trace

2022-11-15 16:29:41,106 WARNING [default_network] Request "GET https://api.box.com/2.0/zip_downloads//status" failed with ConnectionError exception: ConnectionError(ProtocolError('Connection aborted.', ConnectionResetError(104, 'Connection reset by peer')))
2022-11-15 16:29:41,424 ERROR [actions_component] Traceback (most recent call last):
  File "/opt/app-root/lib64/python3.9/site-packages/urllib3/connectionpool.py", line 703, in urlopen
    httplib_response = self._make_request(
  File "/opt/app-root/lib64/python3.9/site-packages/urllib3/connectionpool.py", line 449, in _make_request
    six.raise_from(e, None)
  File "", line 3, in raise_from
  File "/opt/app-root/lib64/python3.9/site-packages/urllib3/connectionpool.py", line 444, in _make_request
    httplib_response = conn.getresponse()
  File "/usr/lib64/python3.9/http/client.py", line 1377, in getresponse
    response.begin()
  File "/usr/lib64/python3.9/http/client.py", line 320, in begin
    version, status, reason = self._read_status()
  File "/usr/lib64/python3.9/http/client.py", line 281, in _read_status
    line = str(self.fp.readline(_MAXLINE + 1), "iso-8859-1")
  File "/usr/lib64/python3.9/socket.py", line 704, in readinto
    return self._sock.recv_into(b)
  File "/usr/lib64/python3.9/ssl.py", line 1242, in recv_into
    return self.read(nbytes, buffer)
  File "/usr/lib64/python3.9/ssl.py", line 1100, in read
    return self._sslobj.read(len, buffer)
ConnectionResetError: [Errno 104] Connection reset by peer

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/opt/app-root/lib64/python3.9/site-packages/requests/adapters.py", line 489, in send
    resp = conn.urlopen(
  File "/opt/app-root/lib64/python3.9/site-packages/urllib3/connectionpool.py", line 787, in urlopen
    retries = retries.increment(
  File "/opt/app-root/lib64/python3.9/site-packages/urllib3/util/retry.py", line 550, in increment
    raise six.reraise(type(error), error, _stacktrace)
  File "/opt/app-root/lib64/python3.9/site-packages/urllib3/packages/six.py", line 769, in reraise
    raise value.with_traceback(tb)
  File "/opt/app-root/lib64/python3.9/site-packages/urllib3/connectionpool.py", line 703, in urlopen
    httplib_response = self._make_request(
  File "/opt/app-root/lib64/python3.9/site-packages/urllib3/connectionpool.py", line 449, in _make_request
    six.raise_from(e, None)
  File "", line 3, in raise_from
  File "/opt/app-root/lib64/python3.9/site-packages/urllib3/connectionpool.py", line 444, in _make_request
    httplib_response = conn.getresponse()
  File "/usr/lib64/python3.9/http/client.py", line 1377, in getresponse
    response.begin()
  File "/usr/lib64/python3.9/http/client.py", line 320, in begin
    version, status, reason = self._read_status()
  File "/usr/lib64/python3.9/http/client.py", line 281, in _read_status
    line = str(self.fp.readline(_MAXLINE + 1), "iso-8859-1")
  File "/usr/lib64/python3.9/socket.py", line 704, in readinto
    return self._sock.recv_into(b)
  File "/usr/lib64/python3.9/ssl.py", line 1242, in recv_into
    return self.read(nbytes, buffer)
  File "/usr/lib64/python3.9/ssl.py", line 1100, in read
    return self._sslobj.read(len, buffer)
urllib3.exceptions.ProtocolError: ('Connection aborted.', ConnectionResetError(104, 'Connection reset by peer'))

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/opt/app-root/lib64/python3.9/site-packages/resilient_circuits/actions_component.py", line 90, in _on_task
    yield result.get()
  File "/usr/lib64/python3.9/multiprocessing/pool.py", line 771, in get
    raise self._value
  File "/usr/lib64/python3.9/multiprocessing/pool.py", line 125, in worker
    result = (True, func(*args, *kwds))
  File "/opt/app-root/lib64/python3.9/site-packages/resilient_circuits/decorators.py", line 274, in _invoke_app_function
    for r in fn_results:
  File "/opt/app-root/lib64/python3.9/site-packages//components/funct_box_closure.py", line 75, in _app_function
    status = client.download_zip(zip_name,[client.folder(box_id)], ff)
  File "/opt/app-root/lib64/python3.9/site-packages/boxsdk/util/api_call_decorator.py", line 63, in call
    return method(args, kwargs)
  File "/opt/app-root/lib64/python3.9/site-packages/boxsdk/client/client.py", line 1476, in download_zip
    status = self._session.get(created_zip['status_url']).json()
  File "/opt/app-root/lib64/python3.9/site-packages/boxsdk/session/session.py", line 90, in get
    return self.request('GET', url, kwargs)
  File "/opt/app-root/lib64/python3.9/site-packages/boxsdk/session/session.py", line 134, in request
    response = self._prepare_and_send_request(method, url, kwargs)
  File "/opt/app-root/lib64/python3.9/site-packages/boxsdk/session/session.py", line 325, in _prepare_and_send_request
    network_response = self._send_request(request, kwargs)
  File "/opt/app-root/lib64/python3.9/site-packages/boxsdk/session/session.py", line 529, in _send_request
    return super()._send_request(request, kwargs)
  File "/opt/app-root/lib64/python3.9/site-packages/boxsdk/session/session.py", line 436, in _send_request
    network_response = self._network_layer.request(
  File "/opt/app-root/lib64/python3.9/site-packages/boxsdk/network/default_network.py", line 43, in request
    request_response=self._session.request(method, url, kwargs),
  File "/opt/app-root/lib64/python3.9/site-packages/requests/sessions.py", line 587, in request
    resp = self.send(prep, send_kwargs)
  File "/opt/app-root/lib64/python3.9/site-packages/requests/sessions.py", line 701, in send
    r = adapter.send(request, kwargs)
  File "/opt/app-root/lib64/python3.9/site-packages/requests/adapters.py", line 547, in send
    raise ConnectionError(err, request=request)
requests.exceptions.ConnectionError: ('Connection aborted.', ConnectionResetError(104, 'Connection reset by peer'))

2022-11-15 16:29:41,425 ERROR [actions_component] <task[functionworker] (<function app_function.call..app_function_decorator.._invoke_app_function at 0x7f78367208b0>, <box_closure[functions.box_closure] (id=70, workflow=playbook_a8f3e4fa_0753_4475_ae0e_2e09cde50be0, user=) 2022-11-15 16:00:38.078000> ='180', box_folder_id='')> (<class 'requests.exceptions.ConnectionError'>): ERROR:

('Connection aborted.', ConnectionResetError(104, 'Connection reset by peer'))
  File "/opt/app-root/lib64/python3.9/site-packages/circuits/core/manager.py", line 874, in processTask
    raise value.extract()
  File "/opt/app-root/lib64/python3.9/site-packages/resilient_circuits/actions_component.py", line 90, in _on_task
    yield result.get()
  File "/usr/lib64/python3.9/multiprocessing/pool.py", line 771, in get
    raise self._value
  File "/usr/lib64/python3.9/multiprocessing/pool.py", line 125, in worker
    result = (True, func(*args, *kwds))
  File "/opt/app-root/lib64/python3.9/site-packages//decorators.py", line 274, in _invoke_app_function
    for r in fn_results:
  File "/opt/app-root/lib64/python3.9/site-packages//components/funct_box_closure.py", line 75, in _app_function
    status = client.download_zip(zip_name,[client.folder(box_id)], ff)
  File "/opt/app-root/lib64/python3.9/site-packages/boxsdk/util/api_call_decorator.py", line 63, in call
    return method(args, kwargs)
  File "/opt/app-root/lib64/python3.9/site-packages/boxsdk/client/client.py", line 1476, in download_zip
    status = self._session.get(created_zip['status_url']).json()
  File "/opt/app-root/lib64/python3.9/site-packages/boxsdk/session/session.py", line 90, in get
    return self.request('GET', url, kwargs)
  File "/opt/app-root/lib64/python3.9/site-packages/boxsdk/session/session.py", line 134, in request
    response = self._prepare_and_send_request(method, url, kwargs)
  File "/opt/app-root/lib64/python3.9/site-packages/boxsdk/session/session.py", line 325, in _prepare_and_send_request
    network_response = self._send_request(request, kwargs)
  File "/opt/app-root/lib64/python3.9/site-packages/boxsdk/session/session.py", line 529, in _send_request
    return super()._send_request(request, kwargs)
  File "/opt/app-root/lib64/python3.9/site-packages/boxsdk/session/session.py", line 436, in _send_request
    network_response = self._network_layer.request(
  File "/opt/app-root/lib64/python3.9/site-packages/boxsdk/network/default_network.py", line 43, in request
    request_response=self._session.request(method, url, kwargs),
  File "/opt/app-root/lib64/python3.9/site-packages/requests/sessions.py", line 587, in request
    resp = self.send(prep, send_kwargs)
  File "/opt/app-root/lib64/python3.9/site-packages/requests/sessions.py", line 701, in send
    r = adapter.send(request, kwargs)
  File "/opt/app-root/lib64/python3.9/site-packages/requests/adapters.py", line 547, in send
    raise ConnectionError(err, request=request)

Versions Used

Python SDK: 3.5.0
Python: 3.9.13

lukaszsocha2 commented 1 year ago

Hi @asmirazali, thanks for posting this issue. We are aware of it: it is caused by the cloud service, which closes long-lasting connections. We have reported this to our cloud service vendor and are waiting for a fix. In the meantime, we are working on a way to bypass it on the SDK side. I'll let you know when we have a fix ready, so that you can test it before we release the Python SDK and confirm that it helps. @lukaszsocha2

asmirazali commented 1 year ago

This is my current workaround to keep the connection alive.

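    import socket

    from boxsdk import Client
    from urllib3.connection import HTTPConnection

    # Enable TCP keep-alive on every connection urllib3 opens.
    HTTPConnection.default_socket_options += [
        (socket.SOL_SOCKET, socket.SO_KEEPALIVE, 1),     # turn keep-alive on
        (socket.IPPROTO_TCP, socket.TCP_KEEPIDLE, 60),   # idle seconds before the first probe
        (socket.IPPROTO_TCP, socket.TCP_KEEPINTVL, 60),  # seconds between probes
        (socket.IPPROTO_TCP, socket.TCP_KEEPCNT, 100),   # unanswered probes before giving up
    ]

    client = Client(auth)  # auth is created elsewhere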

So far so good: a 28 GB download completed with no issues.
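For readers who want to see what these options do without touching urllib3's global defaults, here is a minimal standard-library sketch. The helper names `keepalive_options` and `seconds_until_dead_peer_detected` are hypothetical and exist only for illustration. The sketch assumes that not every platform defines all of the TCP_KEEP* constants, so it skips any that are missing.

```python
import socket
from typing import List, Tuple

SocketOption = Tuple[int, int, int]


def keepalive_options(idle: int = 60, interval: int = 60, count: int = 100) -> List[SocketOption]:
    """Build TCP keep-alive options, skipping constants this platform lacks."""
    options: List[SocketOption] = [(socket.SOL_SOCKET, socket.SO_KEEPALIVE, 1)]
    tuning = (("TCP_KEEPIDLE", idle), ("TCP_KEEPINTVL", interval), ("TCP_KEEPCNT", count))
    for name, value in tuning:
        constant = getattr(socket, name, None)  # not defined on every platform
        if constant is not None:
            options.append((socket.IPPROTO_TCP, constant, value))
    return options


def seconds_until_dead_peer_detected(idle: int, interval: int, count: int) -> int:
    """Idle time before the first probe plus the time for all unanswered probes."""
    return idle + interval * count


if __name__ == "__main__":
    sock = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    for level, option, value in keepalive_options():
        sock.setsockopt(level, option, value)
    print("keep-alive enabled:", sock.getsockopt(socket.SOL_SOCKET, socket.SO_KEEPALIVE) != 0)
    sock.close()
```

With the values from the workaround, a probe goes out after 60 seconds of idleness and every 60 seconds after that, which presumably keeps enough traffic on the connection that the cloud service does not treat it as idle. A peer that has really gone away would only be detected after 60 + 60 * 100 = 6060 seconds.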

lukaszsocha2 commented 1 year ago

Hi @asmirazali, can you please test whether this PR fixes your issue? Please check out the branch sdk-2732-fix-connection-reset-error and run your code against it as before, then tell me whether the issue still occurs. Please don't change the default socket options for this test as you did above.

The requests library claims that it can handle keep-alive on its own (source), so all this PR does is catch the exceptions raised by requests and retry the request; a similar approach was taken here. I hope that requests will be able to recover from this error by itself by establishing a new connection. However, if that is not the case and the error still occurs, I'll also add a custom socket configuration to the fix, like you did.
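The catch-and-retry idea can be sketched generically as follows. This is not the code from the PR: `call_with_retry`, its parameters, and the exponential backoff are assumptions chosen for illustration. Catching OSError by default covers both the built-in ConnectionResetError and requests.exceptions.ConnectionError, which derives from it.

```python
import time
from typing import Callable, Tuple, Type, TypeVar

T = TypeVar("T")


def call_with_retry(
    func: Callable[[], T],
    retry_on: Tuple[Type[BaseException], ...] = (OSError,),
    max_attempts: int = 3,
    base_delay: float = 1.0,
    sleep: Callable[[float], None] = time.sleep,
) -> T:
    """Call `func`, retrying on the given exceptions with exponential backoff."""
    if max_attempts < 1:
        raise ValueError("max_attempts must be at least 1")
    for attempt in range(1, max_attempts + 1):
        try:
            return func()
        except retry_on:
            if attempt == max_attempts:
                raise  # out of attempts: surface the last error to the caller
            # Wait base_delay, then 2 * base_delay, then 4 * base_delay, ...
            sleep(base_delay * 2 ** (attempt - 1))
    raise AssertionError("unreachable")
```

A retried call gets a fresh chance to open a new connection, which is the behaviour the fix relies on; whether that is enough in practice is exactly what the test requested above is meant to establish.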

I would be very grateful if you could verify whether this fix helps. Unfortunately, this error is non-deterministic and I wasn't able to reproduce it on my own, so I cannot be sure that it is gone now. Best, @lukaszsocha2