ITISFoundation / osparc-simcore

🐼 osparc-simcore simulation framework
https://osparc.io
MIT License

Copying a study fails silently, backend notes "Timeout on reading data from socket" #3470

Closed mrnicegyu11 closed 1 year ago

mrnicegyu11 commented 2 years ago

A user tried to copy a project on osparc.io version 1.37.0. This failed silently (the original study was unlocked, but the copy did not show up in the frontend). An investigation showed the following error being logged in the osparc storage microservice:

ERROR:simcore_service_storage.s3_client:Unexpected error in s3 client: 
Traceback (most recent call last):
  File "/home/scu/.venv/lib/python3.9/site-packages/aiobotocore/response.py", line 53, in read
    chunk = await self.__wrapped__.content.read(amt if amt is not None else -1)
  File "/home/scu/.venv/lib/python3.9/site-packages/aiohttp/streams.py", line 349, in read
    raise self._exception
  File "/home/scu/.venv/lib/python3.9/site-packages/aiobotocore/httpsession.py", line 178, in send
    response = await self._session.request(
  File "/home/scu/.venv/lib/python3.9/site-packages/aiohttp/client.py", line 559, in _request
    await resp.start(conn)
  File "/home/scu/.venv/lib/python3.9/site-packages/aiohttp/client_reqrep.py", line 898, in start
    message, payload = await protocol.read()  # type: ignore[union-attr]
  File "/home/scu/.venv/lib/python3.9/site-packages/aiohttp/streams.py", line 616, in read
    await self._waiter
aiohttp.client_exceptions.ServerTimeoutError: Timeout on reading data from socket

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/home/scu/.venv/lib/python3.9/site-packages/simcore_service_storage/s3_utils.py", line 60, in wrapper
    response = await func(self, *args, **kwargs)
  File "/home/scu/.venv/lib/python3.9/site-packages/simcore_service_storage/s3_client.py", line 267, in copy_file
    await self.client.copy(**copy_options)
  File "/home/scu/.venv/lib/python3.9/site-packages/aioboto3/s3/inject.py", line 424, in copy
    await self.upload_fileobj(
  File "/home/scu/.venv/lib/python3.9/site-packages/aioboto3/s3/inject.py", line 371, in upload_fileobj
    raise exception
  File "/home/scu/.venv/lib/python3.9/site-packages/aioboto3/s3/inject.py", line 274, in file_reader
    data = await data_chunk
  File "/home/scu/.venv/lib/python3.9/site-packages/aiobotocore/response.py", line 55, in read
    raise AioReadTimeoutError(endpoint_url=self.__wrapped__.url,
aiobotocore.response.AioReadTimeoutError: Read timeout on endpoint URL: "https://s3.amazonaws.com/BUCKET/PROJECT_UUID/NODE_UUID/electrode_template.sab"

Upon a retry, the copy succeeded. Upon yet another retry, the same error as printed above recurred (for the same file, incidentally).
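Since a plain retry succeeded, one possible mitigation is to retry the copy on read timeouts and to log loudly once the attempts are exhausted, so that the failure is no longer silent. Below is a minimal sketch of that idea, not what the storage service currently does. The helper name `retry_on_timeout` and its parameters are hypothetical. In the service, one would presumably pass `aiobotocore.response.AioReadTimeoutError` (the exception from the traceback above) as `retry_on`.

```python
import asyncio
import logging
from typing import Awaitable, Callable, Tuple, Type, TypeVar

T = TypeVar("T")
log = logging.getLogger(__name__)


async def retry_on_timeout(
    operation: Callable[[], Awaitable[T]],
    *,
    attempts: int = 3,
    base_delay: float = 1.0,
    retry_on: Tuple[Type[BaseException], ...] = (asyncio.TimeoutError,),
) -> T:
    """Hypothetical helper: await operation(), retrying on the given exceptions.

    Waits base_delay * 2**(attempt - 1) seconds between attempts and
    re-raises the last exception once all attempts are used up.
    """
    if attempts < 1:
        raise ValueError("attempts must be >= 1")

    for attempt in range(1, attempts + 1):
        try:
            return await operation()
        except retry_on as exc:
            if attempt == attempts:
                # Make the failure visible instead of swallowing it.
                log.error("Giving up after %d attempts: %r", attempts, exc)
                raise
            delay = base_delay * 2 ** (attempt - 1)
            log.warning(
                "Attempt %d/%d failed with %r, retrying in %.1fs",
                attempt, attempts, exc, delay,
            )
            await asyncio.sleep(delay)

    # Not reachable: the loop either returns or raises.
    raise RuntimeError("unreachable")
```

In `copy_file`, the call could then look like `await retry_on_timeout(lambda: self.client.copy(**copy_options), retry_on=(AioReadTimeoutError,))`. This assumes that repeating the copy is safe, which the successful manual retry suggests but does not prove.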

Suggested actions:

@sanderegg do you want a graylog alert for this? :D

sanderegg commented 2 years ago

Let's keep it there until we have the current staging version in production, as there were a few changes on both frontend and backend. Then we should start monitoring.

sanderegg commented 1 year ago

Since there were no more occurrences of this issue, I am closing it.