ITISFoundation / osparc-simcore

🐼 osparc-simcore simulation framework
https://osparc.io
MIT License

SSL transport context is `None` #5920

Open bisgaard-itis opened 3 months ago

bisgaard-itis commented 3 months ago

Some tests in the Python client e2e suite are failing randomly on osparc.speag.com. I can see that this happens because of timeouts when communicating with the webserver. While looking into why, I noticed that this error occurs quite often in the webserver:

```
Task exception was never retrieved
future: <Task finished name='Task-71555324' coro=<RequestHandler.start() done, defined at /home/scu/.venv/lib/python3.10/site-packages/aiohttp/web_protocol.py:462> exception=AttributeError("'NoneType' object has no attribute 'get_extra_info'")>
Traceback (most recent call last):
  File "/home/scu/.venv/lib/python3.10/site-packages/aiohttp/web_protocol.py", line 505, in start
    request = self._request_factory(message, payload, self, writer, handler)
  File "/home/scu/.venv/lib/python3.10/site-packages/aiohttp/web_app.py", line 446, in _make_request
    return _cls(
  File "/home/scu/.venv/lib/python3.10/site-packages/aiohttp/web_request.py", line 811, in __init__
    super().__init__(*args, **kwargs)
  File "/home/scu/.venv/lib/python3.10/site-packages/aiohttp/web_request.py", line 189, in __init__
    self._transport_sslcontext = transport.get_extra_info("sslcontext")
AttributeError: 'NoneType' object has no attribute 'get_extra_info'
```

Does anyone have an idea what could be going wrong here? I am not sure this is the cause of the timeouts I am seeing from the webserver, but it happens often enough that it would be good to understand what is going on. The log doesn't contain anything more than this. To me it looks like a connection issue :thinking_face:
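To illustrate the suspected failure mode, here is a minimal sketch. It assumes (this is a guess based on the traceback, not a confirmed diagnosis) that the handler's transport reference has already been cleared by a disconnect when the request object is built. `ToyProtocol` and `FakeTransport` are hypothetical stand-ins, not aiohttp code.

```python
import asyncio


class FakeTransport:
    """Hypothetical transport stub that carries no extra info."""

    def get_extra_info(self, name, default=None):
        return default


class ToyProtocol(asyncio.Protocol):
    """Hypothetical stand-in for a request handler; not aiohttp code."""

    def __init__(self):
        self.transport = None

    def connection_made(self, transport):
        self.transport = transport

    def connection_lost(self, exc):
        # Assumption: the reference is dropped once the peer disconnects.
        self.transport = None

    def ssl_context_unguarded(self):
        # Same access pattern as the failing line in the traceback.
        return self.transport.get_extra_info("sslcontext")

    def ssl_context_guarded(self):
        # Take a local copy and check it before use.
        transport = self.transport
        if transport is None:
            return None
        return transport.get_extra_info("sslcontext")


def demo():
    """Simulate a disconnect that lands before the request is constructed."""
    proto = ToyProtocol()
    proto.connection_made(FakeTransport())
    proto.connection_lost(None)
    try:
        proto.ssl_context_unguarded()
    except AttributeError as err:
        message = str(err)
    else:
        message = ""
    return message, proto.ssl_context_guarded()
```

If this reading is right, the error would be a symptom of clients disconnecting early rather than the cause of the timeouts, but that still needs to be confirmed.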

_Issue created from a Mattermost message by @bisgaard-itis._

bisgaard-itis commented 3 months ago

N.B. we previously upgraded aiohttp for exactly this reason (https://github.com/ITISFoundation/osparc-simcore/pull/4544), but it seems the fix provided there didn't solve the issue for good. We are not the only ones seeing this type of issue, and from that conversation it looks like it could be a low-level bug in CPython (perhaps a race condition).

Judging from this, I suspect this is not something which can be fixed in osparc-simcore. So I guess the task associated with this issue is either to find a way to mitigate the problem or to help with a fix in aiohttp/CPython.
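One possible mitigation on our side, sketched below, is an asyncio loop exception handler that recognizes this specific error, counts it, and logs it at a lower level, while passing everything else to the default handler. This only changes how the error is reported. It does not fix the underlying race and would not by itself resolve the timeouts. The names `KNOWN_MESSAGE`, `make_handler`, and `install` are hypothetical; only `loop.set_exception_handler` and `loop.default_exception_handler` are standard asyncio calls.

```python
import asyncio
import logging

logger = logging.getLogger(__name__)

# Message text copied from the traceback in this issue.
KNOWN_MESSAGE = "'NoneType' object has no attribute 'get_extra_info'"


def is_known_transport_error(context):
    """Return True if the loop error context matches the error seen here."""
    exc = context.get("exception")
    return isinstance(exc, AttributeError) and KNOWN_MESSAGE in str(exc)


def make_handler(counter):
    """Build a loop exception handler that counts the known error."""

    def handler(loop, context):
        if is_known_transport_error(context):
            counter["hits"] += 1
            logger.warning(
                "transport was None while building a request (%d so far)",
                counter["hits"],
            )
            return
        # Anything else keeps the default asyncio reporting.
        loop.default_exception_handler(context)

    return handler


def install(loop, counter):
    """Attach the handler to the given event loop."""
    loop.set_exception_handler(make_handler(counter))
```

Calling `install(asyncio.get_running_loop(), {"hits": 0})` at application startup would be one way to wire it in. The counter could then be compared against the e2e timeouts to see whether the two are correlated.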