ChimeraPy / Engine

Distributed computing framework for Multimodal data written in Python
https://chimerapy-engine.readthedocs.io
GNU General Public License v3.0

ServerDisconnected Error on Worker Stop #288

Open umesh-timalsina opened 10 months ago

umesh-timalsina commented 10 months ago

It seems that calls to manager.async_stop sometimes raise a ServerDisconnectedError:

  File "/home/umesh/mambaforge/envs/chimerapy-dev-stable/lib/python3.10/site-packages/chimerapy/engine/networking/server.py", line 290, in _websocket_handler
    await handler(msg, ws)
  File "/home/umesh/mambaforge/envs/chimerapy-dev-stable/lib/python3.10/site-packages/chimerapy/engine/worker/http_server_service.py", line 322, in _async_node_status_update
    await self.eventbus.asend(Event("WorkerState.changed", self.state))
  File "/home/umesh/mambaforge/envs/chimerapy-dev-stable/lib/python3.10/site-packages/chimerapy/engine/eventbus/eventbus.py", line 63, in asend
    await self.stream.asend(event)
  File "/home/umesh/mambaforge/envs/chimerapy-dev-stable/lib/python3.10/site-packages/aioreactive/subject.py", line 125, in asend
    await obv.asend(value)
  File "/home/umesh/mambaforge/envs/chimerapy-dev-stable/lib/python3.10/site-packages/chimerapy/engine/eventbus/eventbus.py", line 180, in asend
    await self.exec_callable(self._on_asend)
  File "/home/umesh/mambaforge/envs/chimerapy-dev-stable/lib/python3.10/site-packages/chimerapy/engine/eventbus/eventbus.py", line 157, in exec_callable
    await func(*arg, **kwargs)
  File "/home/umesh/mambaforge/envs/chimerapy-dev-stable/lib/python3.10/site-packages/chimerapy/engine/worker/http_client_service.py", line 347, in _async_node_status_update
    async with self.http_client.post(
  File "/home/umesh/mambaforge/envs/chimerapy-dev-stable/lib/python3.10/site-packages/aiohttp/client.py", line 1167, in __aenter__
    self._resp = await self._coro
  File "/home/umesh/mambaforge/envs/chimerapy-dev-stable/lib/python3.10/site-packages/aiohttp/client.py", line 586, in _request
    await resp.start(conn)
  File "/home/umesh/mambaforge/envs/chimerapy-dev-stable/lib/python3.10/site-packages/aiohttp/client_reqrep.py", line 905, in start
    message, payload = await protocol.read()  # type: ignore[union-attr]
  File "/home/umesh/mambaforge/envs/chimerapy-dev-stable/lib/python3.10/site-packages/aiohttp/streams.py", line 616, in read
    await self._waiter
aiohttp.client_exceptions.ServerDisconnectedError: Server disconnected

2023-10-24 15:20:31 [ERROR] chimerapy-engine: Traceback (most recent call last):
  File "/home/umesh/mambaforge/envs/chimerapy-dev-stable/lib/python3.10/site-packages/chimerapy/engine/manager/worker_handler_service.py", line 527, in _broadcast_request
    result = t.result()
asyncio.exceptions.InvalidStateError: Result is not set.

edavalosanaya commented 10 months ago

I was able to replicate the bug using only the examples provided in the engine. Running the following command results in the error:

# In the ChimeraPy/Engine GitHub folder
python examples/async_remote_camera.py

  File "/home/nicole/GitHub/ChimeraPy/Engine/chimerapy/engine/networking/server.py", line 290, in _websocket_handler
    await handler(msg, ws)
  File "/home/nicole/GitHub/ChimeraPy/Engine/chimerapy/engine/worker/http_server_service.py", line 322, in _async_node_status_update
    await self.eventbus.asend(Event("WorkerState.changed", self.state))
  File "/home/nicole/GitHub/ChimeraPy/Engine/chimerapy/engine/eventbus/eventbus.py", line 63, in asend
    await self.stream.asend(event)
  File "/home/nicole/anaconda3/envs/chimerapy_dev/lib/python3.9/site-packages/aioreactive/subject.py", line 125, in asend
    await obv.asend(value)
  File "/home/nicole/GitHub/ChimeraPy/Engine/chimerapy/engine/eventbus/eventbus.py", line 180, in asend
    await self.exec_callable(self._on_asend)
  File "/home/nicole/GitHub/ChimeraPy/Engine/chimerapy/engine/eventbus/eventbus.py", line 157, in exec_callable
    await func(*arg, **kwargs)
  File "/home/nicole/GitHub/ChimeraPy/Engine/chimerapy/engine/worker/http_client_service.py", line 347, in _async_node_status_update
    async with self.http_client.post(
  File "/home/nicole/anaconda3/envs/chimerapy_dev/lib/python3.9/site-packages/aiohttp/client.py", line 1141, in __aenter__
    self._resp = await self._coro
  File "/home/nicole/anaconda3/envs/chimerapy_dev/lib/python3.9/site-packages/aiohttp/client.py", line 560, in _request
    await resp.start(conn)
  File "/home/nicole/anaconda3/envs/chimerapy_dev/lib/python3.9/site-packages/aiohttp/client_reqrep.py", line 914, in start
    self._continue = None
  File "/home/nicole/anaconda3/envs/chimerapy_dev/lib/python3.9/site-packages/aiohttp/helpers.py", line 721, in __exit__
    raise asyncio.TimeoutError from None
asyncio.exceptions.TimeoutError

umesh-timalsina commented 10 months ago

As a temporary solution, we can debounce the state-update calls and log the exception properly instead of letting it propagate.
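
A rough sketch of what the debouncing plus logging could look like, using only asyncio and the standard logging module. This is not the engine's actual code: `DebouncedSender`, its `trigger` method, the `send` callable and the 0.5 s delay are hypothetical names and values picked for illustration. The idea is that a burst of state changes restarts a timer, and only one update is sent once things go quiet; any exception from the send is logged with its traceback rather than raised into the event bus.

```python
import asyncio
import logging
from typing import Awaitable, Callable, Optional, Set

logger = logging.getLogger("chimerapy-engine")


class DebouncedSender:
    """Hypothetical helper: coalesce bursts of triggers into a single send."""

    def __init__(
        self, send: Callable[[], Awaitable[None]], delay: float = 0.5
    ) -> None:
        self._send = send  # assumed: coroutine function that posts the state
        self._delay = delay  # quiet period in seconds (arbitrary default)
        self._timer: Optional[asyncio.TimerHandle] = None
        self._tasks: Set[asyncio.Task] = set()

    def trigger(self) -> None:
        """Request a send; restarts the quiet-period timer on every call."""
        if self._timer is not None:
            self._timer.cancel()
        loop = asyncio.get_running_loop()
        self._timer = loop.call_later(self._delay, self._fire)

    def _fire(self) -> None:
        # The timer expired with no newer trigger, so start the actual send.
        self._timer = None
        task = asyncio.get_running_loop().create_task(self._safe_send())
        # Hold a reference until the task finishes so it is not collected.
        self._tasks.add(task)
        task.add_done_callback(self._tasks.discard)

    async def _safe_send(self) -> None:
        try:
            await self._send()
        except Exception:
            # Log with traceback instead of propagating to the caller.
            logger.exception("State update to the manager failed")
```

With something like this, the worker's status-update handler would call `trigger()` instead of posting directly. Only the pending timer is cancelled on a new trigger, so a send that is already in flight is left to finish.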