OFFIS-DAI / mango

Modular Python-based agent framework to implement multi-agent systems
https://mango-agents.readthedocs.io/
MIT License

Error with on_stop task with multiprocessing #133

Open maurerle opened 5 days ago

maurerle commented 5 days ago

I got this error for a RoleAgent running in as_agent_process whose on_stop handler might have taken too long.

Stacktrace I got:

ERROR:mango.container.mp:The Send Message Task Loop has failed!
Traceback (most recent call last):
  File "/env/mango/container/mp.py", line 334, in _send_to_message_pipe
    await tx.drain()
  File "/env/mango/util/multiprocessing.py", line 184, in drain
    await self._stream_writer.drain()
  File "/env/lib/python3.12/asyncio/streams.py", line 392, in drain
    await self._protocol._drain_helper()
  File "/env/lib/python3.12/asyncio/streams.py", line 166, in _drain_helper
    raise ConnectionResetError('Connection lost')
ConnectionResetError: Connection lost
ERROR:mango.container.mp:The Send Message Task Loop has failed!

I don't quite know how this happened, but I want to document the issue. I tried to create a minimal example, but the error did not occur there.

rcschrg commented 4 days ago

I can see this happening. I guess we should do a two-phase shutdown. First, we shut down the agents in the mirror containers and the main container; the main container gets notified when the mirror containers have shut down. Then we shut down the connection, and only after that do we terminate the process.
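
A minimal sketch of that ordering, written with plain asyncio and not with mango's actual API. Everything named here (`Link`, `STOP`, `STOPPED`, `mirror_container`, `two_phase_shutdown`) is hypothetical: `Link` is an in-memory stand-in for the pipe between containers, and an asyncio task stands in for the agent process. It only illustrates why the acknowledgement has to arrive before the connection is closed.

```python
import asyncio

STOP = "stop"        # hypothetical control message: "shut down your agents"
STOPPED = "stopped"  # hypothetical acknowledgement from the mirror container


class Link:
    """Hypothetical in-memory stand-in for the pipe between two containers."""

    def __init__(self) -> None:
        self._queue: asyncio.Queue = asyncio.Queue()
        self.closed = False

    async def send(self, msg: str) -> None:
        # Sending on a closed link fails, similar to the drain() error in the trace.
        if self.closed:
            raise ConnectionResetError("Connection lost")
        await self._queue.put(msg)

    async def recv(self) -> str:
        return await self._queue.get()

    def close(self) -> None:
        self.closed = True


async def mirror_container(to_mirror: Link, to_main: Link, events: list) -> None:
    """Stand-in for the mirror container that would live in the agent process."""
    msg = await to_mirror.recv()
    if msg == STOP:
        await asyncio.sleep(0.05)  # simulate a slow on_stop handler
        events.append("agents stopped")
        await to_main.send(STOPPED)  # would raise if the link were already closed


async def two_phase_shutdown(events: list) -> None:
    to_mirror, to_main = Link(), Link()
    worker = asyncio.create_task(mirror_container(to_mirror, to_main, events))

    # Phase 1: ask the mirror container to stop its agents and wait for the ack.
    await to_mirror.send(STOP)
    ack = await asyncio.wait_for(to_main.recv(), timeout=5)
    if ack != STOPPED:
        raise RuntimeError(f"unexpected reply: {ack!r}")

    # Phase 2: close the connection first, then end the worker.
    to_mirror.close()
    to_main.close()
    events.append("connection closed")
    await worker  # in a real setup this would be joining/terminating the process
    events.append("process terminated")


def run_demo() -> list:
    events: list = []
    asyncio.run(two_phase_shutdown(events))
    return events
```

Calling `run_demo()` returns the events in the order they happened. If the links were closed before the acknowledgement arrived, the `send` in `mirror_container` would raise `ConnectionResetError`, which is the kind of failure the stacktrace above shows.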