zauberzeug / rosys

An all-Python robot system based on web technologies. The purpose is similar to ROS, but it's based on NiceGUI and easier to use for mobile robotics.
https://rosys.io
MIT License

`pynmea2.parse` can break the process pool #162

Closed rodja closed 2 months ago

rodja commented 3 months ago

When using pynmea2 to parse GNSS messages, we experienced a severe problem.

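import pynmea2
import rosys
from nicegui import run

async def test():
    # 'FOOBAR' is not a valid NMEA sentence, so parsing fails inside the worker process
    await run.cpu_bound(pynmea2.parse, 'FOOBAR')

rosys.on_repeat(test, 1)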

This code breaks the process pool:

Traceback (most recent call last):
  File "/Users/rodja/Projects/rosys/rosys/rosys.py", line 158, in _repeat
    await invoke(self.handler)
  File "/Users/rodja/Projects/rosys/rosys/helpers/__init__.py", line 37, in invoke
    result = await result
             ^^^^^^^^^^^^
  File "/Users/rodja/Projects/rosys/test.py", line 37, in test
    await run.cpu_bound(pynmea2.parse, 'FOOBAR')
  File "/Users/rodja/Projects/nicegui/nicegui/run.py", line 48, in cpu_bound
    return await _run(process_pool, callback, *args, **kwargs)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/Users/rodja/Projects/nicegui/nicegui/run.py", line 31, in _run
    return await loop.run_in_executor(executor, partial(callback, *args, **kwargs))
                 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "uvloop/loop.pyx", line 2729, in uvloop.loop.Loop.run_in_executor
  File "/opt/homebrew/Cellar/python@3.11/3.11.7_1/Frameworks/Python.framework/Versions/3.11/lib/python3.11/concurrent/futures/process.py", line 791, in submit
    raise BrokenProcessPool(self._broken)
concurrent.futures.process.BrokenProcessPool: A child process terminated abruptly, the process pool is not usable anymore
finished garbage collection
error in "test"
Traceback (most recent call last):
  File "/Users/rodja/Projects/rosys/rosys/rosys.py", line 158, in _repeat
    await invoke(self.handler)
  File "/Users/rodja/Projects/rosys/rosys/helpers/__init__.py", line 37, in invoke
    result = await result
             ^^^^^^^^^^^^
  File "/Users/rodja/Projects/rosys/test.py", line 37, in test
    await run.cpu_bound(pynmea2.parse, 'FOOBAR')
  File "/Users/rodja/Projects/nicegui/nicegui/run.py", line 48, in cpu_bound
    return await _run(process_pool, callback, *args, **kwargs)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/Users/rodja/Projects/nicegui/nicegui/run.py", line 31, in _run
    return await loop.run_in_executor(executor, partial(callback, *args, **kwargs))
                 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "uvloop/loop.pyx", line 2729, in uvloop.loop.Loop.run_in_executor
  File "/opt/homebrew/Cellar/python@3.11/3.11.7_1/Frameworks/Python.framework/Versions/3.11/lib/python3.11/concurrent/futures/process.py", line 791, in submit
    raise BrokenProcessPool(self._broken)
concurrent.futures.process.BrokenProcessPool: A child process terminated abruptly, the process pool is not usable anymore

We were able to work around it by not using run.cpu_bound for the parsing. I could not reproduce it in tests, but the combination of run.cpu_bound and pynmea2.parse also resulted in random shutdowns of RoSys.

rodja commented 3 months ago

I fixed the issue in https://github.com/zauberzeug/nicegui/pull/2234. But instead of doing the same in RoSys, I would rather we implement #43.
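
For readers who want a stopgap in their own code, one general approach is to catch the exception inside the worker and re-raise it as a plain, picklable one. The sketch below is not taken from the linked pull request; `WorkerError` and `safe_call` are hypothetical names chosen for illustration.

```python
import traceback


class WorkerError(Exception):
    """Hypothetical exception that carries the original error as plain text."""


def safe_call(func, *args, **kwargs):
    """Call func and convert any exception into a WorkerError.

    Meant to run inside the worker process, so that only an exception
    with a single string argument has to cross the process boundary.
    """
    try:
        return func(*args, **kwargs)
    except Exception as e:
        # Keep the original type, message and traceback as text only.
        details = traceback.format_exc()
        raise WorkerError(f'{type(e).__name__}: {e}\n{details}') from None
```

Assuming `run.cpu_bound` forwards positional arguments as the traceback above suggests, a call might then look like `await run.cpu_bound(safe_call, pynmea2.parse, 'FOOBAR')`. The wrapper has to live at module level so that it can itself be pickled.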

rodja commented 3 months ago

Explanation: pynmea2 has a ParseError that takes two constructor arguments but derives from ValueError, which receives only one. It seems that pickling/unpickling such an exception breaks the process pool.
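
The effect described above can be reproduced without pynmea2 or a process pool. The following is a minimal sketch using only the standard library; `TwoArgError` is a hypothetical stand-in written to match the description, not a copy of pynmea2's code, and `PicklableError` is one assumed way to avoid the mismatch.

```python
import pickle


class TwoArgError(ValueError):
    """Hypothetical stand-in: two constructor arguments, one passed to the base."""

    def __init__(self, message, data):
        # Only one value reaches ValueError, so self.args has length 1.
        super().__init__((message, data))


class PicklableError(ValueError):
    """Variant whose self.args matches the constructor signature."""

    def __init__(self, message, data):
        super().__init__(message, data)


def roundtrip(exc):
    """Pickle and unpickle an exception, as happens when a worker reports an error."""
    return pickle.loads(pickle.dumps(exc))


def roundtrip_fails(exc):
    """Return True if unpickling re-invokes the constructor with the wrong arity."""
    try:
        roundtrip(exc)
    except TypeError:
        return True
    return False
```

Exceptions are unpickled by calling the class with `self.args`. For `TwoArgError` that is a single tuple, so the constructor is missing its `data` argument and raises a `TypeError`. If that happens while the parent process reads a worker's result, it is a plausible trigger for the `BrokenProcessPool` error shown in the traceback.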