SCALE-MS / scale-ms

SCALE-MS design and development
GNU Lesser General Public License v2.1

Add exception handling for Raptor Master #375

Open eirrgang opened 1 year ago

eirrgang commented 1 year ago

If the raptor Master task is canceled, or processes a stop() early in execution, interrupted RP calls can raise exceptions such as:

Traceback (most recent call last):
  File "/home/runner/testenv/lib/python3.11/site-packages/scalems/radical/raptor/__main__.py", line 28, in <module>
    sys.exit(raptor())
             ^^^^^^^^
  File "/home/runner/testenv/lib/python3.11/site-packages/scalems/radical/raptor/__init__.py", line 982, in raptor
    _raptor.wait_workers(count=1)
  File "/home/runner/testenv/lib/python3.11/site-packages/radical/pilot/raptor/master.py", line 491, in wait_workers
    raise RuntimeError('wait interrupted by master termination')
RuntimeError: wait interrupted by master termination

Where we can, we need to keep the Master task script from exiting with an unhandled exception.
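One possible shape for this, as a minimal sketch rather than the actual scalems code: wrap the wait_workers() call, treat the termination RuntimeError from the traceback as an expected early stop, and re-raise anything else. FakeMaster and wait_for_workers are hypothetical names used only for illustration, standing in for the real RP Master object. Matching on the message text is an assumption made because the traceback shows a bare RuntimeError; a dedicated exception type or a state check would be more robust if RP offers one.

```python
import logging

logger = logging.getLogger("raptor_master_sketch")

# Message text copied from the traceback in this issue.
TERMINATION_MESSAGE = "wait interrupted by master termination"


class FakeMaster:
    """Hypothetical stand-in for the RP raptor Master (not the real class)."""

    def __init__(self, terminated: bool):
        self._terminated = terminated

    def wait_workers(self, count: int) -> None:
        # Mimic the failure mode reported above when the Master is stopped early.
        if self._terminated:
            raise RuntimeError(TERMINATION_MESSAGE)


def wait_for_workers(master, count: int = 1) -> bool:
    """Return True if workers came up, False if the Master was stopped first.

    Any other RuntimeError is re-raised so that real failures stay visible.
    """
    try:
        master.wait_workers(count=count)
    except RuntimeError as e:
        # Assumption: the message is the only way to recognize this case.
        if TERMINATION_MESSAGE in str(e):
            logger.warning("Master stopped before workers were ready: %s", e)
            return False
        raise
    return True
```

The caller could then skip the rest of the Master's work and exit normally when wait_for_workers() returns False, instead of letting the exception propagate to sys.exit().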

Supports #335

eirrgang commented 1 year ago

Blocked by #377