DUNE-DAQ / drunc

Dune RUN Control (DRUNC) is the run control for the DUNE experiment

Restart using SSH #50

Closed TiagoTAlves closed 6 months ago

TiagoTAlves commented 6 months ago

When restarting a process using SSH, the following is logged:

[11:15:30] INFO     "Broadcast": 'DRUNC_EXCEPTION_RAISED' 'DruncCommandException' exception thrown: Process f9f8ee93-fda6-49fb-b7ea-e8dcafcd98e3 already exists!                                        kafka_stdout_broadcast_handler.py:83
           ERROR    "process_manager_driver": 'DruncCommandException' exception thrown: Process f9f8ee93-fda6-49fb-b7ea-e8dcafcd98e3 already exists!                                                                       shell_utils.py:97
           INFO     "Broadcast": 'SUBPROCESS_STATUS_UPDATE' Process 'ru-controller' (session: 'test-session', user: 'titavare') process exited with exit code 255                                       kafka_stdout_broadcast_handler.py:83

The process then just dies.

TiagoTAlves commented 6 months ago

If you then try to restart it again now that the process is dead, it gives the exact same error:

[11:18:22] INFO     "Broadcast": 'DRUNC_EXCEPTION_RAISED' 'DruncCommandException' exception thrown: Process f9f8ee93-fda6-49fb-b7ea-e8dcafcd98e3 already exists!                                        kafka_stdout_broadcast_handler.py:83
           ERROR    "process_manager_driver": 'DruncCommandException' exception thrown: Process f9f8ee93-fda6-49fb-b7ea-e8dcafcd98e3 already exists!                                                                       shell_utils.py:97
           INFO     "Broadcast": 'SUBPROCESS_STATUS_UPDATE' Process 'ru-controller' (session: 'test-session', user: 'titavare') process exited with exit code 255                                       kafka_stdout_broadcast_handler.py:83
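
One possible reading of these logs, which the issue itself does not confirm, is that the restart tries to register the process under its old UUID while the old entry is still being tracked, even after the process has died. The toy sketch below reproduces that pattern in isolation. `ToyProcessRegistry`, `ProcessAlreadyExists`, and all method names are hypothetical and are not taken from the drunc code base.

```python
class ProcessAlreadyExists(Exception):
    """Hypothetical error, standing in for the 'already exists' failure in the logs."""


class ToyProcessRegistry:
    """Toy bookkeeping keyed by process UUID. An assumption for illustration, not drunc code."""

    def __init__(self):
        self._status = {}  # uuid -> "running" or "dead"

    def boot(self, uuid):
        # Refuse to register a UUID that is still tracked, whether running or dead.
        if uuid in self._status:
            raise ProcessAlreadyExists(f"Process {uuid} already exists!")
        self._status[uuid] = "running"

    def mark_dead(self, uuid):
        # The entry stays in the registry after the process exits.
        self._status[uuid] = "dead"

    def status(self, uuid):
        return self._status.get(uuid)

    def restart_naive(self, uuid):
        # Reuses the UUID without clearing the old entry, so boot() always raises.
        self.boot(uuid)

    def restart_with_cleanup(self, uuid):
        # Drops the stale entry first, then registers the same UUID again.
        self._status.pop(uuid, None)
        self.boot(uuid)


if __name__ == "__main__":
    registry = ToyProcessRegistry()
    uuid = "f9f8ee93-fda6-49fb-b7ea-e8dcafcd98e3"
    registry.boot(uuid)

    for attempt in ("while running", "after death"):
        try:
            registry.restart_naive(uuid)
        except ProcessAlreadyExists as exc:
            print(f"naive restart {attempt}: {exc}")
        registry.mark_dead(uuid)

    registry.restart_with_cleanup(uuid)
    print("restart with cleanup:", registry.status(uuid))
```

In this sketch the naive restart fails both while the process is running and after it has died, which matches the two attempts reported above. Whether the real cause lies in the SSH process manager's bookkeeping would need to be checked against the drunc source.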