Closed by plasorak 1 month ago
I'm not sure if I understand what I should see with these changes...
Without them, I see drunc-controller processes hang around for about 30 seconds after drunc exits when I run an interactive DAQ session on daq.fnal.gov.
With them, the drunc-controller processes hang around for less than 10 seconds, but they are still there when drunc exits.
Am I looking at the wrong thing?
If not, shouldn't success be indicated by no drunc-controller processes running when the drunc-interactive-shell exits?
You are not. On the np04 cluster, they still exist for around 2 seconds.
I don't think it's trivial to add a check to make sure there are no processes left when drunc exits: the process manager sends SIGHUP to the processes when it exits, but it does not track their PIDs.
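To make the difficulty concrete, here is a minimal sketch of what such a check could look like if the PIDs were tracked. It is not drunc's process manager: `TrackedProcesses`, `spawn`, `shutdown`, and `alive` are hypothetical names, and it assumes the children are started locally with `subprocess.Popen` on a POSIX system.

```python
import signal
import subprocess
import sys
import time


class TrackedProcesses:
    """Hypothetical helper (not drunc code): remembers every child it starts."""

    def __init__(self):
        self._children = []

    def spawn(self, argv):
        # Keep the Popen handle so the child can be checked on later.
        proc = subprocess.Popen(argv)
        self._children.append(proc)
        return proc

    def alive(self):
        # PIDs of tracked children that have not exited yet.
        return [p.pid for p in self._children if p.poll() is None]

    def shutdown(self, grace_seconds=5.0):
        """Send SIGHUP to every live child, wait, then SIGKILL stragglers.

        Returns the PIDs that had to be killed.
        """
        for proc in self._children:
            if proc.poll() is None:
                proc.send_signal(signal.SIGHUP)

        deadline = time.monotonic() + grace_seconds
        killed = []
        for proc in self._children:
            remaining = max(0.0, deadline - time.monotonic())
            try:
                proc.wait(timeout=remaining)
            except subprocess.TimeoutExpired:
                # Child ignored or outlived SIGHUP: force it down.
                proc.kill()
                proc.wait()
                killed.append(proc.pid)
        return killed


if __name__ == "__main__":
    tracker = TrackedProcesses()
    tracker.spawn([sys.executable, "-c", "import time; time.sleep(30)"])
    tracker.shutdown(grace_seconds=2.0)
    print("still alive:", tracker.alive())
```

With this shape, "no processes left at exit" becomes the assertion `tracker.alive() == []` after `shutdown()` returns. Processes started on remote hosts would need a different mechanism, which is part of why this is not trivial.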
Thanks for the update.
I think that it's very important to not leave processes hanging around when run control exits. Should I file a separate Issue for that?
Yeah, I think so. This is quite a bit more complicated than what I envisaged.
This PR corrects a typo in the `data_type` for the `RunControlMessage`. More importantly, it changes the behaviour if we can't reach the connectivity server on `retract`: if that's the case (meaning the connectivity server has probably been killed before the controller), we abort.

Fixes https://github.com/DUNE-DAQ/drunc/issues/204, again
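As a rough illustration of the behaviour described above, the sketch below gives up on the retraction as soon as the server cannot be reached, rather than retrying. It is not the code from this PR: `retract_on_shutdown` is a hypothetical name, the `retract` argument stands in for whatever call talks to the connectivity server, and it assumes that an unreachable server surfaces as Python's built-in `ConnectionError`.

```python
import logging

log = logging.getLogger("retract_sketch")


def retract_on_shutdown(retract):
    """Call `retract()` once; abort if the server cannot be reached.

    `retract` is any zero-argument callable (hypothetical stand-in).
    Returns True if the retraction succeeded, False if it was aborted.
    """
    try:
        retract()
    except ConnectionError as exc:
        # Assumption: the server was probably killed before the controller,
        # so retrying would only delay the controller's own exit.
        log.warning("Connectivity server unreachable, aborting retract: %s", exc)
        return False
    return True
```

Returning a flag instead of raising lets the caller carry on with the rest of its shutdown. Whether the real controller logs, raises, or exits at this point is not stated in the description.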