art-daq / artdaq_daqinterface

DAQInterface should allow overriding the number of launch checks #15

Closed eflumerf closed 2 years ago

eflumerf commented 2 years ago

This issue has been migrated from https://cdcvs.fnal.gov/redmine/issues/25565 (FNAL account required).
Originally created by @eflumerf on 2021-02-25 20:55:25


Currently, DAQInterface checks 5 times, with 2 seconds between each check, for the artdaq processes to be launched across the entire system. Sometimes, however, ssh and other issues can cause the system to take longer than 10 seconds to boot, and it would be nice to have a way to transparently increase the number of "processes launched" checks and/or the wait between checks.
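
For illustration only, the kind of check loop described above could be sketched as follows. This is not the DAQInterface source: the function name `wait_for_processes`, its parameters, and the `count_running` callback are all hypothetical, and the defaults simply mirror the "5 checks, 2 seconds apart" behavior described in this issue.

```python
import time
from typing import Callable


def wait_for_processes(
    count_running: Callable[[], int],
    expected: int,
    max_checks: int = 5,          # hypothetical name; default taken from the issue text
    wait_seconds: float = 2.0,    # hypothetical name; default taken from the issue text
    sleep: Callable[[float], None] = time.sleep,
) -> bool:
    """Poll until `expected` processes are found or the checks run out."""
    for check in range(1, max_checks + 1):
        print(f"Checking that processes are up (check {check} of a max of {max_checks} checks)...")
        found = count_running()
        print(f"found {found} of {expected} processes.")
        if found >= expected:
            print("All processes appear to be up")
            return True
        # Only wait if another check is still to come.
        if check < max_checks:
            sleep(wait_seconds)
    return False


if __name__ == "__main__":
    # Simulated process counts; a real caller would query the hosts instead.
    counts = iter([1, 1, 1, 4])
    wait_for_processes(lambda: next(counts), expected=4, max_checks=11, wait_seconds=0.1)
```

Making both the number of checks and the wait between them parameters is what lets a slow system be accommodated without changing the loop itself.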

eflumerf commented 2 years ago

Comment by @eflumerf on 2021-02-25 20:56:47


Implemented on artdaq-utilities-daqinterface:feature/25565_ConfigurableLaunchChecks

eflumerf commented 2 years ago

Comment by @gennadiy-fnal on 2021-02-25 21:36:06


Branch feature/25565_ConfigurableLaunchChecks was tested on the DAB cluster with the "max_launch_checks: 11" and "launch_procs_wait_time: 44" options added to the user_settings file; see the run records for run #2619 on the DAB cluster for details.

Launching the artdaq processes
%MSG

%MSG-i DAQInterface_partition_1:  DAQ 25-Feb-2021 15:28:36 CST Booted swig_artdaq.cpp:72
Checking that processes are up (check 1 of a max of 11 checks)...
%MSG
%MSG-i DAQInterface_partition_1:  DAQ 25-Feb-2021 15:28:37 CST Booted swig_artdaq.cpp:72
found 1 of 4 processes.
%MSG
%MSG-i DAQInterface_partition_1:  DAQ 25-Feb-2021 15:28:41 CST Booted swig_artdaq.cpp:72
Checking that processes are up (check 2 of a max of 11 checks)...
%MSG
%MSG-i DAQInterface_partition_1:  DAQ 25-Feb-2021 15:28:41 CST Booted swig_artdaq.cpp:72
found 1 of 4 processes.
%MSG
%MSG-i DAQInterface_partition_1:  DAQ 25-Feb-2021 15:28:45 CST Booted swig_artdaq.cpp:72
Checking that processes are up (check 3 of a max of 11 checks)...
%MSG
%MSG-i DAQInterface_partition_1:  DAQ 25-Feb-2021 15:28:46 CST Booted swig_artdaq.cpp:72
found 1 of 4 processes.
%MSG
%MSG-i DAQInterface_partition_1:  DAQ 25-Feb-2021 15:28:50 CST Booted swig_artdaq.cpp:72
Checking that processes are up (check 4 of a max of 11 checks)...
%MSG
%MSG-i DAQInterface_partition_1:  DAQ 25-Feb-2021 15:28:51 CST Booted swig_artdaq.cpp:72
found 4 of 4 processes.
%MSG
%MSG-i DAQInterface_partition_1:  DAQ 25-Feb-2021 15:28:51 CST Booted swig_artdaq.cpp:72
All processes appear to be up
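
As a rough way to read the spacing between checks out of a log like the one above, a small script can pair each "Checking that processes are up" message with the timestamp on the preceding `%MSG-i` header line. This is only a sketch: the function name `check_intervals` is hypothetical, the excerpt is copied from the log above, and the parsing assumes all timestamps fall on the same day.

```python
import re
from datetime import datetime

# Excerpt of the log above: each header line is followed by its message line.
LOG = """\
%MSG-i DAQInterface_partition_1:  DAQ 25-Feb-2021 15:28:36 CST Booted swig_artdaq.cpp:72
Checking that processes are up (check 1 of a max of 11 checks)...
%MSG-i DAQInterface_partition_1:  DAQ 25-Feb-2021 15:28:41 CST Booted swig_artdaq.cpp:72
Checking that processes are up (check 2 of a max of 11 checks)...
%MSG-i DAQInterface_partition_1:  DAQ 25-Feb-2021 15:28:45 CST Booted swig_artdaq.cpp:72
Checking that processes are up (check 3 of a max of 11 checks)...
%MSG-i DAQInterface_partition_1:  DAQ 25-Feb-2021 15:28:50 CST Booted swig_artdaq.cpp:72
Checking that processes are up (check 4 of a max of 11 checks)...
"""

TIME_RE = re.compile(r"\b(\d{2}:\d{2}:\d{2})\b")


def check_intervals(log_text: str) -> list[int]:
    """Return the seconds elapsed between consecutive 'Checking...' messages."""
    stamps = []
    last = None
    for line in log_text.splitlines():
        match = TIME_RE.search(line)
        if line.startswith("%MSG-") and match:
            # Remember the time-of-day from the most recent header line.
            last = datetime.strptime(match.group(1), "%H:%M:%S")
        elif line.startswith("Checking that processes are up") and last is not None:
            stamps.append(last)
    return [int((b - a).total_seconds()) for a, b in zip(stamps, stamps[1:])]


if __name__ == "__main__":
    print(check_intervals(LOG))  # prints [5, 4, 5]
```

Intervals of 4 to 5 seconds would be consistent with the 44-second setting being divided across the 11 checks, but that reading is an inference from the timestamps and is not stated in this thread.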