pcdshub / engineering_tools

A repository of scripts, configuration useful for the PCDS team

ENH Add option to specify CNF for LCLS2 hutches. Restart/stop AMI only for LCLS2 Hutches. #216

Closed gadorlhiac closed 1 month ago

gadorlhiac commented 1 month ago

Description

Adds a --cnf option to run_daq_utils.py which is passed to daq_utils.DaqManager. This allows us to indicate which cnf file to use for restarting/stopping the DAQ in LCLS2 hutches.
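As a rough illustration of how such an option might be threaded through to the manager, here is a minimal sketch. It is not the code from this PR: `SketchDaqManager` is a hypothetical stand-in for daq_utils.DaqManager (whose real constructor is not shown here), and the positional `command` argument is an assumption based on how daqutils is invoked in the test diff further down.

```python
import argparse


class SketchDaqManager:
    """Hypothetical stand-in for daq_utils.DaqManager (real signature not shown in this PR)."""

    def __init__(self, cnf=None):
        # None is taken here to mean "fall back to the hutch's default cnf".
        self.cnf = cnf


def build_parser():
    parser = argparse.ArgumentParser(prog="run_daq_utils.py")
    # The option name comes from the PR description; the help text is illustrative.
    parser.add_argument("--cnf", default=None, help="cnf file to use for the DAQ")
    # Assumed positional sub-command, e.g. "restartdaq" or "stopdaq".
    parser.add_argument("command")
    return parser


def main(argv=None):
    args = build_parser().parse_args(argv)
    # Forward the parsed option to the manager unchanged.
    manager = SketchDaqManager(cnf=args.cnf)
    return manager, args.command
```

With this shape, `main(["--cnf", "txi_ami.py", "restartdaq"])` hands `txi_ami.py` to the manager, while omitting `--cnf` leaves the choice of cnf to the manager's default.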

Currently, this is used for:

NOTE: After this PR the startami and stopami scripts will perform the following depending on hutch:

For AMI2 running in LCLS1 hutches, a separate script is needed, such as the one being introduced in #212.

NOTE: stopami was previously not working for LCLS1 hutches (startami was). This PR introduces a minor fix for that.

For starting and stopping AMI in LCLS2 hutches, a new standard has been introduced where each hutch will have a $HUTCH_ami.py cnf file. This cnf inherits only the AMI processes from the main CNF. This allows it to be used to restart AMI independently of the rest of the DAQ processes.

Examples of this cnf are:

RIX:

rix-daq:scripts> ll rix_ami.py
-rw-rw-r-- 1 rixopr xs 103 oct 23 13:43 rix_ami.py
rix-daq:scripts> cat rix_ami.py
from rix import *
from psdaq.slurm.config import Config
config = Config({})
config.extend(procmgr_ami)

TMO:

tmo-daq:scripts> cat tmo_ami.py
from tmo import *
from psdaq.slurm.config import Config
config = Config({})
config.extend(procmgr_ami)

TXI:

txi-daq:scripts> cat txi_ami.py
from txi import *
from psdaq.slurm.config import Config
config = Config({})
config.extend(procmgr_ami)
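The naming convention in the three files above can be captured in a small helper. This is only a sketch under stated assumptions: the `/reg/g/pcds/dist/pds/<hutch>/scripts/` layout is taken from the test output in this PR, the function names `ami_cnf_name` and `cnf_path` are hypothetical, and the real tools may resolve paths differently.

```python
from pathlib import PurePosixPath

# Assumed root, inferred from the daqmgr command lines in the test logs.
SCRIPTS_ROOT = PurePosixPath("/reg/g/pcds/dist/pds")


def ami_cnf_name(hutch):
    """Return the AMI-only cnf file name for a hutch, e.g. 'rix_ami.py'."""
    return f"{hutch.lower()}_ami.py"


def cnf_path(hutch, cnf=None):
    """Return the full path of a cnf; defaults to the hutch's main '<hutch>.py'."""
    hutch = hutch.lower()
    name = cnf if cnf is not None else f"{hutch}.py"
    return SCRIPTS_ROOT / hutch / "scripts" / name
```

For example, `cnf_path("rix", ami_cnf_name("rix"))` yields the AMI-only cnf, and `cnf_path("rix")` yields the main one.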

Motivation and Context

Initially motivated by the need to have a standard way of restarting AMI (and AMI ONLY) across the LCLS2 hutches. This PR's cnf feature also provides a mechanism to allow restarting QRIX and ChemRIX DAQs independently from the same hutch if they are differentiated by different configuration files.

How Has This Been Tested?

Testing startami and stopami in LCLS2 hutches

Tested by starting the DAQ with the relevant paths changed to point to the modified daqutils, ensuring that the modified version is called. The specific changes (not included in the PR) are shown here for startami, as an example:

--- a/scripts/startami
+++ b/scripts/startami
@@ -39,8 +39,11 @@ if [[ `whoami` != *'opr'* ]]; then
     echo "Please run ami from the operator account!"
     exit
 fi
-
 HUTCH=`get_hutch_name`
+if [ "$(/cds/home/opr/txiopr/scripts/engineering_tools/scripts/daqutils isdaqmgr)" = "true" ]; then
+    /cds/home/opr/txiopr/scripts/engineering_tools/scripts/daqutils --cnf ${HUTCH}_ami.py restartdaq $@
+    exit 0
+fi
 EXPNAME=`get_curr_exp`
 CNFEXT=.cnf

@@ -104,3 +107,4 @@ fi

 echo $ami_path$amicmd
 exec $ami_path$amicmd&

This was needed to prevent picking up the standard/central installation of engineering_tools. There were other alternatives (modifying paths, startup shell scripts for the opr accounts, etc.).

The checkout of this branch was located at ~txiopr/scripts/engineering_tools for all testing.

This was tested in the LCLS2 hutches running from the $HUTCH-daq machines using the operator account.
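The branch added to startami in the diff above amounts to a simple dispatch, sketched here in Python for clarity. This is an illustration, not part of the PR: the function name `startami_command` and its parameters are hypothetical, and `is_daqmgr` stands for the result of comparing the output of `daqutils isdaqmgr` with `"true"`.

```python
def startami_command(hutch, is_daqmgr, extra_args=(), daqutils="daqutils"):
    """Return the daqutils argument list for daqmgr hutches, else None.

    None signals that the caller should fall through to the existing
    (LCLS1) code path further down in the script.
    """
    if not is_daqmgr:
        return None
    # Mirrors: daqutils --cnf ${HUTCH}_ami.py restartdaq $@
    return [daqutils, "--cnf", f"{hutch}_ami.py", "restartdaq", *extra_args]
```

Returning `None` for non-daqmgr hutches mirrors the shell script, where the new block is skipped and the original logic runs unchanged.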

Note that the startami scripts pick up different cnf files.

TXI:

txi-daq:scripts> ./startami
DAQ is not running in txi
+ /cds/home/opr/txiopr/git/lcls2_100824/install/bin/daqmgr restart /reg/g/pcds/dist/pds/txi/scripts/txi_ami.py

took 6.5332s. for starting the DAQ
txi-daq:scripts> ./stopami
+ /cds/home/opr/txiopr/git/lcls2_100824/install/bin/daqmgr stop /reg/g/pcds/dist/pds/txi/scripts/txi_ami.py

txi-daq:scripts> ./restartdaq
DAQ is not running in txi
+ /cds/home/opr/txiopr/git/lcls2_100824/install/bin/daqmgr restart /reg/g/pcds/dist/pds/txi/scripts/txi.py

took 2.6325s. for starting the DAQ
txi-daq:scripts> stopdaq
+ /cds/home/opr/txiopr/git/lcls2_100824/install/bin/daqmgr stop /reg/g/pcds/dist/pds/txi/scripts/txi.py

RIX:

rix-daq:scripts> ~txiopr/scripts/engineering_tools/scripts/startami
DAQ is not running in rix
+ /cds/home/opr/rixopr/git/lcls2_101824/install/bin/daqmgr restart /reg/g/pcds/dist/pds/rix/scripts/rix_ami.py

took 3.8405s. for starting the DAQ
rix-daq:scripts> ~txiopr/scripts/engineering_tools/scripts/stopami
+ /cds/home/opr/rixopr/git/lcls2_101824/install/bin/daqmgr stop /reg/g/pcds/dist/pds/rix/scripts/rix_ami.py

rix-daq:scripts> ~txiopr/scripts/engineering_tools/scripts/restartdaq
DAQ is not running in rix
+ /cds/home/opr/rixopr/git/lcls2_101824/install/bin/daqmgr restart /reg/g/pcds/dist/pds/rix/scripts/rix.py

took 2.8847s. for starting the DAQ

TMO:

tmo-daq:scripts> ~txiopr/scripts/engineering_tools/scripts/startami
DAQ is not running in tmo
+ /cds/home/opr/tmoopr/git/lcls2_102224/install/bin/daqmgr restart /reg/g/pcds/dist/pds/tmo/scripts/tmo_ami.py

took 3.4237s. for starting the DAQ
tmo-daq:scripts> ~txiopr/scripts/engineering_tools/scripts/stopami
+ /cds/home/opr/tmoopr/git/lcls2_102224/install/bin/daqmgr stop /reg/g/pcds/dist/pds/tmo/scripts/tmo_ami.py

tmo-daq:scripts> ~txiopr/scripts/engineering_tools/scripts/restartdaq
DAQ is not running in tmo
+ /cds/home/opr/tmoopr/git/lcls2_102224/install/bin/daqmgr restart /reg/g/pcds/dist/pds/tmo/scripts/tmo.py

took 2.9897s. for starting the DAQ

Verifying startami and stopami in LCLS1 hutches

The code affecting LCLS1 hutches was modified to bring it in line with spellcheck requirements. The stopami script was previously not working and now is.

MFX:

mfx-daq:~> ~txiopr/scripts/engineering_tools/scripts/stopami
killing  332
mfx-daq:~> ~txiopr/scripts/engineering_tools/scripts/startami
Do you really intend to restart the ami_client on DAQ is running on mfx-daq? (y/n)y
Restarting the ami_client...
/reg/g/pcds/dist/pds/mfx/current/tools/procmgr/procmgr: using config file '/reg/g/pcds/dist/pds/mfx/scripts/p0.cnf.running' to stop
Current experiment is mfxl1039823
/reg/g/pcds/dist/pds/mfx/current/tools/procmgr/procmgr: using config file '/reg/g/pcds/dist/pds/mfx/scripts/p0.cnf.running' to start
Current experiment is mfxl1039823

XCS:

xcs-daq:~> ~txiopr/scripts/engineering_tools/scripts/stopami
killing  20465
xcs-daq:~> ~txiopr/scripts/engineering_tools/scripts/startami
ldpathmunge: /reg/neh/operator/xcsopr/online/ami_plugins is not a directory
Do you really intend to restart the ami_client on DAQ is running on xcs-daq? (y/n)y
Restarting the ami_client...
/reg/g/pcds/dist/pds/xcs/current/tools/procmgr/procmgr: using config file '/reg/g/pcds/dist/pds/xcs/scripts/p0.cnf.running' to stop
Current experiment is xcsx1015123
/reg/g/pcds/dist/pds/xcs/current/tools/procmgr/procmgr: using config file '/reg/g/pcds/dist/pds/xcs/scripts/p0.cnf.running' to start
Current experiment is xcsx1015123

Verifying restartdaq -C <cnf> and stopdaq

TXI:

txi-daq:scripts> ./restartdaq -C txi_ami.py
DAQ is not running in txi
+ /cds/home/opr/txiopr/git/lcls2_100824/install/bin/daqmgr restart /reg/g/pcds/dist/pds/txi/scripts/txi_ami.py

took 6.6249s. for starting the DAQ

RIX:

NOTE: stopdaq will try to stop rix.py, but this is fine since both qrix.py and crix.py are derived from it.

rix-daq:scripts> ~txiopr/scripts/engineering_tools/scripts/restartdaq -C qrix.py
DAQ is not running in rix
+ /cds/home/opr/rixopr/git/lcls2_101824/install/bin/daqmgr restart /reg/g/pcds/dist/pds/rix/scripts/qrix.py
Warning: no qrix_w8_0 found in main_config

took 5.0917s. for starting the DAQ
rix-daq:scripts> ~txiopr/scripts/engineering_tools/scripts/restartdaq -C crix.py
DAQ is not running in rix
+ /cds/home/opr/rixopr/git/lcls2_101824/install/bin/daqmgr restart /reg/g/pcds/dist/pds/rix/scripts/crix.py

took 3.1730s. for starting the DAQ

(ps-4.6.3) rix-daq:scripts> stopdaq
+ /cds/home/opr/rixopr/git/lcls2_101824/install/bin/daqmgr stop /reg/g/pcds/dist/pds/rix/scripts/rix.py

Verifying restartdaq spellcheck changes haven't affected LCLS1 hutches

MFX:

mfx-daq:~> ~txiopr/scripts/engineering_tools/scripts/restartdaq
DAQ is currently not running
start DAQ on mfx-daq
/reg/g/pcds/dist/pds/mfx/current/tools/procmgr/procmgr: using config file '/reg/g/pcds/dist/pds/mfx/scripts/mfx.cnf' to start
Current experiment is mfxl1039823
and 8.8318 for starting the DAQ
mfx-daq:~> ~txiopr/scripts/engineering_tools/scripts/stopdaq
stop the DAQ from mfx-daq
/reg/g/pcds/dist/pds/mfx/current/tools/procmgr/procmgr: using config file '/reg/g/pcds/dist/pds/mfx/scripts/p0.cnf.running' to stop
Current experiment is mfxl1039823

XCS:

xcs-daq:~> ~txiopr/scripts/engineering_tools/scripts/restartdaq
DAQ is currently not running
start DAQ on xcs-daq
/reg/g/pcds/dist/pds/xcs/current/tools/procmgr/procmgr: using config file '/reg/g/pcds/dist/pds/xcs/scripts/xcs.cnf' to start
Current experiment is xcsx1015123
ERR: no restart message...
and 7.7061 for starting the DAQ
xcs-daq:~> ~txiopr/scripts/engineering_tools/scripts/restartdaq
stop the DAQ on DAQ is running on xcs-daq from xcs-daq
/reg/g/pcds/dist/pds/xcs/current/tools/procmgr/procmgr: using config file '/reg/g/pcds/dist/pds/xcs/scripts/p0.cnf.running' to stop
Current experiment is xcsx1015123

Where Has This Been Documented?

Updated README for behaviour of startami and stopami.

Screenshots (if appropriate):

[Image: After stopami in MFX]

[Image: After startami again, in MFX]

[Image: After stopami in XCS]

[Image: After startami again, in XCS]