ucgmsim / slurm_gm_workflow

Porting the GM workflow to run on new NeSI HPC (Maintainer: Jonney)
MIT License

Fix fault/event name extraction in presence of many _s #530

Closed: sungeunbae closed this 2 weeks ago

sungeunbae commented 3 weeks ago

@capajaro had a case where the INSTALL_REALISATION step correctly produced sim_params.yaml but was consistently marked as failed.

I noticed that his event names contain several underscores. For example, one of his median events (his events have no other realisations) is named 3366146_Atzori_et_al_2012_Multi

When a job is submitted,

sbatch  --export=CUR_ENV,CUR_HPC -M mahuika  /scale_wlg_persistent/filesets/project/nesi00213/Environments/cesar/workflow/workflow/automation/org/nesi/install_realisation.sl 3366146_Atzori_et_al_2012_Multi /scale_wlg_nobackup/filesets/nobackup/nesi00213/RunFolder/cpa148/200m_Sel_Event_Specific_Multi

the .sl script reads two positional arguments.

 13 REL_NAME=${1:?REL_NAME argument missing}
 14 SIMULATION_ROOT=${2:?SIMULATION_ROOT argument missing}
 15 
 16 FAULT=$(echo $REL_NAME | cut -d"_" -f1)
 17 SIM_DIR=$SIMULATION_ROOT/Runs/$FAULT/$REL_NAME
 18 CH_LOG_FFP=$SIM_DIR/ch_log
 19 

which are

REL_NAME = 3366146_Atzori_et_al_2012_Multi
SIMULATION_ROOT = /scale_wlg_nobackup/filesets/nobackup/nesi00213/RunFolder/cpa148/200m_Sel_Event_Specific_Multi

Because line 16 keeps only the text before the first underscore, FAULT=3366146, whereas we expect it to be 3366146_Atzori_et_al_2012_Multi

As a result,

SIM_DIR = /scale_wlg_nobackup/filesets/nobackup/nesi00213/RunFolder/cpa148/200m_Sel_Event_Specific_Multi/Runs/3366146/3366146_Atzori_et_al_2012_Multi

and the script cannot find $SIM_DIR/sim_params.yaml there, hence this step is marked as "failed".
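
The truncation can be reproduced outside the workflow with a few lines of shell. This is only a minimal sketch: SIMULATION_ROOT is set to a hypothetical placeholder path (/tmp/example_root) rather than the real run folder, and nothing is read from or written to disk.

```shell
#!/usr/bin/env bash
# Reproduce the FAULT truncation caused by line 16 of install_realisation.sl.
REL_NAME="3366146_Atzori_et_al_2012_Multi"
SIMULATION_ROOT="/tmp/example_root"   # hypothetical placeholder, not the real run folder

# Same expression as line 16: keep only the first "_"-delimited field.
FAULT=$(echo "$REL_NAME" | cut -d"_" -f1)
SIM_DIR="$SIMULATION_ROOT/Runs/$FAULT/$REL_NAME"

echo "FAULT=$FAULT"       # prints FAULT=3366146
echo "SIM_DIR=$SIM_DIR"   # the Runs/<FAULT> component is the truncated name
```

Because cut -f1 returns everything before the first delimiter, any event name containing an underscore is shortened this way, not just this one.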

This PR allows the event/fault name to contain underscores. Instead of splitting on the first underscore, it strips the _REL?? suffix from the realisation name to obtain the event/fault name.
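
One way to express the stripping described above is sketched here. It is not necessarily the code merged in this PR: the helper name fault_from_rel_name is hypothetical, and the pattern assumes that "_REL??" means a trailing "_REL" followed by digits.

```shell
#!/usr/bin/env bash
# Hypothetical helper: derive the event/fault name from a realisation name
# by removing a trailing _REL<digits> suffix (assumed form of "_REL??").
fault_from_rel_name() {
    echo "$1" | sed -E 's/_REL[0-9]+$//'
}

# Median event with no suffix: the name is returned unchanged.
fault_from_rel_name "3366146_Atzori_et_al_2012_Multi"

# Hypothetical realisation name: only the suffix is removed.
fault_from_rel_name "3366146_Atzori_et_al_2012_Multi_REL01"
```

Both calls print 3366146_Atzori_et_al_2012_Multi. Anchoring the pattern at the end of the string leaves underscores elsewhere in the name untouched.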