dessn / Pippin

Pipeline for photometric SN analysis
MIT License

Unknown new error when submitting sim stage jobs #138

Closed by rebeccachen0 11 months ago

rebeccachen0 commented 1 year ago

Error reproduced at $PIPPIN_OUTPUT/RC_DEBUG -- I've run the same configuration previously with no error

Running submit_batch_jobs.sh on the .input file in the SIM directory submits and runs successfully, which is why I believe it's a Pippin-related issue?

OmegaLambda1998 commented 1 year ago

@RickKessler Pippin just runs submit_batch on the file produced in the 1_SIM directory, so I have no idea why submit_batch would crash when run by Pippin but not when run by itself. Do you have any ideas?

OmegaLambda1998 commented 1 year ago

The error in question:

[   ERROR |           manager.py:492]       Excerpt:    Traceback (most recent call last):
[   ERROR |           manager.py:492]       Found error in file /scratch/midway2/rkessler/PIPPIN_OUTPUT/RC_DEBUG/1_SIM/DES_P21_HOSTEFF/PIP_RC_DEBUG_DES_P21_HOSTEFF.LOG, excerpt below
[   ERROR |           manager.py:492]       Excerpt:    FATAL ERROR ABORT :
[   ERROR |           manager.py:492]       Excerpt:
[   ERROR |           manager.py:492]       Excerpt:
[   ERROR |           manager.py:492]       Excerpt:
[   ERROR |           manager.py:492]       Excerpt:       `|```````|`
[   ERROR |           manager.py:492]       Excerpt:       <| o\ /o |>
[   ERROR |           manager.py:492]       Excerpt:        | ' ; ' |
[   ERROR |           manager.py:492]       Excerpt:        |  ___  |     ABORT submit on Fatal Error.
[   ERROR |           manager.py:492]       Excerpt:        | |' '| |
[   ERROR |           manager.py:492]       Excerpt:        | `---' |
[   ERROR |           manager.py:492]       Excerpt:        \_______/
[   ERROR |           manager.py:492]       Found error in file /scratch/midway2/rkessler/PIPPIN_OUTPUT/RC_DEBUG/1_SIM/DES_P21_HOSTEFF/PIP_RC_DEBUG_DES_P21_HOSTEFF.LOG, excerpt below
[   ERROR |           manager.py:492]       Excerpt:    FATAL ERROR ABORT :
[   ERROR |           manager.py:492]       Excerpt:       Unable to find NGENTOT_RATECALC: key in SIMnorm_PIP_RC_DEBUG_DES_P21_HOSTEFF_SNIaMODEL0.LOG ;
[   ERROR |           manager.py:492]       Excerpt:       LOG created from sim normalization commands :
[   ERROR |           manager.py:492]       Excerpt:         cd /scratch/midway2/rkessler/PIPPIN_OUTPUT/RC_DEBUG/1_SIM/DES_P21_HOSTEFF/LOGS ;  \
[   ERROR |           manager.py:492]       Excerpt:         snlc_sim.exe sn_ia_salt2_bs20_des5yr.input  \
[   ERROR |           manager.py:492]       Excerpt:         INIT_ONLY 1 DNDZ POWERLAW 2.27E-5 1.7     GENMAG_OFF_GLOBAL -0.12     GENMAG_SMEAR 1e-06     GENMAG_SMEAR_MODELNAME C11     GENMAG_SMEAR_SCALE 0.0001     GENMAG_SMEAR_SCALE\(c\) 0,0     GENMODEL $DES_ROOT/SALT3training/OUT_TRAIN_SALT3_systCovar/SALT3.MODEL000+LAMEXTEND     GENPDF_FILE $DES5YR/populations/FINAL_forDES5yr/DES5YR_S3P21_GENPDF.DAT     GENPDF_OPTMASK 1     GENPEAK_SALT2ALPHA 0.145     HOSTLIB_MSKOPT 2     HOSTLIB_SCALE_PROPERTY_ERR 0.0\(LOGMASS\),0.0\(LOGSFR\),0.0\(LOGsSFR\),0.0\(COLOR\)     HOSTLIB_WGTMAP_FILE $DES_USERS/mvincenzi/MYPIPPIN/sims_instrument/WGT_maps_DESX3/DES_WGTMAP_MassSFR_Wiseman2021.HOSTLIB     OPT_MWCOLORLAW 89     OPT_MWEBV 3       PATH_USER_INPUT /scratch/midway2/rkessler/PIPPIN_OUTPUT/RC_DEBUG/1_SIM/DES_P21_HOSTEFF   \
[   ERROR |           manager.py:492]       Excerpt:         > SIMnorm_PIP_RC_DEBUG_DES_P21_HOSTEFF_SNIaMODEL0.LOG \
[   ERROR |           manager.py:492]       Excerpt:       Crashed while preparing batch jobs.
[   ERROR |           manager.py:492]       Excerpt:       Check Traceback
[   ERROR |           manager.py:492]       FAILED: SNANASimulation DES_P21_HOSTEFF task (wall time 0:00:02, 50 jobs, deps [])
[   DEBUG |            config.py:200]   Did not chown /scratch/midway2/rkessler/PIPPIN_OUTPUT/RC_DEBUG/RC_DEBUG.log

RickKessler commented 1 year ago

The SIMSED models have a KCOR_FILE mismatch:

FATAL ERROR ABORT called by read_SIMSED_TABBINARY
Binary file KCOR_FILE: '/project2/rkessler/SURVEYS/PS1MD/USERS/dscolnic/PANTHEON+/kcor/v6_1/kcor_DES_5yr_v6_1.fits'
but current KCOR_FILE: '/project2/rkessler/PRODUCTS/SNDATA_ROOT/kcor/DES/DES-SN3YR/kcor_DECam.fits'

For SIMSED models, change your KCOR files to the private ones under /PANTHEON+. When the cosmology paper goes into CWR, we will release all these files in public locations to hopefully avoid these conflicts.
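
For anyone hitting the same abort, the two conflicting paths can be pulled out of a log mechanically. This is only a minimal Python sketch: it assumes the abort message keeps the "Binary file KCOR_FILE: '...' but current KCOR_FILE: '...'" wording quoted above, and the function name `find_kcor_mismatch` is hypothetical (it is not part of SNANA or Pippin).

```python
import re

# Pattern mirrors the wording of the abort message quoted in this thread.
# Assumption: other SNANA versions may phrase the message differently.
_KCOR_RE = re.compile(
    r"Binary file KCOR_FILE:\s*'(?P<binary>[^']+)'\s*"
    r"but current KCOR_FILE:\s*'(?P<current>[^']+)'"
)


def find_kcor_mismatch(log_text):
    """Return (binary_kcor, current_kcor) if the log reports a mismatch, else None."""
    match = _KCOR_RE.search(log_text)
    if match is None:
        return None
    return match.group("binary"), match.group("current")
```

Reading the .LOG file into a string and passing it to this function shows which KCOR file the SIMSED binary was built with and which one the current config points to.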

rebeccachen0 commented 1 year ago

Adding KCOR_FILE: /project2/rkessler/SURVEYS/PS1MD/USERS/dscolnic/PANTHEON+/kcor/v6_1/kcor_DES_5yr_v6_1.fits to the config doesn't seem to fix it -- still getting the same error

RickKessler commented 1 year ago

Looking in PIP_RC_DEBUG_DES_P21_HOSTEFF.LOG,

snlc_sim.exe: error while loading shared libraries: libCore.so: cannot open shared object file: No such file or directory

What machine did you log into?
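
A "cannot open shared object file" error like the one above means the dynamic loader could not resolve a library in the environment where the executable was launched. One way to check this is to run `ldd` on the executable and look for "not found" entries. The Python sketch below does that; the helper names are hypothetical, and it assumes `ldd` is available on the machine (as on typical Linux clusters).

```python
import subprocess


def missing_libraries(ldd_output):
    """Return the names of shared libraries that ldd reports as 'not found'."""
    missing = []
    for line in ldd_output.splitlines():
        if "=> not found" in line:
            # ldd prints unresolved entries as "<name> => not found"
            missing.append(line.split("=>")[0].strip())
    return missing


def check_executable(path):
    """Run ldd on an executable and list the libraries it cannot resolve."""
    result = subprocess.run(
        ["ldd", path], capture_output=True, text=True, check=False
    )
    return missing_libraries(result.stdout)
```

Calling `check_executable` on the full path to snlc_sim.exe, once from the shell where submit_batch_jobs.sh works and once from the environment Pippin runs in, would show whether libCore.so resolves in one and not the other. If the results differ, the library search path is a likely suspect, though that is an assumption rather than a confirmed diagnosis for this issue.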

rebeccachen0 commented 1 year ago

Hm. This is all on Midway2

OmegaLambda1998 commented 1 year ago

@RickKessler @rebeccachen0 Has this progressed at all?

OmegaLambda1998 commented 11 months ago

It seems like this got resolved with some recent SNANA update