guoqing-noaa closed 1 month ago
CI Update on Wcoss2 at 05/29/24 06:28:14 PM
============================================
Cloning and Building global-workflow PR: 2627
with PID: 254221 on host: clogin01
Automated global-workflow Testing Results:
Machine: Wcoss2
Start: Wed May 29 18:50:37 UTC 2024 on clogin01
---------------------------------------------------
Build: Completed at 05/29/24 07:03:35 PM
Case setup: Completed for experiment C48_ATM_2aad9b3e
Case setup: Skipped for experiment C48mx500_3DVarAOWCDA_2aad9b3e
Case setup: Skipped for experiment C48_S2SWA_gefs_2aad9b3e
Case setup: Completed for experiment C48_S2SW_2aad9b3e
Case setup: Completed for experiment C96_atm3DVar_extended_2aad9b3e
Case setup: Skipped for experiment C96_atm3DVar_2aad9b3e
Case setup: Skipped for experiment C96_atmaerosnowDA_2aad9b3e
Case setup: Completed for experiment C96C48_hybatmDA_2aad9b3e
Case setup: Skipped for experiment C96C48_ufs_hybatmDA_2aad9b3e
Experiment C48_ATM FAILED on Hercules with error logs:
/work2/noaa/stmp/CI/HERCULES/2627/RUNTESTS/COMROOT/C48_ATM_2aad9b3e/logs/2021032312/gfsatmos_prod_f009-f015.log
Follow link here to view the contents of the above file(s): (link)
Experiment C48_ATM FAILED on Hercules in
/work2/noaa/stmp/CI/HERCULES/2627/RUNTESTS/C48_ATM_2aad9b3e
The log file looks like it was cut off mid-execution, with not even a SIGTERM. The part we have looks fine. We may just need to retry on Hercules.
Log on disk is complete. Ends like this:
End interp_atmos_sflux.sh at 19:42:42 with error code 0 (time elapsed: 00:00:01)
+ exglobal_atmos_products.sh[190]: export err=0
+ exglobal_atmos_products.sh[190]: err=0
+ exglobal_atmos_products.sh[190]: err_chk
completed cleanly
+ exglobal_atmos_products.sh[193]: IFS=:
+ exglobal_atmos_products.sh[193]: read -ra grids
+ exglobal_atmos_products.sh[194]: for grid in "${grids[@]}"
+ exglobal_atmos_products.sh[195]: prod_dir=COM_ATMOS_GRIB_1p00
+ exglobal_atmos_products.sh[196]: /bin/cp -p sflux_f015_1p00 /work2/noaa/stmp/CI/HERCULES/2627/RUNTESTS/COMROOT/C48_ATM_2aad9b3e/gfs.20210323/12//products/atmos/grib2/1p00/gfs.t12z.flux.1p00.f015
+ exglobal_atmos_products.sh[197]: wgrib2 -s sflux_f015_1p00
+ exglobal_atmos_products.sh[202]: [[ YES == \Y\E\S ]]
+ exglobal_atmos_products.sh[203]: grp=
+ exglobal_atmos_products.sh[204]: (( FORECAST_HOUR > 0 & FORECAST_HOUR <= FHMAX_WGNE ))
+ exglobal_atmos_products.sh[206]: wgrib2 /work2/noaa/stmp/CI/HERCULES/2627/RUNTESTS/COMROOT/C48_ATM_2aad9b3e/gfs.20210323/12//products/atmos/grib2/0p25/gfs.t12z.pgrb2.0p25.f015 -d 597 -grib /work2/noaa/stmp/CI/HERCULES/2627/RUNTESTS/COMROOT/C48_ATM_2aad9b3e/gfs.20210323/12//products/atmos/grib2/0p25/gfs.t12z.wgne.f015
*** FATAL ERROR: record 597 not found ***
+ exglobal_atmos_products.sh[1]: postamble exglobal_atmos_products.sh 1717011684 8
+ preamble.sh[70]: set +x
End exglobal_atmos_products.sh at 19:42:42 with error code 8 (time elapsed: 00:01:18)
+ JGLOBAL_ATMOS_PRODUCTS[1]: postamble JGLOBAL_ATMOS_PRODUCTS 1717011674 8
+ preamble.sh[70]: set +x
End JGLOBAL_ATMOS_PRODUCTS at 19:42:42 with error code 8 (time elapsed: 00:01:28)
+ atmos_products.sh[1]: postamble atmos_products.sh 1717011391 8
+ preamble.sh[70]: set +x
End atmos_products.sh at 19:42:43 with error code 8 (time elapsed: 00:06:12)
Doesn't look like anything that would be related to this PR.
I'm separately getting errors from cron, so I think Hercules is just having some issues right now.
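For reference, the failing step in the log is the `wgrib2 ... -d 597 -grib ...` call, which aborts when record 597 is absent from the 0p25 file. A minimal sketch of a pre-check follows, assuming the standard `wgrib2 -s` inventory format in which each line begins with the record number followed by a colon. The helper name `record_in_inventory` is hypothetical and is not part of `exglobal_atmos_products.sh`.

```shell
#!/usr/bin/env bash
# Sketch only: record_in_inventory is a hypothetical helper, not workflow code.
# It reads a wgrib2 short inventory on stdin and succeeds if the requested
# record number is present (submessages such as "597.1:" also count).
record_in_inventory() {
  local rec="$1"
  # Output is redirected instead of using grep -q so the whole inventory is
  # consumed and the upstream command is not cut off by a closed pipe.
  grep -E "^${rec}(\.[0-9]+)?:" > /dev/null
}

# Possible usage (requires wgrib2; the variable name grib_file is illustrative):
#   if wgrib2 -s "${grib_file}" | record_in_inventory 597; then
#     wgrib2 "${grib_file}" -d 597 -grib "${wgne_file}"
#   else
#     echo "record 597 missing from ${grib_file}" >&2
#   fi
```

A check like this would not fix a truncated or incomplete input file, but it would make that condition visible before the extraction step fails.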
Experiment C48_ATM_2aad9b3e SUCCESS on Wcoss2 at 05/29/24 09:42:12 PM
Experiment C48_S2SW_2aad9b3e SUCCESS on Wcoss2 at 05/29/24 09:48:13 PM
Experiment C96C48_hybatmDA_2aad9b3e SUCCESS on Wcoss2 at 05/29/24 10:27:19 PM
CI Passed Hera at
Built and ran in directory /scratch1/NCEPDEV/global/CI/2627
CI Passed Orion at
Built and ran in directory /work2/noaa/stmp/CI/ORION/2627
Experiment C96_atm3DVar_extended_2aad9b3e SUCCESS on Wcoss2 at 05/30/24 04:18:29 AM
All CI Test Cases Passed on Wcoss2:
Experiment C48_ATM_2aad9b3e *** SUCCESS *** at 05/29/24 09:42:12 PM
Experiment C48_S2SW_2aad9b3e *** SUCCESS *** at 05/29/24 09:48:13 PM
Experiment C96C48_hybatmDA_2aad9b3e *** SUCCESS *** at 05/29/24 10:27:19 PM
Experiment C96_atm3DVar_extended_2aad9b3e *** SUCCESS *** at 05/30/24 04:18:29 AM
CI Passed Hercules at
Built and ran in directory /work2/noaa/stmp/CI/HERCULES/2627
Thanks, @WalterKolczynski-NOAA @aerorahul @DavidHuber-NOAA
Description
Add the capability to use Slurm reservation nodes.
Add "ACCOUNT_SERVICE" for jobs to run in PARTITION_SERVICE.
Resolves #2626
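As a rough illustration of what the description implies at the scheduler level, the sketch below prints Slurm batch directives for a service-partition job. `ACCOUNT_SERVICE` and `PARTITION_SERVICE` are the names from the description. `RESERVATION` and the function name `service_directives` are assumptions made for this example. The PR itself may wire these settings through the workflow's own job-generation layer rather than plain `#SBATCH` lines.

```shell
#!/usr/bin/env bash
# Sketch only: prints Slurm directives for a job in the service partition.
# ACCOUNT_SERVICE and PARTITION_SERVICE come from the PR description;
# RESERVATION is a hypothetical variable name used for illustration.
service_directives() {
  echo "#SBATCH --account=${ACCOUNT_SERVICE}"
  echo "#SBATCH --partition=${PARTITION_SERVICE}"
  # Only request a reservation when one has been configured.
  if [[ -n "${RESERVATION:-}" ]]; then
    echo "#SBATCH --reservation=${RESERVATION}"
  fi
}
```

With `RESERVATION` unset or empty, the function emits only the account and partition lines, so hosts without a reservation are unaffected.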