ufs-community / ufs-weather-model

UFS Weather Model

regional_atmaq_debug_intel is failing due to time limit on Hera #2377

Open FernandoAndrade-NOAA opened 1 month ago

FernandoAndrade-NOAA commented 1 month ago

Description

The regional_atmaq_debug_intel test is now consistently failing because it exceeds its time limit. Increasing the limit from 45 minutes to 1 hour has proven insufficient on Hera.

To Reproduce:

Additional context

Output

   0: AQM: Advancing from 2019-08-01T12:18:00 to 2019-08-01T12:21:00
   0:
   0:
   0:      Timestep written to CTM_DRY_DEP_1    for date and time  2019213:122100
   0:
   0:      Timestep written to CTM_WET_DEP_1    for date and time  2019213:122100
   0:         Processor 0000 is in darkness at  2019213:121800 GMT - no photolysis
   0:
   0:      Timestep written to CTM_VIS_1        for date and time  2019213:122100
   0:
   0:      Timestep written to CTM_PMDIAG_1     for date and time  2019213:122100
   0:
   0:      Timestep written to CTM_AOD_1        for date and time  2019213:122100
   0:
   0:      Timestep written to CTM_AVIS_1       for date and time  2019213:122100
   0:  in atmos_model update, fhzero=   6.00000000000000      fhour=  0.3500000
   0:   0.0000000E+00
   0: PASS: fcstRUN phase 2, n_atmsteps =                6 time is         6.237005
_______________________________________________________________
FernandoAndrade-NOAA commented 1 month ago

@BrianCurtis-NOAA @zach1221 @jkbk2004 FYI

climbfuji commented 1 month ago

I noticed this in my chunked-arrays PR as well

zach1221 commented 1 month ago

@FernandoAndrade-NOAA I tested this again a few times on Hera and didn't have any issues with timeouts. Can you try running this case again?

FernandoAndrade-NOAA commented 1 month ago

I'm no longer seeing this test fail on Hera with the latest develop. It may have been a longer-than-usual degradation in performance.
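
One way to tell a transient slowdown from a real regression is to compare the per-step timings that the model log already reports on its `PASS: fcstRUN phase 2` lines. The following is only a minimal sketch. It assumes the `time is` value is wall-clock seconds for that step, which this thread does not confirm. The function names and the `total_steps` value are hypothetical and are not part of the regression test framework.

```python
import re
from statistics import mean

# Matches e.g. "PASS: fcstRUN phase 2, n_atmsteps =    6 time is    6.237005",
# with or without a leading rank or line-number prefix.
PHASE2 = re.compile(
    r"PASS: fcstRUN phase 2, n_atmsteps =\s*(\d+)\s+time is\s+"
    r"([0-9.]+(?:[eE][+-]?\d+)?)"
)


def phase2_times(log_text):
    """Return (n_atmsteps, time) pairs for every phase-2 line in the log."""
    return [(int(m.group(1)), float(m.group(2)))
            for m in PHASE2.finditer(log_text)]


def projected_minutes(times, total_steps):
    """Project run time in minutes.

    ASSUMPTION: `time` is seconds per step, and the remaining steps cost
    about the same as the mean of the steps observed so far.
    """
    per_step = mean(t for _, t in times)
    return per_step * total_steps / 60.0


if __name__ == "__main__":
    sample = ("   0: PASS: fcstRUN phase 2, n_atmsteps ="
              "                6 time is         6.237005\n")
    times = phase2_times(sample)
    print(times)
    # total_steps=120 is a made-up illustration, not the test's real step count.
    print(f"{projected_minutes(times, 120):.1f} min")
```

Running this against the logs of a passing run and a timed-out run would show whether every step is slower or only a few are, which is the distinction between a general performance degradation and a stall.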