ufs-community / ufs-weather-model


for coupled or datm regression tests, compare only coupler restart files and ATM forecast files #571

Closed: DeniseWorthen closed this issue 2 years ago

DeniseWorthen commented 3 years ago

Description

Currently, the coupled and datm regression tests compare the restart files for all components. For configurations that rely on the CMEPS mediator, comparing the coupler restart file should be sufficient: if any component is not bit-for-bit (B4B) identical to the baseline, the coupler restart file will not be B4B either.

This will reduce the time needed for baseline comparisons, the time required to move a new baseline into position, and the size of the baseline directories.
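For illustration, the kind of B4B check described above can be sketched as a byte-wise comparison of each listed file against its baseline copy. This is only a minimal sketch, not the regression test system's own code: the function name `b4b_check`, the directory layout and the file names are all hypothetical.

```shell
#!/usr/bin/env bash
# Sketch of a bit-for-bit (B4B) check against a baseline.
# b4b_check, the directories and the file names are hypothetical.

b4b_check() {
  local baseline_dir=$1 run_dir=$2
  shift 2
  local failed=0 f
  for f in "$@"; do
    # cmp -s is silent and returns 0 only if the files are byte-identical.
    if cmp -s "${baseline_dir}/${f}" "${run_dir}/${f}"; then
      echo "OK   ${f}"
    else
      echo "FAIL ${f}"
      failed=1
    fi
  done
  return "${failed}"
}

# Small self-contained demo using throwaway files.
demo=$(mktemp -d)
mkdir -p "${demo}/baseline" "${demo}/run"
printf 'abc' > "${demo}/baseline/cpl.r.nc"
printf 'abc' > "${demo}/run/cpl.r.nc"
printf 'abc' > "${demo}/baseline/atmf.nc"
printf 'abd' > "${demo}/run/atmf.nc"    # differs by one byte

b4b_result=0
b4b_check "${demo}/baseline" "${demo}/run" cpl.r.nc atmf.nc || b4b_result=$?
echo "exit status: ${b4b_result}"
rm -rf "${demo}"
```

The fewer files passed to such a check, the less there is to write, store and compare, which is the saving the proposal is after.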

Solution

The LIST_FILES for the coupled and datm regression tests can be edited to remove the FV3, MOM6, CICE6 and WW3 restart files from the comparison.
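As a rough sketch of the proposed change, the snippet below filters a space-separated `LIST_FILES` value so that only the coupler restart file and the ATM forecast files remain. The file names and the `*.cpl.r.*` pattern used to recognize the coupler restart are assumptions made for illustration; the real lists are defined per test and would need to be checked individually.

```shell
# Hypothetical LIST_FILES value; actual test definitions will differ.
LIST_FILES="sfcf024.nc atmf024.nc RESTART/fv_core.res.nc RESTART/MOM.res.nc RESTART/iced.example.nc RESTART/ufs.cpld.cpl.r.example.nc"

reduced=""
for f in ${LIST_FILES}; do
  case "${f}" in
    RESTART/*.cpl.r.*) reduced="${reduced} ${f}" ;;  # coupler (mediator) restart: keep
    RESTART/*)         ;;                            # other component restarts: drop
    *)                 reduced="${reduced} ${f}" ;;  # ATM forecast files: keep
  esac
done

# Strip the leading space left by the loop.
LIST_FILES="${reduced# }"
echo "${LIST_FILES}"
```

In practice the same result could be had by editing each list by hand; the loop only makes the keep/drop rule explicit.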

pjpegion commented 3 years ago

How do we confirm that some PR does not break the output in the history files?

DeniseWorthen commented 3 years ago

In the regression tests, we don't compare the history files for any coupled component except the ATM (the forecast files). We test only for reproducibility of the restart files. We could break the history files for ocn/ice now and not be aware of it.

junwang-noaa commented 3 years ago

I'd suggest keeping the comparison of both history files and restart files, as they may be written out using different methods. Comparing only history files or only restart files does not guarantee that both are written out correctly. I think we need to reduce the size of the output files instead.

junwang-noaa commented 2 years ago

@DeniseWorthen Do we still need this?

DeniseWorthen commented 2 years ago

For the mediator, ice and ocean we compare only the restart files. We do not compare any history files. For those components, we are already in the situation that any PR may unknowingly change history files.

The initial impetus for creating the issue was that the mx025 MOM6 restart files are quite large and take a long time to write. The size affects the baseline directories we maintain, and the write time affects the length of each coupled RT.

If you watch as the model finalizes, there is a long lag between when finalization starts and when it actually completes. That is when the MOM6 restarts are written.

I believe that we could safely compare only the mediator restarts as a replacement for writing and comparing MOM6 and CICE6 restart files. We no longer need the restart files in the baselines to actually run a restart test, because we now use checkpoint restarts and dependent runs for all coupled restart tests. However, this involves some compromise to our current testing philosophy.

The issue can be closed if other code managers feel the risk outweighs the benefit.

DeniseWorthen commented 2 years ago

Closing; there is no desire to implement this at this time.