E3SM-Project / ACME-ECP

E3SM MMF for DoE ECP project

srf_flux_avg only works if enabled at nstep=0 #85

Open whannah1 opened 5 years ago

whannah1 commented 5 years ago

Currently we rely on temporal surface flux smoothing (srf_flux_avg=1) for stability when using the CRM, and we don't have an adequate explanation for why it is needed. Some bug fixes improved the situation, allowing runs to go farther without this smoothing, but the problem persists, so we need to make sure this option is working as we intend.

Some recent GPU runs got 6 months in before crashing with negative layer thickness errors, which I realized happened because I had accidentally turned this option off. However, when I tried to correct the issue by setting srf_flux_avg=1, the runs would not restart. I don't want to start these runs over from the beginning and lose those 6 months of data.

I'm pretty sure this is because the pbuf variables that hold the surface flux residuals were never properly initialized. I'm not sure how to fix this properly, since we don't want to reinitialize these variables each time the model starts. It would be nice to develop a smart way of checking whether surface flux averaging was on in previous submissions or not.
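
To make the idea concrete, here is a rough sketch of the kind of check I have in mind. It is written in Python rather than the model's Fortran, and everything in it is hypothetical: the restart "file" is just a dict, and the names `srf_flux_avg_prev`, `shf_res`, and `lhf_res` are made up for illustration, not actual pbuf or restart fields. It also assumes that starting the residuals at zero is acceptable when smoothing is switched on mid-run, which would need to be confirmed.

```python
# Hypothetical residual field names; the real pbuf fields may differ.
RESIDUAL_NAMES = ("shf_res", "lhf_res")


def init_flux_residuals(restart, srf_flux_avg, ncol):
    """Pick residual values for this submission.

    Residuals are carried over only when smoothing is on now, was on in the
    previous submission, and the residual fields are present in the restart
    data. In every other case they start from zero (an assumption).
    """
    was_on = restart.get("srf_flux_avg_prev", 0) == 1
    have_all = all(name in restart for name in RESIDUAL_NAMES)
    if srf_flux_avg == 1 and was_on and have_all:
        # Smoothing was already active: keep the stored residuals.
        return {name: list(restart[name]) for name in RESIDUAL_NAMES}
    # First submission, or smoothing newly enabled: initialize once.
    return {name: [0.0] * ncol for name in RESIDUAL_NAMES}


def write_restart(restart, residuals, srf_flux_avg):
    """Record the residuals and whether smoothing was on, for the next submission."""
    restart["srf_flux_avg_prev"] = srf_flux_avg
    for name in RESIDUAL_NAMES:
        restart[name] = list(residuals[name])


# Example: a run that had smoothing off, then turns it on at a restart.
restart = {"srf_flux_avg_prev": 0}
residuals = init_flux_residuals(restart, srf_flux_avg=1, ncol=3)  # zeros
residuals["shf_res"][0] = 2.5           # the model updates residuals as it runs
write_restart(restart, residuals, srf_flux_avg=1)
carried = init_flux_residuals(restart, srf_flux_avg=1, ncol=3)    # carried over
```

The point of storing the flag alongside the residuals is that the decision to initialize no longer depends on nstep=0, only on whether the previous submission left valid residuals behind.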