Closed crjones-amath closed 5 years ago
I'm not adept with the history/restart parts of the code. Pinging @singhbalwinder to request guidance on this.
Update: I've verified that moving the add_hist_coord(crm_*,...) calls to crm_physics_register solves our restart problem for the early science branch. The fix is in crjones/crm/restart_fix (relevant diffs here). I will issue a PR with these changes in a few days.
It's great that you have found a fix for this issue. Your fix makes sense to me. Do you know why we need to move these lines from cam_diagnostics? One reason could be that this section of the code is never executed for CRM configurations. Another could be that these lines are executed, but not in the desired sequence.
Recent early science runs failed to restart with the following error:
ERROR: set_field_dimensions: mdim size must be > 0
This failure mimics that described in https://github.com/E3SM-Project/E3SM/issues/833. This error has been reproduced on summit at both ne120 and ne4 resolutions when history output includes crm-level output fields with dimensions crm_nx, crm_nx_rad, etc.
Hypothesis and possible solution
I believe this is related to the crm-specific coordinates not being found when trying to write to history files on restarts.
I think this can be solved by either changing some of the crm pbuf_add_field calls to point to global instead of physpkg (<-- that also fails). Alternatively, the COSP error was solved by moving the add_hist_coord calls to phys_register.
If possible, it would be good if the fix would allow us to restart the current ne120 Early Science run without needing to start fresh.
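To make the hypothesis concrete, here is a toy Python model of the suspected failure mode. It is not E3SM code: HistoryCoords, start_run, the register_early flag, and the size 64 are all hypothetical names and values chosen for illustration. It also assumes that the coordinate registration sits in a code path that is skipped (or runs too late) on a restart, which has not been confirmed.

```python
# Toy model only: none of these names are real E3SM/CAM routines.

class HistoryCoords:
    """Minimal stand-in for a registry of history-file coordinates."""

    def __init__(self):
        self._sizes = {}

    def add_coord(self, name, size):
        """Record a named coordinate and its length."""
        self._sizes[name] = size

    def field_mdim_size(self, name):
        """Return the coordinate length, failing if it was never registered."""
        # An unregistered coordinate is treated like a dimension of size 0.
        size = self._sizes.get(name, 0)
        if size <= 0:
            raise RuntimeError("set_field_dimensions: mdim size must be > 0")
        return size


def start_run(is_restart, register_early):
    """Return the crm_nx size seen when history fields are set up."""
    coords = HistoryCoords()
    if register_early:
        # Registration phase: assumed to run on every start, restarts included.
        coords.add_coord("crm_nx", 64)  # 64 is an arbitrary example size
    elif not is_restart:
        # Diagnostics init: assumed to be skipped on a restart.
        coords.add_coord("crm_nx", 64)
    return coords.field_mdim_size("crm_nx")
```

In this sketch, start_run(is_restart=True, register_early=False) raises the mdim error, while registering the coordinate in the early phase succeeds on both initial and restart runs. That is the behavior the proposed move of the add_hist_coord calls is meant to achieve.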