E3SM-Project / E3SM

Energy Exascale Earth System Model source code. NOTE: use "maint" branches for your work. Head of master is not validated.
https://docs.e3sm.org/E3SM

Potential bug in mpas-seaice initialization module #6132

Open zhangshixuan1987 opened 11 months ago

zhangshixuan1987 commented 11 months ago

With Maint-2.1 (Hash: https://github.com/E3SM-Project/E3SM/commit/0f362f469f16b4374aaedda88977593b45fb8090), I've been trying to run an AMIP simulation to test data assimilation on the atmospheric component of E3SM. In the simulation, I modified the "eam.i" files every 6 hours and then used a "hybrid run" to let the EAM model ingest the changes I made to the "eam.i" files and integrate the model through the next cycle (i.e., a 6-hour forecast). However, the mpas-seaice model crashed with the following error in the "log.seaice.0001.err" file:

Beginning MPAS-seaice Error Log File for task       1 of      24
    Opened at 2023/12/28 03:01:12
----------------------------------------------------------------------

ERROR: Invalid DateTime string (invalid time substring) 2011-01-15_06:00:000:00:00

Because the string "2011-01-15_06:00:000:00:00" is in a strange format, I checked the settings in my namelist file "mpassi_in":

&seaice_model
 config_calendar_type = 'gregorian'
 config_dt = 7200.0
 config_num_halos = 2
 config_run_duration = '00-00-01_00:00:00'
 config_start_time = '2011-01-15_06:00:00'
 config_stop_time = 'none'

Here, "config_start_time = '2011-01-15_06:00:00'" follows the format given in the mpas-seaice user guide, so it should be correct. I then checked the code in "driver/ice_comp_mct.F" and found the following block:

        ! Setup start time. Will be over written later when clocks are synchronized
        call mpas_pool_get_config(domain % configs, "config_start_time", tempCharConfig)
        tempCharConfig = trim(tempCharConfig) // "0:00:00"

(see https://github.com/E3SM-Project/E3SM/blob/a48104d9f62055a8987a7cfd2a829b01a654fbb2/components/mpas-seaice/driver/ice_comp_mct.F#L432C1-L434C59)

Here, "0:00:00" is appended to "tempCharConfig", which holds the value of the "config_start_time" namelist variable. In my case, since I set "config_start_time = '2011-01-15_06:00:00'", the resulting time string becomes "2011-01-15_06:00:000:00:00". To me, the appended "0:00:00" is redundant and unnecessary, since the user guide clearly states that the possible values for config_start_time are ’YYYY-MM-DD HH:MM:SS’ or ’file’.

Therefore, I suspect that the above code block in "driver/ice_comp_mct.F" has a bug. The line could be changed to "tempCharConfig = trim(tempCharConfig)", or it could be commented out entirely. If this is indeed a bug, the same issue also exists in the current E3SM master branch.
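The concatenation described above can be reproduced outside the model. The following is a minimal Python sketch, not the driver code itself: `append_suffix` is a hypothetical helper that mimics the Fortran expression `trim(tempCharConfig) // "0:00:00"`.

```python
def append_suffix(start_time: str) -> str:
    """Mimic Fortran trim(x) // "0:00:00": drop trailing blanks, then append.

    Hypothetical helper for illustration only; it is not part of E3SM.
    """
    return start_time.rstrip(" ") + "0:00:00"


# A fully specified start time picks up a second, unwanted time suffix.
full_timestamp = append_suffix("2011-01-15_06:00:00")
print(full_timestamp)  # 2011-01-15_06:00:000:00:00
```

The printed string is the same one reported in the error log.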

xylar commented 11 months ago

The "blame" feature suggests that this code was changed by @jonbob in: https://github.com/E3SM-Project/E3SM/commit/843b5122c32224a31f75f63fa733a6232211e382 @jonbob, it looks like previously there was a hard-coded date and time, and you modified it to use the date (but not the time). Any memory of why it was done that way?

jonbob commented 11 months ago

@xylar -- no, I don't remember why we would have done that. But it was many years ago... I looked and it seems like we were getting start times from the setup scripts that looked like '1850-01-01_0' at that point. Anyway, I'll do some testing and come up with a solution soon

jonbob commented 11 months ago

OK, I think I understand what is happening -- and have a possible solution. First, for E3SM the mpas components get their config_start_time settings from the coupler/system using code blocks like:

if ($CONTINUE_RUN eq 'TRUE') {
        add_default($nl, 'config_start_time', 'val'=>"'file'");
} else {
        add_default($nl, 'config_start_time', 'val'=>"'${RUN_STARTDATE}_${START_TOD}'");
}

where RUN_STARTDATE and START_TOD are defined in the env_run.xml file:

  <entry id="RUN_STARTDATE" value="0001-01-01">
      <type>char</type>
      <desc>
      Run start date (yyyy-mm-dd). Only used for startup or hybrid runs.
    </desc>
    </entry>
    <entry id="START_TOD" value="0">
      <type>integer</type>
      <desc>
      Run start time-of-day
    </desc>

Since the model components need to be running with the same calendar, users are expected to make changes to the system settings and not individual component namelists, which I'm guessing is what @zhangshixuan1987 did in this case. If you change env_run.xml to look like:

    <entry id="RUN_STARTDATE" value="2011-01-15">
      <type>char</type>
      <desc>
      Run start date (yyyy-mm-dd). Only used for startup or hybrid runs.
    </desc>
    </entry>
    <entry id="START_TOD" value="6">
      <type>integer</type>
      <desc>
      Run start time-of-day
    </desc>

then the corresponding setting for MPASSI is automatically changed to:

 config_start_time = '2011-01-15_6'

which looks strange but matches its typical use in E3SM and works just fine, and it also matches the start times for all other components. So I think this is functional but a bit ugly, since it does not really match the mpas time format. In any case, users should not change the start time for individual components but rely on E3SM to keep them all consistent.
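As a rough illustration of the namelist logic quoted above, here is a Python sketch of how the value is composed. The function name `build_config_start_time` is hypothetical; the real logic lives in the build-namelist script shown earlier, and this only mirrors its two branches.

```python
def build_config_start_time(continue_run: bool, run_startdate: str, start_tod: int) -> str:
    """Mirror the two branches of the quoted namelist logic (illustrative only)."""
    if continue_run:
        # Continuation runs read the start time from the restart file.
        return "file"
    # Otherwise the date and time-of-day are joined with an underscore.
    return f"{run_startdate}_{start_tod}"


print(build_config_start_time(False, "2011-01-15", 6))  # 2011-01-15_6
print(build_config_start_time(False, "0001-01-01", 0))  # 0001-01-01_0
print(build_config_start_time(True, "2011-01-15", 6))   # file
```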

That said, we could modify the mpas components so that the ":00:00" gets appended by the scripts that build the namelists instead of in the driver Fortran files. Then the mpassi_in file would have:

 config_start_time = '2011-01-15_6:00:00'

and the code could be removed from ice_comp_mct.F

However, I'm hesitant to make NML changes to the E3SM codebase before the next tag -- unless this feels critical.

jonbob commented 10 months ago

@zhangshixuan1987 -- have you tried using the env_run.xml settings instead?

zhangshixuan1987 commented 10 months ago

@jonbob: Sorry, I missed the GitHub notification for your previous reply, resulting in a delayed response to your questions. What you described is indeed the case: if we change env_run.xml to look like:


    <entry id="RUN_STARTDATE" value="2011-01-15">
      <type>char</type>
      <desc>
      Run start date (yyyy-mm-dd). Only used for startup or hybrid runs.
    </desc>
    </entry>
    <entry id="START_TOD" value="6">
      <type>integer</type>
      <desc>
      Run start time-of-day
    </desc>

then the corresponding setting for MPASSI is automatically changed to:

config_start_time = '2011-01-15_6'

However, the issue is that this namelist variable "config_start_time" is then passed to the following section of "driver/ice_comp_mct.F":

        ! Setup start time. Will be over written later when clocks are synchronized
        call mpas_pool_get_config(domain % configs, "config_start_time", tempCharConfig)
        tempCharConfig = trim(tempCharConfig) // "0:00:00"

The result is tempCharConfig = trim(config_start_time) // "0:00:00" = '2011-01-15_6' // "0:00:00" = "2011-01-15_60:00:00" instead of "2011-01-15_06:00:00". Note that "2011-01-15_60:00:00" is not a valid time.

I believe that the current approach built into E3SM will ONLY work properly if the model restart time is 0000 UTC. In that case, env_run.xml yields config_start_time = 'yyyy-mm-dd_0', and the code block in the current driver/ice_comp_mct.F appends the "0:00:00" string, correctly producing a final time stamp in "yyyy-mm-dd_00:00:00" format. However, if the restart time is 0600 UTC, 1200 UTC, or 1800 UTC, the resulting time stamp will not be correct, triggering the issue I encountered above.
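The claim above can be checked with a small Python sketch. It assumes that the value after the underscore is read as an hour, and it uses Python's own `datetime.strptime` as a stand-in parser, which is not the MPAS time manager. `is_valid_after_append` is a hypothetical helper.

```python
from datetime import datetime

SUFFIX = "0:00:00"  # the string appended in driver/ice_comp_mct.F


def is_valid_after_append(config_start_time: str) -> bool:
    """Append the suffix, then test for a strict YYYY-MM-DD_HH:MM:SS match.

    Uses Python's parser as a stand-in; MPAS may accept or reject differently.
    """
    candidate = config_start_time.rstrip(" ") + SUFFIX
    try:
        datetime.strptime(candidate, "%Y-%m-%d_%H:%M:%S")
    except ValueError:
        return False
    return True


for value in ("2011-01-15_0", "2011-01-15_6", "2011-01-15_12", "2011-01-15_06:00:00"):
    print(value, "->", value + SUFFIX, is_valid_after_append(value))
```

Under these assumptions only the value ending in "_0" yields a well-formed time stamp.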

Please let me know if you think this is the case or whether I still have some misunderstanding.

jonbob commented 10 months ago

@zhangshixuan1987 -- I'm not sure the mpas components are working as you expect. The config_start_time in the namelist is not used by the code in e3sm. But you're also correct that setting

    <entry id="START_TOD" value="6">
      <type>integer</type>
      <desc>
      Run start time-of-day
    </desc>

will not work, but it seems to be wrong for all components. As a test I created a fully-active case and changed START_TOD to "6" and got this in the ice log:

 Initial time 0001-01-01_00:00:06
 === Completed ice_init_mct ===
 0001-01-01_00:30:06  WC time:            79.790

But all the components picked up that same initial time -- this is from the ELM log:

 dtime_sync=         1800  dtime_elm=         1800  mod =            0
 Beginning timestep   : 0001-01-01_00:00:06

So I think the START_TOD needs to be seconds, like the RUN_REFTOD setting. I ran this again with

   <entry id="START_TOD" value="21600">

and got this in the ice log:

 Applying ICE coupling dt (s) of: **
 Initial time 0001-01-01_06:00:00
 === Completed ice_init_mct ===
 0001-01-01_06:30:00  WC time:           491.320

as well as in the logs of the other components. I think this is the behavior you want?
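The two log excerpts are consistent with START_TOD being a count of seconds since midnight. A minimal Python sketch of that conversion follows. `tod_to_hms` is a hypothetical helper, and the range check is an assumption for illustration, not documented E3SM behavior.

```python
def tod_to_hms(start_tod: int) -> str:
    """Convert seconds since midnight into an HH:MM:SS string."""
    # Assumed valid range: one day of seconds (illustrative guard only).
    if not 0 <= start_tod < 86400:
        raise ValueError(f"START_TOD out of range: {start_tod}")
    hours, remainder = divmod(start_tod, 3600)
    minutes, seconds = divmod(remainder, 60)
    return f"{hours:02d}:{minutes:02d}:{seconds:02d}"


print(tod_to_hms(6))      # 00:00:06
print(tod_to_hms(21600))  # 06:00:00
```

These match the "Initial time" entries in the two ice logs: 0001-01-01_00:00:06 for a value of 6 and 0001-01-01_06:00:00 for a value of 21600.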

zhangshixuan1987 commented 10 months ago

@jonbob: I think I may have misunderstood this myself. The solution you proposed is reasonable and correct. I will adjust my strategy and follow your approach to set up the model. Thank you for patiently explaining this in response to my questions!

jonbob commented 10 months ago

Thanks @zhangshixuan1987. Please let me know if it works for you