payu-org / payu

A workflow management tool for numerical models on the NCI computing systems
Apache License 2.0
19 stars 26 forks source link

CICE calendar issues in ESM1.5 simulations #457

Closed blimlim closed 3 weeks ago

blimlim commented 3 months ago

Payu's handling of the cice calendar is a bit confusing, especially around leap years. It would be good to clarify how the calendar is controlled, and whether we should make any changes. I'll note down here what I've understood so far - please add any clarifications, or ideas that you have!

What I've had the most trouble understanding is the inconsistencies between the cice_in.nml and input_ice.nml configuration files, and the way that cice.py and access.py in payu handle them during the setup stage.

cice.py:

cice.py starts by setting the experiment initialisation date and calendar type using control-dir/ice/cice_in.nml: https://github.com/payu-org/payu/blob/ef55e93fe23fcde19024479c0dc4112dcdf6603f/payu/models/cice.py#L159-L165

With the current version of cice_in.nml in the esm1.5 configs, this sets it to a 365 day calendar:

&setup_nml
    days_per_year  = 365
  , year_init      = 0001

It then reads istep0 and npt from restart-dir/cice_in.nml to calculate the amount of time already simulated since the initialisation date:

https://github.com/payu-org/payu/blob/ef55e93fe23fcde19024479c0dc4112dcdf6603f/payu/models/cice.py#L211-L215

and uses the result to set the new simulation's starting date. This start date, plus the runtime specified in the config.yaml file, are used to calculate the new simulation's run duration (in seconds, using a 365 day calendar).

https://github.com/payu-org/payu/blob/ef55e93fe23fcde19024479c0dc4112dcdf6603f/payu/models/cice.py#L227-L240

Finally it writes the number of time steps corresponding to this duration to npt in work-dir/ice/cice_in.nml:

https://github.com/payu-org/payu/blob/ef55e93fe23fcde19024479c0dc4112dcdf6603f/payu/models/cice.py#L242-L245

Because of the 365 day calendar setting, cice.py always sets npt to 8760. E.g. for a four year simulation starting at year 101, the values produced are:

output000/ice/cice_in.nml:    npt = 8760.0
output001/ice/cice_in.nml:    npt = 8760.0
output002/ice/cice_in.nml:    npt = 8760.0
output003/ice/cice_in.nml:    npt = 8760.0

access.py

access.py repeats similar calculations using input_ice.nml.

It first reads the calendar type and experiment initialisation date from control-dir/ice/input_ice.nml, which sets caltype = 1, corresponding (I think?) to a proleptic gregorian calendar.

Using the runtime0 and runtime values in restart-dir/ice/input_ice.nml, it calculates the start date of the new simulation

https://github.com/payu-org/payu/blob/ef55e93fe23fcde19024479c0dc4112dcdf6603f/payu/models/access.py#L138-L143

and then the run duration of the new simulation using the config.yaml settings:

https://github.com/payu-org/payu/blob/ef55e93fe23fcde19024479c0dc4112dcdf6603f/payu/models/access.py#L150-L157

all using the proleptic gregorian calendar.

The previous simulation length, new start date, and new simulation length are then written to work-dir/ice/input_ice.nml:

https://github.com/payu-org/payu/blob/ef55e93fe23fcde19024479c0dc4112dcdf6603f/payu/models/access.py#L161-L164

which takes leap years into account:

output000/ice/input_ice.nml:    runtime = 31536000
output001/ice/input_ice.nml:    runtime = 31536000
output002/ice/input_ice.nml:    runtime = 31536000
output003/ice/input_ice.nml:    runtime = 31622400

Along with the different calculations, the versions of cice_in.nml and input_ice.nml in the initial pre-industrial restarts in /g/data/vk83/experiments/inputs/access-esm1p5/pre-industrial/restart/ice don't match up

cice_in.nml
----------------
&setup_nml
istep0=0,
npt=0,
input_ice.nml
----------------
&coupling
runtime0=3155673600
runtime=0

and so cice.py does its calendar calculations using a start date of 0001-01-01, while access.py uses 0101-01-01. This doesn't seem to cause issues as cice.py uses the 365 day calendar anyway and doesn't write the calculated start date anywhere. We could edit cice_in.nml to have the correct prior run length, though that might make it more confusing as istep0 would have to be set according to a 365 day calendar, and wouldn't line up with runtime0's value which uses the proleptic Gregorian calendar.


With these inconsistencies, it would be great to understand which information the cice model is actually using during simulations.

From a quick look at the cice code, I thought it might be using npt to control the duration.

In year 104, the model stops writing to the ice.diag.d file after 365 days with diagfreq set to 1 in cice_in.nml:

istep1:     35035    idate:   1041230    sec:     68400
istep1:     35036    idate:   1041230    sec:     72000
istep1:     35037    idate:   1041230    sec:     75600
istep1:     35038    idate:   1041230    sec:     79200
istep1:     35039    idate:   1041230    sec:     82800
istep1:     35040    idate:   1041231    sec:         0

however hourly model output is saved right up to the end of the year:

ncdump -t -v time iceh_01h.0104-12.nc
...
    "0104-12-31 10:58:7.500000", "0104-12-31 12", 
    "0104-12-31 13:01:52.500000", "0104-12-31 13:58:7.500000", 
    "0104-12-31 15", "0104-12-31 16:01:52.500000", 
    "0104-12-31 16:58:7.500000", "0104-12-31 18", 
    "0104-12-31 19:01:52.500000", "0104-12-31 19:58:7.500000", 
    "0104-12-31 21", "0104-12-31 22:01:52.500000", 
    "0104-12-31 22:58:7.500000", "0105-01-01" ;

and so it does appear to find the correct 366 run length somewhere.

blimlim commented 2 months ago

I think I have a better idea now of how Cice uses the conflicting cice_in.nml and input_ice.nml namelists produced by payu.

Cice only uses the stop_now condition, which is controlled by the value of npt, for non-ACCESS simulations: https://github.com/ACCESS-NRI/cice4/blob/e7549ebd2044690a432cc67c1317c81cb194b750/drivers/access-cm/CICE_RunMod.F90#L334C1-L344C42

      timeLoop: do

         call ice_step

         istep  = istep  + 1    ! update time step counters
         istep1 = istep1 + 1
         time = time + dt       ! determine the time and date
         call calendar(time)    ! at the end of the timestep

         if (stop_now >= 1) exit timeLoop

For ACCESS simulations, I think it uses the coupling information to control the length of the ice time loop instead.

https://github.com/ACCESS-NRI/cice4/blob/e7549ebd2044690a432cc67c1317c81cb194b750/drivers/access-cm/CICE_RunMod.F90#L135-L331

#ifdef AusCOM
      time_sec = 0
      ...
      DO icpl_ai = 1, num_cpl_ai   !begin A <==> I coupling iterations

      Do icpl_io = 1, num_cpl_io   !begin I <==> O coupling iterations
      ...
        do itap = 1, num_ice_io   ! cice time loop within each i<=>o cpl interval 
      ...
           call ice_step

so for ESM1.5, the number of ice timesteps should be num_cpl_ai * num_cpl_io * num_ice_io, and these values appear to be calculated from the input_ice.nml namelist – in particular using the runtime variable set by payu's access.py

https://github.com/ACCESS-NRI/cice4/blob/e7549ebd2044690a432cc67c1317c81cb194b750/drivers/access-cm/cpl_parameters.F90#L194-L196

num_cpl_ai = runtime/dt_cpl_ai
num_cpl_io = dt_cpl_ai/dt_cpl_io
num_ice_io = dt_cpl_io/dt_cice

E.g. we'll have the total number of cice timesteps as num_cpl_ai * num_cpl_io * num_ice_io = runtime/dt_cice, and so modifying npt in cice_in.nml (as in the above comment) won't affect the simulation length, but modifying runtime in input_ice.nml will.


There's still the question of why the ice_diag.d logs stop getting lines like istep1: 35040 idate: 1041231 sec: 0 written to it after npt time steps. These lines are written in the calendar subroutine, but only when the stop_now condition is false

https://github.com/ACCESS-NRI/cice4/blob/e7549ebd2044690a432cc67c1317c81cb194b750/source/ice_calendar.F90#L379-L384

      if (my_task == master_task .and. mod(istep,diagfreq) == 0 &
                                 .and. stop_now /= 1) then
        write(nu_diag,*) ' '
        write(nu_diag,'(a7,i10,4x,a6,i10,4x,a4,i10)') &
             'istep1:', istep1, 'idate:', idate, 'sec:', sec
      endif

so although the simulation length is controlled by runtime, npt controls how long these lines are written for.

blimlim commented 3 weeks ago

Closing this issue following #484