CICE-Consortium / Icepack

Development repository for sea-ice column physics
Other
25 stars 131 forks source link

driver: icedrv_restart: Added option for restart files to be read/written in NetCDF format #427

Closed davidclemenssewall closed 1 year ago

davidclemenssewall commented 1 year ago

For detailed information about submitting Pull Requests (PRs) to the CICE-Consortium, please refer to: https://github.com/CICE-Consortium/About-Us/wiki/Resource-Index#information-for-developers

PR checklist

apcraig commented 1 year ago

This looks interesting. Several comments.

Good to see CPP for USE_NETCDF and appropriate aborts.

davidclemenssewall commented 1 year ago

Thanks @apcraig! Comments in line below.

This looks interesting. Several comments.

* warnstr and icepack_warnings_add should not be in icepack_intfc.F90.  Errors in the icepack driver should be written directly to the icepack driver output unit number (normally nu_diag).

Ah gotcha. Would you mind pointing me to a spot in the code where a warning is added like that? I.e., where we just provide a warning not an abort. Alternatively I'm open to this error causing an abort since the user did ask for a restart file that wasn't written (even if the code itself otherwise ran correctly).

* I am a little concerned that bgc is not included here.  If it's not possible to include that now, we should have a plan to update that soon.

I agree that it would be good to include bgc. I'm not sure that I can justify spending time on it to my PI because we don't need bgc for the project that's funding me but perhaps if it's quick to do.

* This seems to include some of the changes to the netcdf time axis.  We'll need to be careful not to redo/undo/conflict with [driver: icedrv_history: fix start time for NetCDF history output #426](https://github.com/CICE-Consortium/Icepack/pull/426).  I assume you created this branch from that branch?  We'll want to merge [driver: icedrv_history: fix start time for NetCDF history output #426](https://github.com/CICE-Consortium/Icepack/pull/426) first, then have you update your branch and fix/remove the time axis changes in this PR.

Correct. This is from the branch in pull request #426. I figured I would send this pull request now so we can start talking about what needs to be fixed in it. Then once #426 is merged into the consortium main I'll merge it into this branch (before taking car

* In icedrv_restart, the binary and netcdf writing of fields is not done the same way.  For binary, subroutines are called like write_restart_age, write_restart_FY, etc.  For netcdf, you have inlined the fields that are read/written in those other subroutines.  This is dangerous as the "same" implementation is now split into two places and is more likely to diverge.  The extra subroutines should be called in both cases and those subroutines need to be modified to support both binary and netcdf in the same way as aicen, vicen, etc.

Your concern makes sense. I'll take a look at how to do this efficiently.

* I wonder if we should change history_cdf to history_format?  Or maybe we should just have a logical restart_cdf.  I think it would be nice if the namelist to control history and restart format be the same.

I agree that it would be nice if they were the same. CICE uses restart_format and history_format so I would vote for changing history_cdf to history_format.

* We should think about including the new test suite in io_suite.ts and I think we may want to have a test that has history=netcdf with restart=binary and history=binary with restart=netcdf.

That sounds good to me. Do we already have tests that validate history output? Or would that test just verify that icepack completes a run with history output indicated?

Good to see CPP for USE_NETCDF and appropriate aborts.

eclare108213 commented 1 year ago

I would also vote for using the same namelist flags and options/values for history and restart as in CICE. Less confusing.

apcraig commented 1 year ago

If you look in icedrv_init.F90, you'll see various examples of writing and/or aborting directly.

         call icedrv_system_abort(string=subname//'ERROR: open file '// &
            trim(nml_filename), &
            file=__FILE__, line=__LINE__)
      if (ncat == 1 .and. kitd == 1) then
         write (nu_diag,*) 'Remapping the ITD is not allowed for ncat=1.'
         write (nu_diag,*) 'Use kitd = 0 (delta function ITD) with kcatbound = 0'
         write (nu_diag,*) 'or for column configurations use kcatbound = -1'
         call icedrv_system_abort(file=__FILE__,line=__LINE__)
      endif
         write (nu_diag,*) 'WARNING: ITD required for ncat > 1'
         write (nu_diag,*) 'WARNING: Setting kitd and kcatbound to default values'

Depending where you are in the code and what the error is, you may want to write to either nu_diag or nu_diag+n-1. The latter is to write to each column's output file. The former is the general output file.

Lets change the namelist to history_format and restart_format, are you able to do that?

We do not have tests that formally validate history output. We just check the model completes fine.

Thanks.

davidclemenssewall commented 1 year ago

Thank you for the examples. I've fixed the warnings, merged in the updated consortium: main, changed history_cdf to history_format, and updated the documentation. I'm about halfway through reorganizing icedrv_restart as you suggested. I should be able to finish that sometime tomorrow.

Do you want me to move the new test suite into io_suite.ts and add in the mixed history/restart format options?

dabail10 commented 1 year ago

The compile is failing here. The variable nchar needs to be declared in the subroutine restartfile.

dabail10 commented 1 year ago

Actually, it's in several other subroutines. It should probably be a module variable.

apcraig commented 1 year ago

With regard to testing, this is what I'd do. set_env.ionetcdf can stay as is. Rename set_nml.ionetcdf to set_nml.histcdf and it would be just "history_format = 'nc'". Then add set_nml.restcdf which would be just "restart_format = 'nc'".

The I would remove netcdf_nobgc_suite.ts and then add the following to io_suite.ts,

restart        col     1x1        debug,ionetcdf,histcdf
smoke          col     1x1        run1year,diag1,ionetcdf,histcdf
restart        col     1x1        debug,ionetcdf,restcdf
smoke          col     1x1        run1year,diag1,ionetcdf,restcdf
restart        col     1x1        debug,ionetcdf,histcdf,restcdf
smoke          col     1x1        run1year,diag1,ionetcdf,histcdf,restcdf

I think this is adequate but if you also want to add a couple other cases to test exact restart of other physics (like netcdf_nobgc_suite.ts does), that's fine too. In general, we can't test every option with each other as we'd end up with millions of tests. So in this case, we're testing the default physics with various combinations of binary and netcdf history and restart to cover this feature.

By the way, noticed in the current PR, you still need to add restart_format to ug_case_settings.rst and icepack_index.rst. Didn't do a thorough review yet, just noticed that so I'll mention it. Thanks!

davidclemenssewall commented 1 year ago

Sounds good. I have finished reorganizing the code, changing the namelists and tests, etc. This most recent version passes all tests in base_suite, io_suite, the nobgc netcdf restart suite (which I've now removed), and the CICE quick_suite. I think that I've updated the documentation in the correct places, but let me know if I've missed anything there. I'll be out of the office for the next two weeks so if any further fixes are needed I'll get to them after then. Thanks!

davidclemenssewall commented 1 year ago

The automated testing is failing with this error: ./icepack.setup: ERROR, /Users/runner/icepack/configuration/scripts/options/set_[nml,env].restcdf# not found

Which is confusing to me because set_nml.restcdf is present (and worked in my ubuntu container).

apcraig commented 1 year ago

I think the CI is failing because you may need a return character at the end of the io_suite.ts file. It looks like the CI env is adding a # there for reasons I don't understand. But I suspect returning to a new line will fix that.

The other thing I suggest is setting the default value of history_format to "none". Also, please add checks in icedrv_init.F90 to check for valid values for history_format and restart_format. Those valid values should be "['none','cdf'] and ['bin','cdf'] at this point I believe. Finally, the documentation of these two namelist variables in doc/source/user_guide/ug_case_settings.rst should follow the style for namelist with limited options,

   "``history_format``", "``cdf``", "history file output in netcdf format", "``none``"
   "","``none``","no history output",""

   "``restart_format``", "``bin``", "restart file output in binary format", "``bin``"
   "","``cdf``","restart file output in netcdf format",""

I will run some tests now just to check things are working as expected too. Thanks!

apcraig commented 1 year ago

I created a PR, https://github.com/davidclemenssewall/Icepack/pull/1, based on testing on cheyenne.

apcraig commented 1 year ago

@davidclemenssewall, have you had a chance to review my proposed changes? It would be good to get this PR merged. Thanks.

davidclemenssewall commented 1 year ago

@apcraig sorry for my delay. I'm still digging out from stuff that built up during the mosaic conference and the polar climate working group. All of the changes look good and thank you for fixing my errors (I just started learning fortran recently). I also checked that the base_suite and io_suite passed all tests on Ubuntu.

apcraig commented 1 year ago

@davidclemenssewall, no problem. just trying to keep things moving along. I will run a final test myself then will merge. Thanks for all your efforts, great to have new people contributing.