ESCOMP / CTSM

Community Terrestrial Systems Model (includes the Community Land Model of CESM)
http://www.cesm.ucar.edu/models/cesm2.0/land/

Create new surface datasets, CTSM5.2 branch #1903

Closed wwieder closed 7 months ago

wwieder commented 1 year ago

Includes surface dataset project #1873 and #1868, as well as significant changes to mksurfdata_esmf (e.g. #1791), gross unrepresented land use transition (#309), dynamic urban datasets (#1157), and more.

The next step waits for merge of #2318. Details in this card on the project board.

slevis-lmwg commented 1 year ago

@ekluzek, you asked how much space the current list of files takes. I ran du -h in /glade/p/cesmdata/cseg/inputdata/lnd/clm2/surfdata_esmf/ctsm5.2.0 and got 727G.

wwieder commented 1 year ago

Along with this, @slevis-lmwg or @olyson, were you going to prepare a list of all the grids, resolutions, & time periods we're supporting (along with their approximate file sizes)? I need some kind of list to bring to the co-chairs for a discussion on file creation and storage.

Second question: I thought one feature of the new mksurfdata_esmf was that we'd have the ability to do mapping 'on the fly' or at runtime, especially for high resolution or variable resolution grids. Is this still a feature of the system that we should highlight?

slevis-lmwg commented 1 year ago

@wwieder @ekluzek @olyson

1) I met with @ekluzek yesterday. We updated the list of grids/periods in this file, keeping in mind that this list is still incomplete and does not include file size info: /glade/scratch/slevis/CTSM/tools/mksurfdata_esmf/mksurfdata_jobscript_multi_master
I can turn this list into a table with file sizes that @wwieder can take to the co-chairs for discussion.

2) I have generated the corresponding datasets from this list and moved them here: /glade/p/cesmdata/cseg/inputdata/lnd/clm2/surfdata_esmf/ctsm5.2.0

3) Today I met with @olyson and we discussed the logic of HIST versus SSP compsets. Our understanding, @ekluzek, is that currently SSP compsets by default start in 2015 rather than 1850. So Keith was suggesting that we generate one 1850-2000 timeseries rather than one per SSP. Is this default changing, @ekluzek, to allow for continuous simulations from 1850-2100? If so, then we do generate 1850-2100 for all the SSPs, PLUS @olyson suggested adding a separate 1850-2000 timeseries not associated with any SSP to avoid confusion.

slevis-lmwg commented 1 year ago

On the second question, the answer is no, such a feature does not exist at this time. My understanding is that for files and timeseries that take up huge amounts of disk space, we will not make the files ahead of time. Rather, they will be generated on an as-needed basis.

slevis-lmwg commented 1 year ago

Current ctsm5.2 list of surface and landuse datasets with file sizes: /glade/p/cesmdata/cseg/inputdata/lnd/clm2/surfdata_esmf/ctsm5.2.0/ls_dash_lhq_star_nc_c230530.asc
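A listing like this can be reproduced for any dataset directory with a few lines of Python. The sketch below is only an illustration and not the command used to create the .asc file above: the function names `human_size` and `list_datasets` are hypothetical, and the `*.nc` pattern is an assumption based on the listing's file name.

```python
from pathlib import Path


def human_size(num_bytes: int) -> str:
    """Format a byte count with binary units, similar in spirit to `ls -h`."""
    size = float(num_bytes)
    for unit in ("B", "K", "M", "G", "T"):
        # Stop at the first unit that fits, or at the largest unit we know.
        if size < 1024 or unit == "T":
            return f"{size:.1f}{unit}"
        size /= 1024


def list_datasets(directory: str, pattern: str = "*.nc"):
    """Return ([(file name, size in bytes), ...] largest first, total bytes)."""
    files = [
        (p.name, p.stat().st_size)
        for p in Path(directory).glob(pattern)
        if p.is_file()
    ]
    files.sort(key=lambda item: item[1], reverse=True)
    total = sum(size for _, size in files)
    return files, total
```

Calling `list_datasets` on the ctsm5.2.0 directory and printing each entry through `human_size` would give a per-file table plus a total, which is roughly the information requested for the co-chairs discussion.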

ekluzek commented 1 year ago

Let's talk about this in our software meeting tomorrow. It would be good to be on the same page about which datasets to create. Part of the reason to produce only a 1850-2100 landuse timeseries file for most resolutions is that you can use it for historical simulations, for future scenarios, AND for present-day simulations where you need data after 2015. This means one file serves those three purposes and you don't need a different file for each.

@wwieder I think the answer for you is "yes": the mapping is done on the fly in mksurfdata_esmf, whereas mksurfdata_map required you to create mapping files as a separate step before running mksurfdata_map. Hence, the CTSM5.2 branch removes the mkmapdata script from the tools directory. All that's needed to run CTSM on a new grid is a mesh file for that grid. Does that answer the question you are asking?

slevis-lmwg commented 1 year ago

On the second question, the answer is no, such a feature does not exist at this time. My understanding is that for files and timeseries that take up huge amounts of disk space, we will not make the files ahead of time. Rather, they will be generated on an as-needed basis.

@wwieder my response differs from Erik's because I assumed you meant "at runtime" as in during a CTSM simulation.

slevis-lmwg commented 1 year ago

Update after this morning's SE meeting (please let me know if you catch errors):

New ctsm5.2 list of surface and landuse datasets with file sizes: /glade/p/cesmdata/cseg/inputdata/lnd/clm2/surfdata_esmf/ctsm5.2.0/ls_dash_lhq_star_nc_c230601.asc

slevis-lmwg commented 1 year ago

@ekluzek

In the cheyenne test suite, the test ERI_D_Ld9.T31_g37.I2000Clm50Sp.cheyenne_intel.clm-SNICARFRC fails because we do not generate fsurdat for T31. Is it OK with you if I change this test to run with f10 or one of our other grids?

Answer is YES to f10: I changed the grid in testlist_clm.xml for this test.

Regarding the 4x5 grid:

Answers:

1) Keep 4x5 16-pft for FATES: I updated the testmods directory. I updated gen_mksurfdata_jobscript_multi.py and mksurfdata_jobscript_multi_master. I'm generating the 4x5 landuse file. I will need to move it to .../inputdata/...

2) For non-FATES, it's OK to change to f10: I changed the grid in testlist_clm.xml for these tests.

The test SMS_Ln9_P72x2.C96_C96_mg17.IHistClm50BgcCrop.cheyenne_intel.clm-clm50cam6LndTuningMode fails because we do not generate a landuse file for C96. We generate C96 fsurdat for 1850 and 2000. Should I be generating the landuse file, too?

Answer is YES, generate landuse timeseries for C96: I updated gen_mksurfdata_jobscript_multi.py and mksurfdata_jobscript_multi_master. I'm generating the C96 landuse file. I will need to move it to .../inputdata/...

These tests fail:

SMS_Lm13_PS.f19_g17.I2000Clm51BgcCrop.cheyenne_intel.clm-cropMonthOutput
PFS_Ld10_PS.f19_g17.I2000Clm50BgcCrop.cheyenne_intel
LII_D_Ld3_PS.f19_g17.I2000Clm50BgcCrop.cheyenne_intel.clm-default

They fail because the number of landunits has changed from what is found in the clmi file. Adding use_init_interp = .true. to the namelist should fix this. Is that acceptable, or do we need to generate a new clmi file?

Answer is YES, use_init_interp = .true. for now, and later we will generate new finidat files for these tests.
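For reference, the namelist change amounts to one line. The sketch below is a hypothetical helper, not part of CTSM: it assumes the setting goes into a user_nl_clm file (the thread does not say which file was edited), and the function name `enable_init_interp` is invented for illustration.

```python
from pathlib import Path


def enable_init_interp(namelist_path: str) -> bool:
    """Append 'use_init_interp = .true.' unless the file already sets the variable.

    Returns True if the file was modified, False if a setting was already there.
    Assumption: namelist_path points to a user_nl_clm-style file of 'name = value' lines.
    """
    path = Path(namelist_path)
    text = path.read_text() if path.exists() else ""
    for line in text.splitlines():
        # Drop trailing comments, which start with '!' in Fortran namelists.
        setting = line.split("!", 1)[0].strip()
        if setting.replace(" ", "").lower().startswith("use_init_interp="):
            return False  # Leave any existing setting untouched.
    if text and not text.endswith("\n"):
        text += "\n"
    path.write_text(text + "use_init_interp = .true.\n")
    return True
```

The helper deliberately leaves an existing use_init_interp line alone, so a test that sets the flag explicitly is not silently overridden.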

A couple more tests fail, but I will leave them for later.

slevis-lmwg commented 1 year ago

@ekluzek in today's meeting we discussed C96 for 1850-2015 because we have a test for that period, so I'm generating the landuse file. A follow-up question is whether I should generate C96 files for the SSPs, too, even though we do not have corresponding tests.

ekluzek commented 1 year ago

@slevis-lmwg I recommend that for every resolution where we create a historical landuse timeseries file, we create the SSP5-8.5 file and let users use that one file for both historical and SSP5-8.5 simulations.

We should do all of the SSP scenarios for only a limited set of resolutions: for example, f10, 1-deg and 2-deg for sure.
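The recommendation above can be written down as a small planning function. This is a sketch under stated assumptions, not the logic in gen_mksurfdata_jobscript_multi.py: the scenario list, the resolution labels, and the name `landuse_plan` are illustrative placeholders.

```python
# Illustrative inputs only: these lists are assumptions, not the project's actual settings.
ALL_SSPS = ["SSP1-2.6", "SSP2-4.5", "SSP3-7.0", "SSP5-8.5"]
FULL_SSP_RESOLUTIONS = {"f10", "1-deg", "2-deg"}  # resolutions that get every scenario
DEFAULT_SSP = "SSP5-8.5"  # single file that doubles as the historical timeseries


def landuse_plan(historical_resolutions):
    """Map each resolution to the SSP scenarios whose landuse timeseries would be built."""
    plan = {}
    for res in historical_resolutions:
        if res in FULL_SSP_RESOLUTIONS:
            plan[res] = list(ALL_SSPS)
        else:
            plan[res] = [DEFAULT_SSP]
    return plan
```

With this rule, a resolution such as C96 would get only the SSP5-8.5 file, while f10 would get one file per scenario.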

slevis-lmwg commented 1 year ago

@ekluzek I have now merged #2016, so I think you should be able to proceed with the makefile work. Let me know if you need anything from me.

slevis-lmwg commented 7 months ago

Last checkbox checked in this issue.

See https://github.com/ESCOMP/CTSM/projects/36 for project status. See https://github.com/ESCOMP/CTSM/pull/2372 for ctsm5.2.mksurfdata branch / PR status.