ESCOMP / CTSM

Community Terrestrial Systems Model (includes the Community Land Model of CESM)
http://www.cesm.ucar.edu/models/cesm2.0/land/

Branch for CESM3 development -- CESM3_dev #2202

Closed ekluzek closed 8 months ago

ekluzek commented 9 months ago

We decided this at our last CTSM software meeting and also brought it up at the CSEG meeting last week. This will be a fairly long-standing branch that keeps track of CESM3 development for the CESM3 coupled simulations. It will allow CESM3 to point to a CTSM tag rather than having to track source mods.

Turn on: excess ice; meier roughness
Will be up-to-date with main and ctsm 5.2
We will make sure there is the latest CTSM5.2 f09 and ne30np4.pg3 dataset in it.
Should be on ESCOMP fork
Branch name: CESM3_dev

So branch tags will be something like...

branch_tags/CESM3_dev.n01_ctsm5.1.dev143
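A minimal sketch of how a tag name of this shape could be split into its parts and rebuilt. The pattern is inferred from the single example above, so it is an assumption, and the names `BranchTag`, `parse_branch_tag`, and `format_branch_tag` are hypothetical helpers, not part of CTSM.

```python
import re
from typing import NamedTuple


class BranchTag(NamedTuple):
    branch: str    # e.g. "CESM3_dev"
    sequence: int  # e.g. 1, taken from "n01"
    base_tag: str  # e.g. "ctsm5.1.dev143"


# Assumed layout, inferred from one example: branch_tags/<branch>.n<NN>_<base tag>
_TAG_RE = re.compile(r"^branch_tags/(?P<branch>[^.]+)\.n(?P<seq>\d+)_(?P<base>.+)$")


def parse_branch_tag(tag: str) -> BranchTag:
    """Split a branch tag name into branch, sequence number, and base tag."""
    match = _TAG_RE.match(tag)
    if match is None:
        raise ValueError(f"not a recognized branch tag: {tag!r}")
    return BranchTag(match["branch"], int(match["seq"]), match["base"])


def format_branch_tag(tag: BranchTag) -> str:
    """Rebuild the tag name, zero-padding the sequence number to two digits."""
    return f"branch_tags/{tag.branch}.n{tag.sequence:02d}_{tag.base_tag}"


parsed = parse_branch_tag("branch_tags/CESM3_dev.n01_ctsm5.1.dev143")
print(parsed)
print(format_branch_tag(parsed))
```

The base tag at the end records which main-line CTSM tag the branch tag was last updated to.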

olyson commented 9 months ago

Thanks @ekluzek. I'll add a bit of information that might be useful. Currently, the CESM3 dev simulations are using an ne30 dataset:

/glade/work/slevis/git/mksurfdata_toolchain/tools/mksurfdata_esmf/surfdata_ne30np4.pg3_SSP5-8.5_78pfts_CMIP6_1850-2100_c230227.nc

SourceMods, which are related to the new surface dataset, are here (compatible with ctsm5.1.dev130):

/glade/p/cesmdata/cseg/runs/cesm2_0/b.e23_alpha16b.BLT1850.ne30_t232.053/SourceMods/src.clm/clm_varpar.F90
/glade/p/cesmdata/cseg/runs/cesm2_0/b.e23_alpha16b.BLT1850.ne30_t232.053/SourceMods/src.clm/surfrdMod.F90

And we've been using an initial file generated from a coupler history spinup:

use_init_interp = .true.
finidat='/glade/p/cgd/tss/people/oleson/CLM5_restarts/ctsm51_cesm23a14a_ne30pg3ne30pg3mg17_CPLHIST_1850pAD.clm2.r.0561-01-01-00000.nc'
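For anyone scripting this, here is a small sketch that renders the two settings above in Fortran namelist syntax, ready to paste into a namelist override file such as a case's user_nl_clm. The helpers `namelist_value` and `namelist_lines` are hypothetical and not part of CTSM or CIME.

```python
def namelist_value(value):
    """Render a Python value in Fortran namelist syntax."""
    if isinstance(value, bool):
        return ".true." if value else ".false."
    if isinstance(value, str):
        # Fortran escapes a single quote inside a string by doubling it.
        return "'" + value.replace("'", "''") + "'"
    return str(value)


def namelist_lines(settings):
    """Return one 'name = value' line per setting, in insertion order."""
    return [f"{name} = {namelist_value(value)}" for name, value in settings.items()]


# The two settings quoted in the comment above.
settings = {
    "use_init_interp": True,
    "finidat": (
        "/glade/p/cgd/tss/people/oleson/CLM5_restarts/"
        "ctsm51_cesm23a14a_ne30pg3ne30pg3mg17_CPLHIST_1850pAD.clm2.r.0561-01-01-00000.nc"
    ),
}

print("\n".join(namelist_lines(settings)))
```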

ekluzek commented 9 months ago

@olyson I've started the branch and updated it to ctsm5.1.dev143, so it is newer than the ctsm5.1.dev130 version those SourceMods are compatible with. I think that's OK, but let me know if not. It will already have those source mods in it, because it starts from the CTSM5.2 branch.

ekluzek commented 9 months ago

@olyson should we point to that initial condition file, or let the CESM3 dev case do that part?

olyson commented 9 months ago

I was thinking that it would be good to get that initial condition out of the box, since I don't think we'll necessarily be doing a new coupler history spinup anytime soon, unless it's difficult to get that out of the box in the specific compset/res the CESM3 dev simulations are using. The most recent ones are using "--compset BLT1850_v0c --res ne30pg3_t232".

ekluzek commented 8 months ago

A test case ran fine on Cheyenne. Since the CESM3 dev simulations will be run on Derecho, I started setting it up to test there as well.

SMS_Ln9.ne30pg3_t232.I1850Clm51BgcCrop.cheyenne_intel.clm-clm51cam6LndTuningMode PASS

On Derecho I submitted the following (I had to submit them by hand, as run_sys_tests wasn't working on Derecho):

ERI_D.ne30pg3_t232.I1850Clm51BgcCrop.derecho_intel.clm-clm51cam6LndTuningMode
ERP_D_Ld9.ne30pg3_t232.I1850Clm51BgcCrop.derecho_intel.clm-clm51cam6LndTuningMode
SMS_Lm1.ne30pg3_t232.I1850Clm51BgcCrop.derecho_intel.clm-clm51cam6LndTuningMode
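The test names above follow the CIME layout TESTTYPE[_MODIFIERS].GRID.COMPSET.MACHINE_COMPILER[.TESTMODS]. As a reading aid, here is a sketch that splits such a name into its fields; `SystemTestName` and `parse_test_name` are hypothetical names for illustration, not CIME's own parser.

```python
from typing import NamedTuple, Optional, Tuple


class SystemTestName(NamedTuple):
    test_type: str              # e.g. "ERP"
    modifiers: Tuple[str, ...]  # e.g. ("D", "Ld9")
    grid: str                   # e.g. "ne30pg3_t232"
    compset: str                # e.g. "I1850Clm51BgcCrop"
    machine: str                # e.g. "derecho"
    compiler: str               # e.g. "intel"
    testmods: Optional[str]     # e.g. "clm-clm51cam6LndTuningMode"


def parse_test_name(name: str) -> SystemTestName:
    """Split a full test name on '.' and then split the sub-fields on '_'."""
    parts = name.split(".")
    if len(parts) not in (4, 5):
        raise ValueError(f"expected 4 or 5 dot-separated fields in {name!r}")
    case, grid, compset, machine_compiler = parts[:4]
    testmods = parts[4] if len(parts) == 5 else None
    test_type, *modifiers = case.split("_")
    machine, sep, compiler = machine_compiler.rpartition("_")
    if not sep:
        raise ValueError(f"expected MACHINE_COMPILER, got {machine_compiler!r}")
    return SystemTestName(
        test_type, tuple(modifiers), grid, compset, machine, compiler, testmods
    )


print(parse_test_name(
    "ERP_D_Ld9.ne30pg3_t232.I1850Clm51BgcCrop.derecho_intel.clm-clm51cam6LndTuningMode"
))
```

In these names the `D` modifier requests a debug build, and the `L` modifiers set the run length (`Ld9` for 9 days, `Lm1` for 1 month, `Ln9` for 9 time steps).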

I ran into the following issue, which might also be a problem on the ctsm5.2 branch:

dec2463.hsn.de.hpc.ucar.edu 462: forrtl: severe (408): fort: (2): Subscript #1 of the array LNDMASK_GLOB has value 25385 which is greater than the upper bound of 21600
dec2463.hsn.de.hpc.ucar.edu 462:
dec2463.hsn.de.hpc.ucar.edu 462: Image              PC                Routine            Line        Source
dec2463.hsn.de.hpc.ucar.edu 462: cesm.exe           0000000000996054  lnd_set_decomp_an         546  lnd_set_decomp_and_domain.F90
dec2463.hsn.de.hpc.ucar.edu 462: cesm.exe           000000000098B832  lnd_set_decomp_an         128  lnd_set_decomp_and_domain.F90
dec2463.hsn.de.hpc.ucar.edu 462: cesm.exe           000000000095F28E  lnd_comp_nuopc_mp         645  lnd_comp_nuopc.F90
dec2463.hsn.de.hpc.ucar.edu 462: libesmf.so         0000147BD4738DA9  callVFuncPtr             2167  ESMCI_FTable.C
dec2463.hsn.de.hpc.ucar.edu 462: libesmf.so         0000147BD4737DE8  ESMCI_FTableCallE         824  ESMCI_FTable.C
dec2463.hsn.de.hpc.ucar.edu 462: libesmf.so         0000147BD4BC8B72  enter                    2318  ESMCI_VMKernel.C

Hopefully it's simple to fix...

ekluzek commented 8 months ago

Note that there are 5 tests for ne30 grids in ctsm5.1.dev143, one of which is for ne30np4.pg3:

SMS_D_Ld1.ne30pg3_t061.I1850Clm50BgcSpinup.cheyenne_intel.clm-cplhist

There might be an issue with the number of processors for the ne30np4.pg3 case that causes the error above?

ekluzek commented 8 months ago

So the abort was due to using the ctsm5.1 surface dataset, which had been left in by accident. This means we need to add a graceful abort for when the updated datasets aren't being used. I'll create an issue about that.
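To illustrate what a graceful abort could look like, here is a sketch in Python rather than the model's Fortran. It validates 1-based global indices against the mask size before any indexing, so a mismatch yields a readable message instead of a subscript-out-of-bounds trap. The function, the exception class, and the wording of the message are all hypothetical; the actual check added to CTSM may be quite different.

```python
class DatasetMismatchError(RuntimeError):
    """Raised when global indices do not fit the global land mask."""


def check_global_indices(indices, mask_size, context=""):
    """Abort with a clear message if any 1-based index exceeds the mask size."""
    bad = [i for i in indices if not 1 <= i <= mask_size]
    if bad:
        hint = f" ({context})" if context else ""
        raise DatasetMismatchError(
            f"{len(bad)} global index(es) fall outside 1..{mask_size}, "
            f"first offender: {bad[0]}{hint}. "
            "Check that the surface dataset matches the grid being run."
        )


# Values taken from the traceback above: index 25385 against an upper bound of 21600.
try:
    check_global_indices([1, 21600, 25385], 21600, context="lndmask_glob")
except DatasetMismatchError as err:
    print(f"ERROR: {err}")
```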

In the testing I found that answers with excess ice turned on differ depending on processor count, so I'll file an issue on that as well.

samsrabin commented 8 months ago

CTSM SE meeting today: @ekluzek says this should probably be closed.

wwieder commented 8 months ago

Seems like this should be closed now, by #2203.

ekluzek commented 2 months ago

Reviewing this work, it looks like I spent about 37 hours on this branch. It was important for bringing the ctsm5.2.0 datasets into cesm2_3_beta16, and the need for the branch went away with the ctsm5.2.0 tag.