ESCOMP / CMEPS

NUOPC Community Mediator for Earth Prediction Systems
https://escomp.github.io/CMEPS/
22 stars 78 forks source link

Add trigrid capability - ability to run atm and lnd on separate grids #470

Closed mvertens closed 3 months ago

mvertens commented 3 months ago

Description of changes

Adds capability to run the atm and lnd components on different grids

Specific notes

In particular see new notes at the top level of med_fraction_mod.F90 in this PR.

Contributors other than yourself, if any: @billsacks

CMEPS Issues Fixed (include github issue #):

Are changes expected to change answers? bfb as long as atm/lnd are on the same grid, more substantial otherwise (simulations are being carried out for NorESM to validate this)

Any User Interface Changes? No

Testing performed

Using a sandbox that started with cesm3_0_alpha01a and

 ccs_config at tag ccs_config_cesm0.0.109
                 share at tag share1.0.19
s                 cime cime6.0.250 67ade4a156ca53f89e118be6a272aef375d0fd08 is out of sync with .gitmodules cime6.0.246
                   mct at tag MCT_2.11.0
            mpi-serial at tag MPIserial_2.5.0
                   cam at tag cam6_3_162
                   ww3 at tag ww3i_0.0.2
                   rtm at tag rtm1_0_78
                pysect at tag 3.2.2
s               mosart mosart1.1.02 e2ffe00004cc416cfc8bcfae2a949474075c1d1f is out of sync with .gitmodules feature/cism2mosart
             mizuroute at tag cesm-coupling.n02_v2.1.2_rme02
                   fms at tag fi_240516
            parallelio at tag pio2_6_2
s                cdeps cdeps1.0.37 b4b3ac5bba287d1420657e47e4bf3ee329c4594d is out of sync with .gitmodules cdeps1.0.36
s                cmeps cmeps0.14.70 b2cb575e0ef382e7d3c5bebc87fcb423dff09ce3 is out of sync with .gitmodules feature/cism2mosart
                  cice at tag cesm_cice6_4_1_10_rme01
s                 cism cismwrap_2_1_99-43-g3433977 34339777afa7b810fd4822fce1039826668b33e4 is out of sync with .gitmodules feature/cism2mosart_cesm

Verified that the following 2 tests worked were bfb when the trigrid branch was substituted for cmeps0.14.70

jedwards4b commented 3 months ago

It looks like this PR contains your other feature/cism2mosart - that makes it difficult to review independently.

mvertens commented 3 months ago

I agree. This is why I made it a draft PR. Once the feature/cism2mosart is accepted and I also do the testing - we can remove the draft status.

billsacks commented 3 months ago

I haven't looked at the detailed changes here, but support these changes based on discussions with @mvertens . @mvertens please let me know if you'd like my more careful review of any pieces of this.

DeniseWorthen commented 3 months ago

@mvertens I'm happy to review this, but I need this PR https://github.com/ESCOMP/CMEPS/pull/452 reviewed/merged before I can get ufs working w/ the current main.

jedwards4b commented 3 months ago

@DeniseWorthen merged #452

DeniseWorthen commented 3 months ago

@jedwards4b Thanks. I'll test the trigrid PR now.

mvertens commented 3 months ago

@denise - thanks so much for reviewing this.

DeniseWorthen commented 3 months ago

@mvertens I'm getting a runtime error in med_internalstate_init at

    call NUOPC_CompAttributeGet(gcomp, name='mesh_lnd', value=lnd_mesh_name, rc=rc)
    if (ChkErr(rc,__LINE__,u_FILE_u)) return

which gives me

 PET000 med_internalstate_mod.F90:247 Invalid argument  - Passing error in return code
jedwards4b commented 3 months ago

Can you provide the entire PET log for that error please? And if you ran on derecho a pointer to the case directory. Thanks

DeniseWorthen commented 3 months ago

Here is the rundir on derecho

/glade/derecho/scratch/worthen/FV3_RT/trigrid/cpld_control_nowave_noaero_p8_intel

jedwards4b commented 3 months ago

It looks like mesh_lnd is expected as a new input variable - but there is no definition in the namelist xml files. @mvertens?

jedwards4b commented 3 months ago

@DeniseWorthen I think it should be defined in ufs.configure

DeniseWorthen commented 3 months ago

In UFS, we don't usually run land as a separate component. I wonder if we need a 'if comp_present(complnd)'.

mvertens commented 3 months ago

@jedwards4b @DeniseWorthen - sorry about that. I fixed this on betzy - but did not push it back and now betzy is down. Let me fix it here.

mvertens commented 3 months ago

@jedwards - mesh_lnd is defined in cime_config/namelist_definition_drv.xml:

That said - I need to think about how to handle this in UFS.
mvertens commented 3 months ago

@DeniseWorthen - I've just pushed a change back. Can you please try again.

DeniseWorthen commented 3 months ago

@mvertens We also don't have a mesh for ATM, so I made a fix like so

-    character(len=CL)          :: atm_mesh_name
+    character(len=CL)          :: atm_mesh_name=''

to give it a default name. This might not be the best fix.

But then I got a subsequent error in prep_atm

forrtl: severe (408): fort: (2): Subscript #1 of the array FLDNAMES_TO_OCN has value 4 which is greater than the upper bound of 3

Image              PC                Routine            Line        Source
fv3.exe            00000000112060BB  med_phases_prep_a         214  med_phases_prep_atm_mod.F90

edit: I'm doing my testing in debug. Changing len(fldnames_to_ocn) to size(fldnames_to_ocn,1) works (but why?).

mvertens commented 3 months ago

@DeniseWorthen - I've pushed back a change that hopefully addresses the problem you ran into. Could you please give your testing another try?

DeniseWorthen commented 3 months ago

@mvertens I'm still getting errors (in debug)

forrtl: severe (194): Run-Time Check Failure. The variable 'med_internalstate_mod_mp_med_internalstate_init_$ISPRESENT_LND' is being used in '/glade/work/worthen/ufs_trigrid/CMEPS-interface/CMEPS/mediator/med_internalst\
ate_mod.F90(254,41)' without being defined
mvertens commented 3 months ago

@DeniseWorthen - I'm retesting now and I'll fix this. It's unfortunate that I cannot run the UFS test - so I really appreciate your doing this!

mvertens commented 3 months ago

@DeniseWorthen - can you please test again.

DeniseWorthen commented 3 months ago

@mvertens The run-time error is now

forrtl: severe (408): fort: (2): Subscript #1 of the array FLDNAMES_FROM_OCN has value 6 which is greater than the upper bound of 5

Image              PC                Routine            Line        Source
fv3.exe            0000000011206FF3  med_phases_prep_a         215  med_phases_prep_atm_mod.F90
mvertens commented 3 months ago

@DeniseWorthen - I have pushed a fix back. Please try again.

DeniseWorthen commented 3 months ago

@mvertens Thanks. Everything is now running at least. Our GNU platforms are unavailable today, so I'll have to wait until tomorrow to do a full test. Is that OK?

mvertens commented 3 months ago

@jedwards4b - I'm not sure we were ready to merge yet. @DeniseWorthen still has to do her GNU platform test. For now let's keep it and issue a bug report if @DeniseWorthen finds problems.

jedwards4b commented 3 months ago

That was my thought too. I still have an instance of wav2ocn_smap at line 2991 of esmFldsExchange_cesm_mod.F90 though.

DeniseWorthen commented 3 months ago

Sorry for delay in testing. The platform (hercules) had an unexpectedly long downtime. I've created an associated issue #475.