E3SM-Ocean-Discussion / E3SM

Ocean discussion repository, for ocean issues and longer-term pull requests for E3SM source code. Please make pull requests that are ready to merge into https://github.com/E3SM-Project/E3SM
https://e3sm.org

Add DISMF to maint-2.1 #92

Closed xylar closed 7 months ago

xylar commented 7 months ago

This ports https://github.com/E3SM-Project/E3SM/pull/5698 to maint-2.1

xylar commented 7 months ago

@darincomeau, I did the cherry-pick of the relevant DISMF commits on maint-2.1. I haven't tested this yet, however. I figured you could at least look over the code and see how it compares with what you had. I'll test as soon as I can but maybe not until next week.

darincomeau commented 7 months ago

Thanks @xylar for taking a look at this. I have a couple of tests in the queue now (compute and debug queues; both seem like they'll be there a while).

  1. We'll want the head of this to be https://github.com/E3SM-Project/E3SM/tree/v2.1-polar-fixes (the intention is to make this an archive tag at some point). I'm not sure if any of the conflicts/issues I was seeing were related to additional changes there, probably not.
  2. I think we'll also want https://github.com/E3SM-Project/E3SM/pull/5910 as part of this. I think this was the one that was giving me trouble, so testing this branch without that PR will hopefully be helpful.

Since Chrysalis is so slammed right now I'll probably wait a day or two to pick this back up, but I'll report back on my tests in the queue once they run.

xylar commented 7 months ago

No problem, I'll rebase.

I would hold off on 5910 until we know this works, as you suggested.

darincomeau commented 7 months ago

SMS_P512.ne30pg2_SOwISC12to60E2r4.CRYO1950-DISMF.chrysalis_intel fails in ocn init with a seg fault (similar to what I was seeing before). Trying in debug mode now.

EDIT: debug error

209: forrtl: severe (408): fort: (7): Attempt to use pointer LANDICEBOUNDARYLAYERTRACERS when it is not associated with a target

xylar commented 7 months ago

@darincomeau, can you provide a full traceback for that error? DISMF should not attempt to use landIceBoundaryLayerTracers.

xylar commented 7 months ago

@darincomeau, I think I found a likely source of the error. The first 2 commits will hopefully address it. But if you see it again (or you see a new one), please provide a stack trace.

darincomeau commented 7 months ago

Thanks @xylar! I'm not sure what's going on now, but I'm getting both tests crashing in the first land timestep. Traceback:

368: forrtl: error (65): floating invalid
368: Image              PC                Routine            Line        Source
368: libpnetcdf.so.3.0  000015555101668C  for__signal_handl     Unknown  Unknown
368: libpthread-2.28.s  000015554E03BCF0  Unknown               Unknown  Unknown
368: e3sm.exe           0000000004F82CF6  histfilemod_mp_hi        1192  histFileMod.F90
368: e3sm.exe           0000000004F77538  histfilemod_mp_hi        1009  histFileMod.F90
368: e3sm.exe           0000000004CC1F34  elm_driver_mp_elm        1448  elm_driver.F90
368: e3sm.exe           0000000004C2DE69  lnd_comp_mct_mp_l         508  lnd_comp_mct.F90
368: e3sm.exe           00000000004A08A3  component_mod_mp_         751  component_mod.F90
368: e3sm.exe           000000000045E014  cime_comp_mod_mp_        2904  cime_comp_mod.F90
368: e3sm.exe           0000000000488238  MAIN__                    153  cime_driver.F90
368: e3sm.exe           0000000000426522  Unknown               Unknown  Unknown
368: libc-2.28.so       000015554DC9ED85  __libc_start_main     Unknown  Unknown

In the lnd log file, it initializes fine, takes one time step, and dies. End of log:

 hist_htapes_build Successfully initialized elm history files
------------------------------------------------------------
 Successfully initialized the land model
 begin initial run at:
    nstep=            0  year=            1  month=            1  day=
           1  seconds=            0

************************************************************

 dtime_sync=         1800  dtime_elm=         1800  mod =            0
 Beginning timestep   : 0001-01-01_00:00:00
    Completed timestep: 0001-01-01_00:00:00

Not sure what to make of this, trying non-debug now...

xylar commented 7 months ago

@darincomeau, maybe this? https://github.com/E3SM-Project/E3SM/issues/6201

xylar commented 7 months ago

Or: https://github.com/E3SM-Project/E3SM/issues/5665

xylar commented 7 months ago

Those 2 and 4 others that might be related: https://github.com/search?q=repo%3AE3SM-Project%2FE3SM+floating+invalid+histFileMod.F90&type=issues

darincomeau commented 7 months ago

I was just about to report that it's a debug-build thing: in optimized mode, both CRYO1950 and CRYO1950-DISMF run fine.

darincomeau commented 7 months ago

I cherry-picked 1ad3858128753c5961c1aad199e9fe894d002cad from 5910 and had three files with conflicts, one conflict each. It seemed right to me to take 'theirs' for each, and that resolved them. The following errors could have come from my conflict resolution; I don't recall.
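For readers who want to reproduce that kind of resolution, here is a minimal sketch of a cherry-pick where every conflicted file is resolved by taking 'theirs'. It runs in a throwaway repository; the repository, the branch names `feature` and `maint`, and the file `forcing.txt` are all hypothetical stand-ins and are not taken from the E3SM branches discussed here.

```shell
#!/usr/bin/env bash
# Sketch: cherry-pick one commit and resolve each conflict by taking 'theirs'.
# The repo, branch names, and file name are hypothetical demo material.
set -euo pipefail

workdir=$(mktemp -d)
cd "$workdir"
git init -q demo
cd demo
git config user.name "Demo User"
git config user.email "demo@example.com"

# Base commit shared by both branches.
echo "base" > forcing.txt
git add forcing.txt
git commit -q -m "base"
base_branch=$(git symbolic-ref --short HEAD)

# 'feature' stands in for the branch that holds the commit to port.
git checkout -q -b feature
echo "feature version" > forcing.txt
git commit -q -am "feature change"
pick_sha=$(git rev-parse HEAD)

# 'maint' stands in for the maintenance branch; it edits the same line.
git checkout -q "$base_branch"
git checkout -q -b maint
echo "maint version" > forcing.txt
git commit -q -am "maint change"

# The cherry-pick stops on the conflict, so keep 'set -e' from aborting here.
git cherry-pick "$pick_sha" >/dev/null 2>&1 || true

# During a cherry-pick, 'theirs' is the commit being picked.
for f in $(git diff --name-only --diff-filter=U); do
    git checkout --theirs -- "$f"
    git add -- "$f"
done

# Finish the cherry-pick, reusing the original commit message.
git -c core.editor=true cherry-pick --continue >/dev/null

cat forcing.txt   # prints "feature version"
```

Taking 'theirs' wholesale is only safe when the maintenance branch has no local changes worth keeping in the conflicted hunks, which is why build errors afterwards are a plausible side effect.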

I got build errors, first in mpas_ocn_surface_land_ice_fluxes.F, which I got past with the following change:

@@ -505,14 +505,17 @@ contains

       call mpas_pool_get_array(forcingPool, 'landIceInterfaceTracers', landIceInterfaceTracers)

-      call mpas_pool_get_dimension(forcingPool, &
-                                'index_landIceInterfaceTemperature', &
-                                 indexITPtr)
-      call mpas_pool_get_dimension(forcingPool, &
-                                'index_landIceInterfaceSalinity', &
-                                 indexISPtr)
-      indexIT = indexITPtr
-      indexIS = indexISPtr
+
+       call mpas_pool_get_dimension(forcingPool, 'index_landIceInterfaceTemperature', indexIT)
+       call mpas_pool_get_dimension(forcingPool, 'index_landIceInterfaceSalinity', indexIS)

Next I got a build error in mpas_ocn_forward_mode.F, which I got past with:

@@ -78,7 +78,7 @@ module ocn_forward_mode
    use ocn_gm
    use ocn_submesoscale_eddies
    use ocn_stokes_drift
-   use ocn_manufactured_solution

Smoke test then ran. 🎆

I'm trying a smoke test with a proper hybrid run off historical 701, using CRYOSSP370-DISMF (which I added), and the prescribed 'dismf-from-pismf' file I generated by averaging years 2010-2014 of historical 701. That's built and sitting in the queue. 🤞

darincomeau commented 7 months ago

Ok that proper smoke test ran, so I'm going to go ahead and start the production run, and we can figure out the best way to get this onto v2.1-polar-fixes later.

Thanks @xylar for your help here! It was the landiceboundarylayertracers where I got stuck before.

darincomeau commented 7 months ago

I made a branch on my fork that includes the extra changes here: https://github.com/darincomeau/E3SM/tree/ocn/add-dismf-to-v2.1-polar-fixes

xylar commented 7 months ago

Who "owns" the v2.1-polar-fixes branch on E3SM-Project/E3SM? Can you just push stuff to it without PRs and reviews?

darincomeau commented 7 months ago

I created it. Yes, we can just push stuff to it without PRs or reviews, since it's all stuff cherry-picked from master (up till now).

xylar commented 7 months ago

Okay, then I think you can just force-push your https://github.com/darincomeau/E3SM/tree/ocn/add-dismf-to-v2.1-polar-fixes to that branch when you're happy.
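As an illustration of that kind of overwrite, the sketch below force-pushes a local branch over a shared branch on a remote. It uses a throwaway bare repository as the remote; the names `upstream`, `shared-branch`, and `replacement` are hypothetical and do not refer to the actual E3SM remotes or branches. Using `--force-with-lease` rather than a plain `--force` is a choice made for this sketch, not something the thread specifies.

```shell
#!/usr/bin/env bash
# Sketch: replace a shared remote branch with a local branch via force-push.
# Remote name, branch names, and file name are hypothetical demo material.
set -euo pipefail

workdir=$(mktemp -d)
cd "$workdir"

# A bare repository plays the role of the shared remote.
git init -q --bare upstream.git

git init -q work
cd work
git config user.name "Demo User"
git config user.email "demo@example.com"
git remote add upstream ../upstream.git

# Publish an initial version of the shared branch.
echo "old" > notes.txt
git add notes.txt
git commit -q -m "old shared history"
git push -q upstream HEAD:refs/heads/shared-branch

# Build replacement history that does not descend from the shared branch.
git checkout -q --orphan replacement
echo "new" > notes.txt
git add notes.txt
git commit -q -m "replacement history"

# Overwrite the shared branch. --force-with-lease refuses the push if the
# remote branch has moved since this clone last saw it.
git push -q --force-with-lease upstream replacement:shared-branch

# Show what the remote branch now points at.
git ls-remote upstream refs/heads/shared-branch
```

A force-push discards whatever the remote branch held before, so it fits branches like the one described here, where nobody else is expected to have pushed in the meantime.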

xylar commented 7 months ago

Shall I close this or is it useful for further discussion?

darincomeau commented 7 months ago

Sounds good, yes I think we can close this.