Closed: sbrus89 closed this PR 2 weeks ago
@matthewhoffman and @akturner, I thought I'd tag you on this since it touches the framework, even though this is apparently the first usage of mpas_pool_copy_pool in E3SM.
great, thanks @sbrus89
@erinethomas and @darincomeau, would you mind taking a look at the framework changes in this PR just in case you see any issues from the sea ice side?
Repeating post from https://github.com/E3SM-Ocean-Discussion/E3SM/pull/87#issuecomment-2037271385
Passes nightly test suite and compares bfb with master branch point on chicoma with optimized gnu and chrysalis with optimized intel. Also passes nightly test suite with debug gnu on chicoma. Note this includes a series of RK4 tests:
00:34 PASS ocean_global_ocean_Icos240_WOA23_RK4_performance_test
01:40 PASS ocean_global_ocean_Icos240_WOA23_RK4_restart_test
01:20 PASS ocean_global_ocean_Icos240_WOA23_RK4_decomp_test
01:20 PASS ocean_global_ocean_Icos240_WOA23_RK4_threads_test
In E3SM, passes
./create_test SMS_Ln9.T62_oQU240.GMPAS-IAF.chrysalis_gnu
./create_test SMS_Ln9.T62_oQU240.GMPAS-IAF.chrysalis_intel
Repeating post from https://github.com/E3SM-Ocean-Discussion/E3SM/pull/87#issuecomment-2037271385. Note these tests all do not use LTS.
@xylar and @matthewhoffman these changes are only used by the ocean core, and only for RK4 and LTS timestepping. I think you are safe with a visual review, as I tried to be thorough in my testing of RK4 and LTS and saw BFB results.
git grep -in mpas_pool_clone_pool
mpas-framework/src/core_sw/mpas_sw_time_integration.F:120: call mpas_pool_clone_pool(statePool, provisStatePool, 1)
mpas-framework/src/framework/mpas_pool_routines.F:434:! routine mpas_pool_clone_pool
mpas-framework/src/framework/mpas_pool_routines.F:444: recursive subroutine mpas_pool_clone_pool(srcPool, destPool, overrideTimeLevels)!{{{
mpas-framework/src/framework/mpas_pool_routines.F:466: call pool_mesg('ERROR in mpas_pool_clone_pool: Input time levels cannot be less than 1.')
mpas-framework/src/framework/mpas_pool_routines.F:909: call mpas_pool_clone_pool(ptr % data % p, newmem % data % p)
mpas-framework/src/framework/mpas_pool_routines.F:921: end subroutine mpas_pool_clone_pool!}}}
mpas-framework/src/framework/mpas_stream_manager.F:411: call mpas_pool_clone_pool(manager % defaultAtts, new_stream % att_pool)
mpas-ocean/src/mode_forward/mpas_ocn_time_integration_lts.F:259: call mpas_pool_clone_pool(tendPool, tendSum1stPool, 1)
mpas-ocean/src/mode_forward/mpas_ocn_time_integration_lts.F:261: call mpas_pool_clone_pool(tendPool, tendSum2ndPool, 1)
mpas-ocean/src/mode_forward/mpas_ocn_time_integration_lts.F:263: call mpas_pool_clone_pool(tendPool, tendSum3rdPool, 1)
mpas-ocean/src/mode_forward/mpas_ocn_time_integration_lts.F:265: call mpas_pool_clone_pool(tendPool, tendSlowPool, 1)
mpas-ocean/src/mode_forward/mpas_ocn_time_integration_rk4.F:233: call mpas_pool_clone_pool(statePool, provisStatePool, 1)
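For reference, the third argument in these call sites is the optional time-level override: the source pool may carry multiple time levels, while the cloned pool only needs one. A hedged sketch of the pattern at the rk4 call site above (the preceding create call is my assumption about the surrounding code):

```fortran
! The provisional state pool is cloned from statePool with a single
! time level (the optional third argument), even though statePool
! itself carries multiple time levels.
call mpas_pool_create_pool(provisStatePool)              ! assumed setup
call mpas_pool_clone_pool(statePool, provisStatePool, 1)
```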
In line with @mark-petersen's comment above: I was not able to see any place where this would impact MPAS-SI this morning, so I think it's ok on the sea-ice side.
@mark-petersen, the mpas-ocean rk4, lts, and fblts time stepping routines are the only places that use the mpas_pool_copy_pool routine I modified:
git grep -in mpas_pool_copy_pool
components/mpas-framework/src/framework/mpas_pool_routines.F:925:! routine mpas_pool_copy_pool
components/mpas-framework/src/framework/mpas_pool_routines.F:935: recursive subroutine mpas_pool_copy_pool(srcPool, destPool, overrideTimeLevels)!{{{
components/mpas-framework/src/framework/mpas_pool_routines.F:958: call pool_mesg('ERROR in mpas_pool_copy_pool: Input time levels cannot be less than 1.')
components/mpas-framework/src/framework/mpas_pool_routines.F:1171: call mpas_pool_copy_pool(ptr % data % p, mem % p)
components/mpas-framework/src/framework/mpas_pool_routines.F:1181: end subroutine mpas_pool_copy_pool!}}}
components/mpas-ocean/src/mode_forward/mpas_ocn_time_integration_fblts.F:230: call mpas_pool_copy_pool(tendPool, tendSum3rdPool, 1)
components/mpas-ocean/src/mode_forward/mpas_ocn_time_integration_fblts.F:231: call mpas_pool_copy_pool(tendPool, tendSlowPool, 1)
components/mpas-ocean/src/mode_forward/mpas_ocn_time_integration_lts.F:264: call mpas_pool_copy_pool(tendPool, tendSum1stPool, 1)
components/mpas-ocean/src/mode_forward/mpas_ocn_time_integration_lts.F:265: call mpas_pool_copy_pool(tendPool, tendSum2ndPool, 1)
components/mpas-ocean/src/mode_forward/mpas_ocn_time_integration_lts.F:266: call mpas_pool_copy_pool(tendPool, tendSum3rdPool, 1)
components/mpas-ocean/src/mode_forward/mpas_ocn_time_integration_lts.F:267: call mpas_pool_copy_pool(tendPool, tendSlowPool, 1)
components/mpas-ocean/src/mode_forward/mpas_ocn_time_integration_rk4.F:233: call mpas_pool_copy_pool(statePool, provisStatePool, 1)
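From the grep output above, the modified routine mirrors mpas_pool_clone_pool's optional time-level override. A hedged sketch of the interface (only the subroutine line is confirmed by the grep at mpas_pool_routines.F:935; the declarations below are my assumption):

```fortran
! Sketch only: name and argument list come from the grep above;
! declarations and behavior notes are assumed, not verified.
recursive subroutine mpas_pool_copy_pool(srcPool, destPool, overrideTimeLevels)
   type (mpas_pool_type), pointer :: srcPool, destPool
   integer, intent(in), optional :: overrideTimeLevels
   ! When present, fields are copied into destPool with the requested
   ! number of time levels rather than srcPool's count, matching the
   ! behavior of mpas_pool_clone_pool's optional argument.
end subroutine mpas_pool_copy_pool
```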
merged to next
merged to master
This PR fixes a memory leak I noticed in the RK4 timestepping when running 125-day single-layer barotropic tides cases with the vr45to5 mesh (https://github.com/MPAS-Dev/compass/pull/802) on pm-cpu. Previously, I could only get through about 42 days of simulation before running out of memory.
This issue is related to creating/destroying the provis_state subpool at each timestep. We had a similar problem a few years back that required memory leak fixes in the mpas_pool_destroy_pool subroutine (https://github.com/MPAS-Dev/MPAS-Model/pull/367). However, I believe there is still a memory leak in the mpas_pool_remove_subpool routine (which calls pool_remove_member) that is called following mpas_pool_destroy_pool. The TODO comment here: https://github.com/E3SM-Project/E3SM/blob/6b9ecaa67c81c65fe1f7063e5afe63ce9b2c66a9/components/mpas-framework/src/framework/mpas_pool_routines.F#L6036-L6038 suggests it is possible things aren't being completely cleaned up by this subroutine.

I'm not familiar enough with the low-level details of the pools framework to track down the memory leak itself. However, in any case, I think it makes more sense to create the provis_state subpool once at initialization as opposed to creating and destroying it every timestep. The main consequence of this approach is that the mpas_pool_copy_pool subroutine needs an overrideTimeLevels option similar to the one used in mpas_pool_clone_pool, which was used previously.

I've tested these changes with the vr45to5 tides test case; they allow me to run for the full 125 days and are B4B with the previous create/destroy approach. The LTS and FB-LTS timestepping schemes are also updated here in the same way to prevent the same memory leak.
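The create-once refactor described above can be sketched roughly as follows (a hedged sketch; exact call sites and surrounding code in mpas_ocn_time_integration_rk4.F may differ):

```fortran
! Old pattern (sketch): the provisional subpool was cloned from
! statePool and destroyed every timestep, leaking memory through
! the pool destroy/remove path.
!   call mpas_pool_clone_pool(statePool, provisStatePool, 1)
!   ... RK4 stages ...
!   call mpas_pool_destroy_pool(provisStatePool)

! New pattern (sketch): clone once at initialization ...
call mpas_pool_clone_pool(statePool, provisStatePool, 1)

! ... then, each timestep, overwrite the existing fields in place
! with the overrideTimeLevels-aware copy instead of reallocating:
call mpas_pool_copy_pool(statePool, provisStatePool, 1)
```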
Since RK4 is not used in E3SM, this PR is B4B for all E3SM tests. The mpas_pool_copy_pool routine modified here is not used in MPAS-Seaice or MALI. See the Ocean-Discussions PR for additional background and testing. [BFB]