Closed mark-petersen closed 4 years ago
The bulk of the code alterations are done. Testing is underway, so there will still be some edits and clean-up.
@mark-petersen - please let me know when you think the Registry changes are done and I'll run the scripts to generate the bld and nml changes
OK, editing namelist flags is done. The Registry file in the mpas-source submodule is correct, or you can download it here:
https://github.com/MPAS-Dev/MPAS-Model/blob/952cd4d619a6bfd251603d26d4b1962aa9ef707c/src/core_ocean/Registry.xml
If you want to double check, the very last flag I altered was changed to:
config_mesoscale_eddy_isopycnal_slope_limit
That last flag change was in what I got checking out the branch, so it all looks up-to-date. I'm running the scripts now to get the bld and nml changes
@mark-petersen - changing so many variable names has made the automated script part a bit more complicated, so I'll have to run some of the issues by you, if that's OK:
Thanks
Is there a switch to turn Redi mixing off and revert to previous answers?
@rljacob yes, if we set config_use_Redi = .false.
then it will be bfb with previous. Would you prefer that, and then turn it on in a following PR? I agree that provides more thorough testing.
That is up to @golaz
@jonbob I agree there are a lot of changes. Thanks for the help.
- I assume we want to change all the grid-specific settings of config_use_standardGM to now use config_use_GM?
Correct. config_use_standardGM -> config_use_GM
- we used to have the default for config_standardGM_tracer_kappa set to 1800 except for G-cases using oEC60to30v3wLI was 600.0 -- do we want those defaults the same but now for config_GM_kappa?
Correct. config_standardGM_tracer_kappa -> config_GM_kappa Let's leave all the settings in tact for each resolution. We may alter these later, after tuning exercises.
- config_GM_Bolus_kappa_function was 'constant' by default -- do we want config_eddying_resolution_taper to be 'ramp' by default? 'constant'?
Let's leave it 'constant' right now so we have the option to test bfb on this PR, as stated just above. Once Redi is turned on, we could put it on 'ramp' for all runs and the tuning exercise.
- is "N2_dependant" the correct spelling of the GM_closure variable?
That's a misspelling on my part, thanks. Changing it now...
- in E3SM, config_disable_redi_k33 had previously been true by default -- do we now want it false?
Yes, false. Good thinking. The flag config_use_Redi
should turn on all four terms in E3SM.
My two cents is that we put this PR in with config_Redi=.false.
and then turn it on later. This would make integration testing more straightforward and unified.
@mark-petersen - I committed my bld updates, but they should be tested with all the changes
I'm running a test G-case with my head of this branch
I am still propagating the namelist changes through the code. Should be done tomorrow morning.
@mark-petersen - I checked out the current version of this branch and ran a SMS.T62_oEC60to30v3wLI.GMPAS-IAF.anvil_intel test. The cmake build failed but the old make one worked fine. The results of a five-day smoke test were BFB with one from before this update. I'll see if I can figure out the cmake problem...
This should now work with cmake and remain bfb with previous.
@mark-petersen: is this ready to try, or should we wait for other fixes from the MPAS-O PR?
I think everything is fixed. I went through the algorithm and code again and found the sign error. I need to rerun my validations, but we could start with a 5-day test with
config_use_Redi = .true.
in E3SM with EC60to30.
@mark-petersen and @milenaveneziani I think we should still try a stand alone 60to30 with only redi mixing enabled before doing too much in E3SM. But a 5-day simulation with redi on would be a good smoke test. From this thread it seems @jonbob may be trying this test already? Or maybe that is just a BFB with redi off?
I'm running that test right now with an updated code base
Passed an SMS.T62_oEC60to30v3.GMPAS-IAF.anvil_intel smoke test
After the last set of revisions, the diffusion is moving the right way, but I'm seeing some noise: and I'm looking into that now and redoing term-by-term verification. No use running a G case until the idealized results look good.
If we want to add the three debug tracers to a G case, set:
config_use_debugTracers = .true.
and you can add initial tracer fields from this EC60to30 file on IC:
/lustre/scratch4/turquoise/.mdt3/mpeterse/runs/200109_redi_add_triads_EC60to30/ocean/global_ocean/EC60to30/spin_up/simulation/restarts/restart.0001-01-01_00.30.00.nc
ncdump -h restart.0001-01-01_00.30.00.nc|grep double|grep tracer
double tracer1(Time, nCells, nVertLevels) ;
double tracer2(Time, nCells, nVertLevels) ;
double tracer3(Time, nCells, nVertLevels) ;
When debugTracers is on, it now has these three tracers.
It turns out that the noise was coming from the T&S field, as I had a strange mix of terms on and off. If T&S are completely fixed in time, and the only tracer term is Redi, then it looks good qualitatively:
@mark-petersen that's good news. Just so I understand, if you turn off all T&S terms but redi this still works (no noise in passive tracer) as Redi shouldn't do anything? Is that right?
I'm almost done with all the testing of this Redi PR. Things are looking good, and I'll update and post all the details soon. There are some minor revisions since last Thursday.
One question for @vanroekel : To get bfb restarts, we need to either add indexBoundaryLayerDepth to the restart, or recompute it on start-up. That is because Redi Kappa is zero down to the BLD and then tapers to the full value below that. indexBoundaryLayerDepth is computed in cvmix, so the easy thing is to add it to the restart file. It is 2D, so it doesn't require much space. We currently have the real-valued BLD in the restart file.
@mark-petersen thank you for the update. For your question, I think I would prefer that indexBLD gets stored. Figuring out how to correctly call into CVMix in restart sounds more complicated and prone to error and as you say it is a 2D int field, so should not take up much space. It also makes sense to have this in the file with the full BLD.
@mark-petersen, FYI: last Thursday I started two G-cases on anvil: they are identical except that one has Redi on and one has it off. I have had some unexpected failures on a random year, but while the Redi-off case is now running fine and it's at year 31, the Redi-on case failed at year 27 with Nan detection errors. Nans are apparently found in the layer thickness, normal velocity and active tracers, at location: ~1degS, 6degE, which corresponds to an ocean point just off of Central Africa. This is what I could gather so far from the log files.
Apologies, forgot that those are radians not degrees.. Actual location is in the Southern Ocean, west of the Drake Passage. I am looking at some fields closely: will post a meaningful plot when I have it.
ok, so, this is what I could find so far: 1) I first looked at the 5-day instantaneous fields (from the highFrequency files) and did not notice anything unusual there; 2) I also noticed that the model had died during the previous month (July), for the same reason, but since the Redi-off case had also stopped at one point, I thought it was just some random node-related failure. So, I restarted the Redi-on case without investigating, and the nan-problem showed up again in August. 3) I plotted the output of the debug mpas_ocean file and there you can clearly see where the weird T and S values showed up at model time=0027-08-24_20:30:00 (Weddell Sea):
Note that all values that are unrealistically negative have been associated to -9999 (dark blue dots). By doing that I also 'masked' the nan values associated with z>local_depth, sorry about that, but I added the ocean depth plot for reference.
@milenaveneziani, @jonbob, @vanroekel, @maltrud My verification exercises are complete, as described on the MPAS PR: https://github.com/MPAS-Dev/MPAS-Model/pull/280#issuecomment-576520911. I rebased this E3SM PR with updated MPAS-Ocean. The commit last Thursday was preliminary, and those Redi results are not valid. I am confident that we can proceed with this PR for global testing now.
Here is a summary of where this PR stands:
We can now proceed with E3SM low-resolution tests. We need to consider the following:
config_use_debugTracers = .true.
config_reset_debugTracers_near_surface = .true.
config_reset_debugTracers_top_nLayers = 20
and watch this tracer distribution in time. debugTracers are normally off in E3SM.
config_Redi_surface_taper_layers = 3
config_Redi_bottom_taper_layers = 3
config_Redi_zero_in_boundary_layer = .true.
config_Redi_BLD_taper_layers = 3
@mark-petersen: could you please explain these two options a bit more?:
config_Redi_zero_in_boundary_layer
config_Redi_BLD_taper_layers
how do they work in condjuction (or not) with the taper_layers
options?
Talked with @mark-petersen in person about this and now things are clear to me.
I was thinking perhaps we could have config_Redi_surface_taper_layers = 10
, since the BLD options take care of how the tapering is done when BLD<upper 10 layers?
I will re-run the Redi-on case, while leaving the Redi-off case as is (Redi-off G-case should complete 50 years by tomorrow).
doesn't the bottom 10 layers seem large? This would be ~2000m.
no, this would be the top 10 layers. The bottom tapering is regulating by config_Redi_bottom_taper_layers
and yes, I wouldn't make that more than 3.
I would recommend running two simulations. One with low taper layers and one with high, and see how the min/max behaves.
Low taper layer numbers
config_Redi_surface_taper_layers = 3
config_Redi_bottom_taper_layers = 3
config_Redi_zero_in_boundary_layer = .true.
config_Redi_BLD_taper_layers = 3
High taper layer numbers
config_Redi_surface_taper_layers = 10
config_Redi_bottom_taper_layers = 5
config_Redi_zero_in_boundary_layer = .true.
config_Redi_BLD_taper_layers = 5
It would be better to use the lower taper values wherever possible, but I suspect we will see noise near boundaries, and need to go higher. Having two runs right away will give us some idea of the spread.
Seems to me that we can set config_Redi_BLD_taper_layers
independently from the surface and bottom tapering, since this is tapering below the BLD. I think 3 layers tapering below the BLD is more than enough?
It looks like they can be set independently. I think (hope?) 3 is more than enough the for the BLD_taper layers, but I agree with @mark-petersen that we should see the bounds/impact of the large and weaker tapering.
I was talking with @mark-petersen and we agreed that we can always back off on the BLD tapering, once we see the results of the two tests above. So, I will start the tests, also adding 5-daily 3-d tracers in the high-frequency output, so that we can hopefully see what kind of noise we get.
@mark-petersen - I did a series of smoke tests with a new checkout of this branch. I can confirm that the results are still bfb with master, when running with the default settings (redi off). I ran another test turning redi on (config_use_redi = .true.) but the results did not change because redi_kappa is 0 by default. I had to add config_redi_kappa = 600.0 to user_nl_mpaso to get redi to have an impact.
I have two G-cases waiting in the queue on anvil: one is with Low-tapering and the other one is with High-tapering options. Hopefully they'll start running soon. I'll let them go for 6 months and analyze results. I am saving 5-day snapshots of all tracers, MLD, and BLD, for diagnostics purposes. The Redi-off case is currently running: I'll stop it at year 50.
@jonbob: The MPAS PR is merged, and MPAS-Dev/e3sm/develop is now updated. I just updated the tag and first comment here to be BFB. So this PR is ready to test on next and be merged.
@vanroekel since this is bfb and has the triad formulation, could we merge it in? When you have a tapering scheme you are satisfied with, I think it would make sense to do an additional E3SM PR at that time. Then we can proceed with other ocean PRs in the mean time. If you agree, please approve this one.
@mark-petersen I've been running/testing this PR for awhile and think I have found a few bugs in the redi implementation outside the scope of tapering. I think I may be nearly ready to submit a PR and am running MPAS stand alone tests now. In the first month, I'm seeing no decreases in the minimum temperature and instead a slow increase. I would like to wait for further testing on those changes before merging Redi code to E3SM.
Status: @vanroekel has added new tapering options here: https://github.com/MPAS-Dev/MPAS-Model/pull/445 We are still seeing temperature values drift below -1.8C, but are testing solutions to that now. We hope to be done next week, and will then be able to merge this in.
Rebased. The flags were updated once in the previous commits, but Redi flags in MPAS-Ocean has been updated since then. @vanroekel you must already have an E3SM branch with updated flags and your preferred default values? Or does it make more sense for @jonbob to rerun scripts for the e3sm namelist files?
@mark-petersen I think we should definitely have @jonbob rerun scripts. My e3sm branch of this is hacked together and shouldn't be used for this PR.
Agreed -- even if we got an old branch functional, I'd prefer to have the supporting scripts created by the autogeneration tools
I'm going through the generated scripts and have just a couple of questions. There's a point where we resolve the differences between default namelist values between what's in the Registry and what's previously been the default. For config_Redi_kappa, Registry has 600 and of course it's been 0 by default. I'm guessing we want to set it to 600 now? Or should that be resolution-dependent? Redi is still off by default, but it would seem better if our best guess at the correct values were there so turning it on is simply a matter of flipping config_use_Redi...
A similar question for config_GM_kappa. Currently the defaults look like:
<config_GM_kappa>1800.0</config_GM_kappa>
<config_GM_kappa ocn_forcing="datm_forced_restoring" ocn_grid="oEC60to30v3wLI">600.0</config_GM_kappa>
And config_eddying_resolution_taper -- "ramp" or "none"?
Add capability to the Redi isopycnal mixing scheme, which is required for non-eddying resolutions. It includes:
According to Griffies et al 1998 this triad expansion of the slope calculation (my steps 1 and 2, his eqns 30-34) is total variance diminishing, and essential when using a nonlinear equation of state. It was used in MOM and POP. Though noise may still exist, it is much reduced.
This PR leaves Redi mixing off by default, but is non-BFB due to mixing changes. After tuning exercises we will have a later PR that turns Redi mixing on for low resolution, with the proper settings.
[NML] [non-BFB]