E3SM-Project / E3SM

Energy Exascale Earth System Model source code. NOTE: use "maint" branches for your work. Head of master is not validated.
https://docs.e3sm.org/E3SM
Other
336 stars 338 forks source link

Fixes for multiple issues in gustiness calculations. #5850

Closed quantheory closed 7 months ago

quantheory commented 10 months ago

This is designed to fix the following issues:

The first of these was fixed by forcing ATM_SUPPLIES_GUSTINESS=TRUE for all EAM cases, and the second was fixed purely by rewriting the Fortran formulas. These two changes are climate changing, and will generally tend to reduce the impact of gustiness on the mean winds, i.e. mean wind speed will increase.

The third fix is to the U10 diagnostic only, and will decrease the output U10. A value that has not been "fixed" in this way can be obtained by outputting the new history variable 'U10WITHGUSTS' from EAM.

The final fix is to remove some double-counting between gustiness parameterizations in EAM and ELM. The fix can be "undone" by setting force_land_gustiness = .true. in the ELM namelist.

Note that these fixes are only applied over land and ocean. At the request of sea ice developers, this tag also disables gustiness over sea ice completely.

Fixes #5834 Fixes #5835 Fixes #5836 Fixes #5837

[CC] [NML]

quantheory commented 10 months ago

This PR passes the e3sm_developer tests. I am running the super-BFB tests just in case. Edit: They all passed except for one that is also failing for me on master (#5851). This tag should be BFB for compsets with no active atmosphere (e.g. I compsets); this seems to be the case but I've only checked a couple of configurations.

With this PR, it is possible to isolate the effects of each of the four main changes individually by comparing different configurations of this code to different configurations of the original. Consider running the same compset with active atmosphere in each of the four code/configuration combinations below:

  1. Current master with all default settings.
  2. Current master with xmlchange ATM_SUPPLIES_GUSTINESS=TRUE.
  3. This PR with force_land_gustiness=.true. and U10WITHGUSTS added to the EAM output.
  4. This PR with default settings except that U10WITHGUSTS is added to the EAM output.

Then the fix for #5834 can by isolated by comparing 1 and 2, the fix for #5835 can be isolated by comparing 2 and 3, the fix for #5836 can be isolated by comparing the U10 and U10WITHGUSTS outputs within either 3 or 4, and the fix for #5837 can be isolated by comparing 3 and 4.

That said, I don't know if we would learn that much from examining configuration 2 in that much detail, except for debugging or sanity check purposes. To figure out what this does to the climate, 1 and 4 are obviously of the most significance, so at least those two need to be run out for some time with the coupled model, with configurations 2 and 3 maybe only needing to be run out for a few years as a sanity check. For all of these cases, the effect on U10 and the surface fluxes is probably large enough that it won't take all that much time to see what the effect is at least qualitatively.

Note that in the meeting last week we did not really prioritize a fix for the double-counting issue (#5837), but it was not too bad to add today.

quantheory commented 10 months ago

@mabrunke Please double-check this. Particularly clubb_intr.F90, the ELM BareGroundFluxesMod.F90 and similar files, and shr_flux_mod changes. Also, I made the U10 change in cam_diagnostics.

@huiwanpnnl I made a change to the conditional_diag_main module, so that the U10 calculated there would match the one in the cam_diagnostics module.

quantheory commented 10 months ago

@bishtgautam The atmospheric gustiness should be handled correctly in lnd_import_export.F90, but the corresponding changes have not been made in lnd_downscale_atm_forcing.F90 (or lnd_disagg_forc.F90, which seems to be unused code). Can you let me know whether this is supposed to be a purely offline downscaling for use with datm, or if this is supposed to work in an "online" manner when EAM is running? If the latter, we need to change that module as well.

quantheory commented 10 months ago

@wlin7 @rljacob Tagging you in case you also want to take a look at this.

mabrunke commented 10 months ago

@quantheory, I have some code changes already for lnd_import_export.F90. That is used no matter whether ELM is used offline or coupled to the atmosphere.

quantheory commented 10 months ago

@mabrunke The specific reason why I'm concerned about lnd_import_export.F90 is these lines of code, which explicitly prohibit combining downscaling and atm_gustiness. I don't remember why that check is there. It may be that I added it because I just wasn't sure what to do with the gustiness field if downscaling is being used. Though perhaps it's as simple as just copying what's being done for ubot and vbot.

mabrunke commented 10 months ago

@quantheory wind speed is not being downscaled at this time; it is just being copied to all of the topounits. I believe windbot is only used as a diagnostic in the ELM output as what it sees the bottom-layer wind speed to be. Without downscaling, gustiness has been included in that output variable.

quantheory commented 10 months ago

@mabrunke I guess what I'm trying to figure out is that we have these three places where that copying seems to be being done, and I'm wondering if I need to do the same thing for ugust as is being done for ubot and vbot in those lines. (As opposed to the current behavior, where if atm_gustiness is on, the code simply aborts before calling that function!)

mabrunke commented 10 months ago

@quantheory, can you provide the line of code where that abort happens. I'm not seeing it. Is it in lnd_import_export.F90?

quantheory commented 10 months ago

@mabrunke Yeah, right here a few lines before the call to the downscaling would otherwise happen:

https://github.com/E3SM-Project/E3SM/blob/18198fbbc538287f606501a933382a7be976db99/components/elm/src/cpl/lnd_import_export.F90#L1121-L1123

mabrunke commented 10 months ago

@quantheory Ah, I see it now. I don't see why applying gustiness if there is downscaling would be a problem at this point, since wind speed is being downscaled right now.

quantheory commented 10 months ago

@mabrunke OK, I can add another commit for to allow applying gustiness with the downscaling. Do you think this PR looks good other than that?

mabrunke commented 10 months ago

@quantheory A few things: I like your idea of adjusting U10 in EAM when its being outputted there, but U10 is also outputted in ELM. It should be adjusted for that output as well. Also, you changed the surface flux calculation in CICE, but isn't MPAS-SeaIce now being used in v2/3?

quantheory commented 10 months ago

@mabrunke Those are both very good points. Regarding the U10 output, if I need to calculate the change in multiple components anyway, now I'm kind of considering going back to calculating it at the surface rather than in EAM. (U10 is supposed to be measured relative to the surface right? I.e. a measure of u_atm - u_ocn over the ocean, rather than just the atmosphere speed in an Eulerian frame? That seems to be how it works in shr_flux_mod.F90.)

It shouldn't be difficult to make the changes to MPAS; I really just forgot because the original ATM_SUPPLIES_GUSTINESS changes were done back in v1, and so my list of files to change had CICE but not MPAS ice on it. Let me see what I can work out today.

mabrunke commented 10 months ago

@quantheory, truly the fluxes should be calculated using u_atm - u_sfc as well as the temperature and humidity differences, but, over land, u_sfc = 0. That is not the case over the ocean, as there is a surface current.

I was handling the adjustment in the surface models. That was why I suggested making ATM_SUPPLIES_GUSTINESS = TRUE. That would complicate things if you want to save the two different versions of U10 like you are now. I'm not sure yet if that's a good idea, since the old way of calculating U_gust was wrong anyways.

There is also a sea ice formulation in shr_flux_mod.F90. I'm guessing that that is a relic of an older version of the model and is no longer used.

proteanplanet commented 10 months ago

Why is CICE being discussed in this thread? E3SM does not use CICE. But any changes to U10 will also affect ocean and sea ice coupling in B-cases if it is being changed in the coupling, rather than just output, and if such changes are being proposed, people from the ice and ocean groups need to be assigned as reviewers. If the changes only affect F-cases, then MPAS-SeaIce is still involved, and the CICE tag should be removed, and in its place sea ice or MPAS-SeaIce should be tagged.

quantheory commented 10 months ago

@proteanplanet

Why is CICE being discussed in this thread? E3SM does not use CICE.

Because it's building on changes that were originally developed in E3SMv1, and because this is the first PR I've made affecting sea ice in several years (partly due to not working directly on E3SM at all for a while). So as I mentioned earlier, I forgot that I needed to port the changes over to MPAS.

people from the ice and ocean groups need to be assigned as reviewers

I agree, but being rather atmosphere-focused, I don't actually know who to ask to review it. Also, given the problems mentioned in the above discussion, I need to make some changes before getting anyone to review this again anyway.

I was planning to ask @wlin7, but if there's anyone you want to suggest someone as a reviewer (or if you feel you should do it yourself), feel free. Though again, I need to port the changes to MPAS before it's worth bothering.

sea ice or MPAS-SeaIce should be tagged

There is no generic sea ice tag. The tag list seems a bit inconsistent in this regard, honestly. I had no idea whether to tag mpas-ocean or not, because I'm not touching the MPAS-ocean code at all, but I am making changes to coupler code that calculates atmosphere-ocean interactions. But there's no generic "Ocean" tag, only "mpas-ocean". My hope was that tagging Coupled Model would be enough.

proteanplanet commented 10 months ago

@quantheory I suggest either making me or @eclare108213 a reviewer. Thanks for adding the mpas-seaice tag.

mabrunke commented 10 months ago

@quantheory, the surface fluxes are calculated in column/ice_atmo.F90 in MPAS-SeaIce. It doesn't use ugust, so it must expect to receive the gustiness in vmag.

quantheory commented 10 months ago

@mabrunke Yeah, I noticed earlier, it looks like no one touched anything related to the gustiness export code since I first added it, so I have to port it from CICE today. Luckily this code has shared ancestry with the CICE version, so it should be easy to copy over once I've read through the MPAS version.

proteanplanet commented 10 months ago

@quantheory, if you intend this work to make it onto E3SM master, please add a sea ice modeler to the review as per the above message - choose any of @eclare108213, @darincomeau, @njeffery, @erinethomas or me. If this is for your own project and you are using it solely for that, that's fine. If you intend this work to enter E3SM master, the changes must first be made to MPAS-SeaIce, not CICE. Moreover, the changes should be made to Icepack not to the MPAS-SeaIce column package, because Icepack from the CICE Consortium is replacing that for E3SM V3. Therefore your changes would need to pass review in the CICE Consortium Icepack code at https://github.com/CICE-Consortium/Icepack. I am happy to help in testing the code in E3SM, and welcome improvements, but we will need to make sure it is thoroughly tested in D-, G- and B-cases first.

quantheory commented 10 months ago

@proteanplanet As I mentioned before, please wait for additional commits before reviewing this (I will tag you when this is done). I did not add you as a reviewer before, because I didn't think it was worth your time to look at the code when several significant changes were about to be made anyway. I have been out of the office part of this week, so I haven't been very fast about this.

I think there is some misunderstanding about why I made this PR. My expectation is that once this PR is reviewed for code correctness, it will be referred to the coupled model group and relevant component leads to determine how to evaluate this using coupled model runs, and determine what to do from there. No one involved has any intention of trying to get this merged unless and until that process is complete. I tagged this "Do Not Merge" for a reason.

Furthermore, I am not a part of the E3SM project, and it is not my responsibility (or in fact, within my ability) to do all of the necessary climate evaluation, or the integration with v3 features. I put together these bug fixes because I believe we all share an interest in ensuring that the model code is a correct implementation of the physical equations it represents, and as a courtesy to other E3SM developers to not have to start from scratch when implementing these fixes for v3. I have neither the knowledge nor the time to shepherd this PR through merging to master and/or integration with v3 by myself. That will require the assistance of E3SM developers in the coupled model and other groups and the help of the integrators, so rest assured that you will hear much more about this, and there will be several more rounds of review, before anyone tries to get this merged.

proteanplanet commented 10 months ago

@quantheory Thanks. I'll help with the testing when you are ready.

eclare108213 commented 10 months ago

My expectation is that once this PR is reviewed for code correctness, it will be referred to the coupled model group and relevant component leads to determine how to evaluate this using coupled model runs, and determine what to do from there. No one involved has any intention of trying to get this merged unless and until that process is complete. I tagged this "Do Not Merge" for a reason.

@quantheory Thanks for reaching out. I'm glad you're planning to work with the coupled group and the relevant component leads, since the implementation needs to be consistent across all of them and sometimes that can be pretty tricky. Please change this to a Draft PR to make it clear that it's only for review purposes, not staged for merging into E3SM master. To get this all the way through the review process into E3SM, you'll need to follow the steps outlined here. Do you have a design document for these changes?

quantheory commented 10 months ago

@eclare108213

Please change this to a Draft PR to make it clear that it's only for review purposes, not staged for merging into E3SM master.

Ah, interesting! I don't think I've used that feature before. Done.

Do you have a design document for these changes?

Yes and no. The original gustiness feature, developed over a few years and completed in 2019, seems to have a design document here. This PR is a collection of bug fixes for that older feature, not a new feature in itself.

Part of the problem is that the original implementation tried to handle gustiness entirely within the atmosphere, without modifying the land or sea ice components or sending any "new" information through the coupler. (I assume this was in part to reduce the complexity of the required code modifications.) This approach neglects some changes that really need to be made to the other components. Hence, fixing this feature requires reworking the design to add/change a few lines of code in the land and sea ice models, even though the original feature did not touch the ELM or CICE (or MPAS-Sea Ice) code directly.

quantheory commented 10 months ago

@proteanplanet @eclare108213 The changes to MPAS-Seaice are now implemented. (I also kept the CICE changes because CICE is still used for some tests, e.g. of EAM's single-column model.) This is the first time I've done anything with the MPAS-Seaice code, so please help me out if I've made any basic mistakes here. I have not added any outputs related to gustiness to the default output, but you may wish to do so. When EAM is not being used, I've tried to use a gustiness value of 0 m/s, which should result in the same answers as on the current master.

There is some kind of stability issue with this branch in at least one WCYCL1850NS test. I've filed issue #5884 about that, because it seems like this is a consequence of an issue on master unrelated to this tag. But that appears to be primarily an atmosphere-land interaction issue, so whatever we do to resolve that is unlikely to have a significant effect on the MPAS-Seaice changes.

quantheory commented 10 months ago

@bishtgautam @mabrunke @XubinZeng I have removed both atmosphere- and land-calculated gustiness (wc) from U10 in this revision. The removal of wc is the one change that I expect to affect answers without EAM running, but I believe it should only affect the U10 output of ELM.

rljacob commented 9 months ago

status: wcycl1850NS test is failing. threading non-BFB issues.

rljacob commented 8 months ago

status: unchanged from above. Also will need a B-case test.

mahf708 commented 8 months ago

This PR has nontrivial impact on the aerosol ERF (see table below in W/m2), though it is unclear to me yet what the exact pathway is. Nonetheless, the nontrivial impact isn't large enough to warrant fuller investigation.

Description TOT ARI ACI
baseline -0.736 +0.096 -0.926
PR 5850 -0.707 +0.101 -0.905
mahf708 commented 8 months ago

One additional note on this PR from my testing: it is likely exacerbating (doubling) the machine-dependency (likely optimization-dependency) issue we've been tracking. I still don't recommend any action/investigation. Just a note for completeness.

quantheory commented 8 months ago

@proteanplanet @eclare108213 Sea ice changes are now removed from this branch (except for 3 lines needed by the atmosphere for diagnostic purposes).

crterai commented 8 months ago

I have some results to share from the test simulations which were run after the removal of gustiness impacts on sea-ice. Here are the model vs model diagnostics from the F20TR simulation: https://portal.nersc.gov/cfs/e3sm/terai/E3SMv3_dev/e3sm_diags_Gustinessfix_vs_baseline_F20TR/viewer/ I have also run 30 years with the WCYCL1850 in trigrid format, to be compared with the v3alpha04 simulation: https://portal.nersc.gov/cfs/e3sm/terai/E3SMv3_dev/e3sm_diags_WCYCL1850_gustinessfix_vs_v3alpha04_231017/viewer/ MPAS diagnostics from the 30 years with same simulation are here: https://portal.nersc.gov/cfs/e3sm/terai/E3SMv3_dev/20231011.piControl.Gustinessfix.v3alpha04_trigrid.pm-cpu_intel/mpas_analysis/ts_0001-0030_climo_0001-0030/ocean/index.html

Summary

crterai commented 8 months ago

What's concerning with the coupled simulation is a crash that the coupled model encountered 38 years into the simulation with the following error:

/pscratch/sd/t/terai/E3SMv3_dev/20231011.piControl.Gustinessfix.v3alpha04_trigrid.pm-cpu_intel/run/e3sm.log.17098756.231016-200022  

 256: PIO: FATAL ERROR: Aborting... An error occured, Writing variables (number of variables = 181) to file (./20231011.piControl.Gustinessfix.v3alpha04_trigrid.pm-cpu_intel.elm.h0.0038-02.nc, ncid=469) using PIO_IOTYPE_PNETCDF iotype failed. Non blocking write for variable (SNOWDP, varid=157) failed (Number of subarray requests/regions=1, Size of data local to this process = 64800). NetCDF: Numeric conversion not representable (err=-60). Aborting since the error handler was set to PIO_INTERNAL_ERROR... (/pscratch/sd/t/terai/E3SM_code/quantheoryE3SM/externals/scorpio/src/clib/pio_darray_int.c: 395)

Before this, I also see the following error messages. I haven't seen them before, and don't know if these are to be expected:

1871: sl_advection: reconstruct_and_limit_dp (rank,ie) returned   1871      1-3.1438E-01
2584: sl_advection: reconstruct_and_limit_dp (rank,ie) returned   2584      1-1.6182E-01
1988: sl_advection: reconstruct_and_limit_dp (rank,ie) returned   1988      1-1.2616E+00
1871: sl_advection: reconstruct_and_limit_dp (rank,ie) returned   1871      1-4.9239E-01
2985: sl_advection: reconstruct_and_limit_dp (rank,ie) returned   2985      1-1.7805E-01
2986: sl_advection: reconstruct_and_limit_dp (rank,ie) returned   2986      1-6.9487E-01

@xuezhengllnl , @ambrad, @wlin7 do you have any thoughts on what might be going wrong and whether I should be concerned with the sl_advection warning lines? Run script is here: https://github.com/E3SM-Project/SimulationScripts/blob/master/archive/NGDAtmPhysIntegration/run.20231011.WCYCL1850.Gustinessfix.v3alpha04_wVBSfix.trigrid.pm-cpu.sh

ambrad commented 8 months ago

concerned with the sl_advection warning lines

Those lines are in general fine, although if you see a lot of them, that indicates instability in the model. Note that these lines are printed only in debug builds, specifically if NDEBUG Is not defined in the build. Am I correct that you're using a debug build?

XubinZeng commented 8 months ago

Hi Chris (Terai), I don't know what might be wrong. a couple of other comments: a) from your results, I see the slight decrease of aerosol emission from oceans (e.g., sea salt) - which is consistent with what we expect. this might have a small effect on aerosol radiative forcing.

b) if possible, could you compare the current simulation (without gustiness impact over sea ice) versus the 40-year coupled simulation you did before (including gustiness impact over sea ice). then share the plots with us. from this comparison, we may gain some insights. thanks, Xubin

crterai commented 8 months ago

Those lines are in general fine, although if you see a lot of them, that indicates instability in the model.

There are a good number of these errors and they appear decades earlier as well in the simulation. So far the mean-state diagnostics don't indicate any instabilities, but things might be occurring at shorter frequency.

Am I correct that you're using a debug build?

I didn't intend to use a debug build, and I don't see that I did a debug_compile, so I don't know why they are printing if they are only intended in debug builds.

crterai commented 8 months ago

Adding @beharrop as a reviewer, since he'd been involved in the original implementation of the gustiness parameterization.

ambrad commented 8 months ago

@crterai what code version are you using? I'm wondering if recent cmake changes have affected flags like -DNDEBUG. Thanks.

By the way, the debug output isn't an error but just a message about a limiter being active. Generally I expect the limiter to have to act about once over the sphere every few time steps, almost always in the top-most layer. But I do intend this message to appear only in debug builds; you shouldn't be seeing this message at all.

Edit: Just spoke with the CMake guru, who is making a bunch of changes to the build system, and indeed -DNDEBUG was accidentally removed from release builds in recent master versions. He's working on a fix now.

crterai commented 8 months ago

For the water cycle runs, I just used the PR branch, which built on top of 732e699695c67023880cb78937e21b5e66072879 of master, but it sounds like you've identified why it was printing those messages.

And thanks for the info about the sl_advection messages not being error messages. I'm writing a restart close the crash and hopefully learn a bit more from extra output/checking flags.

crterai commented 8 months ago

@XubinZeng - Thanks for your input. I think we have a slight increase in the emissions of sea-salt aerosols (please see table below). We do see a decrease in the dust emissions though, which is consistent with not double counting the gustiness over land. https://portal.nersc.gov/cfs/e3sm/terai/E3SMv3_dev/e3sm_diags_Gustinessfix_vs_baseline_F20TR/viewer/table/20231011.v3alpha04.F20TR.GustinessFix.v3alpha04_wSOAGfix.pm-cpu_intel-ANN-budget-table.html

As for the comparison with the earlier WCYCL1850 simulation, the base simulation that they are based off of are different (v3alpha02 vs v3alpha04), so I'm wary of comparing the two simulations and attributing any differences to the changes in whether the sea-ice gustiness was included or not.

XubinZeng commented 8 months ago

@crterai - thanks for the reply. agree on both points.

crterai commented 7 months ago

@proteanplanet - would you review the changes to the code changes that shoul now have removed the impact of the gustiness on the sea-ice?

crterai commented 7 months ago

@quantheory - I think we are close to finalizing this PR for merging. Would you remove the 'Draft' label to this PR?

crterai commented 7 months ago

Thanks for the reviews, @beharrop and @proteanplanet.

crterai commented 7 months ago

An update on the coupled simulation crash at the end of 0038-02 due to Non blocking write for variable (SNOWDP, varid=157) failed:

vanroekel commented 7 months ago

@crterai do you have more than 30 years of simulation? I thought climate changing simulations required 100 years? Through 30 years the ocean seems mostly unchanged however I noted that OHC anomalies seem to be stronger in the gustiness run so would be useful to see a longer run.

crterai commented 7 months ago

@vanroekel - we have up to 37 years of the water cycle simulation before it ran into a model crash while writing a diagnostic output variable (SNOWDP). I'm guessing 7 more years will not be enough for an assessment? I am working with the developers to try to get to the bottom of why it is crashing and will update here once we have a fix and longer (50+ year) coupled simulation. Do you notice any red flags from a coding perspective?

vanroekel commented 7 months ago

Thanks @crterai I did look through the code and the only modification to the ocean seems to be changes to the wind stress we receive in the ocean so not really directly affecting the component like as it did the sea ice. As for the length, 37 years is not much better than 30. I do think we should push for 100 as that is the code review policy, but will let @golaz chime in there. If Chris G is okay with a shorter testing simulation and I'm out on travel feel free to remove me as a reviewer and proceed.

rljacob commented 7 months ago

update: runtime fails have gone away. Should be ready for alpha05.