Making the old CAM diagnostics package work with NorESM2.5

oyvindseland commented 1 month ago

Issue Type

Other (please describe below)

Issue Description

Create workflow so that output from CAM in NorESM2.5 can be analyzed with the old diagnostics package. Current status: Given the changes below the diagnostics package works for a somewhat random selection of 2D plots. Working on experiment output on Betzy: /cluster/work/users/oyvinds/archive NF2000oslo_alpha_ne30pg3_ne30pg3_mtn14_20240430

Current workflow. Rename w to gw: ncrename -v w,gw EXP.cam.h0.000${yy}-${mm}.nc Add P0 ncks -A -v P0 FV09file.nc EXP.cam.h0.000${yy}-${mm}.nc FV09file.nc is a NorESM2 file, any output file will do. Problem. The ncks is memory demanding so any script loop stops after 1-3 files. For the experiment I have run the command manually.

The command /cluster/shared/noresm/diagnostics/noresm/bin/diag_srun -m cam -c NF2000oslo_alpha_ne30pg3_ne30pg3_mtn14_20240430 -i /cluster/work/users/oyvinds/archive -s 1 -e 2 creates the diagnostics output but only for a limited and partly random selection of 2D fields. /cluster/work/users/oyvinds/diagnostics/www/NF2000oslo_alpha_ne30pg3_ne30pg3_mtn14_20240430/CAM_DIAG/yrs1to2-obs

The likely reason for this is that also some of the 2D fields sets, e.g. set 5 also need 3D information, and this fails due to missing 3D information fatal:["Execute.c":6394]:variable (hyam) is not in file (inptr) This error likely ends the plotting of the remainder of the set.

The variables hyam and hybm exists in the original output files, but the averaged fields created by the script does not contain these values. For CAM6 / NorESM2 this variable exist in the files both before and after averaging. The related variables hyai and hybi is explicitly included in the averaging process /cluster/software/NCO/4.9.3-intel-2019b/bin/ncclimo .... gw,ilev,hyai,hybi and as a result is also included in the averaged file for both simulations. The hyam and hybm variable is not mentioned explicitly but are included in averaged files for NorESM2 output

Will this change answers?

No

Will you be implementing this yourself?

Yes

oyvindseland commented 1 month ago

I tried to experiment with adding hyam and hybm to the files after averaging but I was unable to find the script that decides if the averaging should be done or not. In an older version of the diagnostics there used to be a test checking if the files were already there or not. I also notice that the files are touched twice. First only creating grid information but lacking hyam and hybm and then just adding the main variables.

gold2718 commented 3 weeks ago

@oyvindseland, @MichaelSchulzMETNO, @YanchunHe, @mvertens

I have spent some time investigating why the SE history files do not have P0, hyam, or hybm.

Conclusion: I now believe they should not be on the history file and should not be used for diagnostic processing of SE history output.

Background: For many dynamical cores, including FV, the vertical coordinate is constructed from the hybrid coefficients (hyam and hybm) with P0 serving as the reference pressure. However, the spectral-element dycore (SE) uses a dry dynamical coordinate. The pressure-coordinate terms (hyam, hybm, and P0) do not describe this vertical coordinate.

The old diagnostic package does not work without these hybrid-coordinate terms because they are fed to the NCL function, pres_hybrid_ccm which uses them to do vertical interpolation. This interpolation will not be correct for SE output and will give incorrect results. This may be the source of the negative values seen in some NorESM2.5 prototype diagnostic code.

Note: When the old diagnostic package was first adapted to handle SE output, the SE dycore at that time had not yet switched to the dry vertical coordinate. See this JAMES article for more information on that switch.

Working with the old diagnostic package is quite difficult because the incorrect interpolation functionality is used in several files. I think we should re-open the discussion on bringing in a new package and working to make it meet our needs.

MichaelSchulzMETNO commented 3 weeks ago

Thanks for the explanation...My immediate reaction - but there must be a way to convert coordinates. How shall other people in the CMIP world work with this coordinate system?

gold2718 commented 3 weeks ago

Thanks for the explanation...My immediate reaction - but there must be a way to convert coordinates. How shall other people in the CMIP world work with this coordinate system?

Yes. I think the issue is that the current diagnostic scripts do not have any organized place where this could be done. This is because the underlying NCL library has a function that does the interpolation for the FV (and earlier) dycore's vertical coordinate. To do the appropriate interpolation within the current scripts would require refactoring that code to call a new routine that would know about which vertical coordinate was being used.

There are some alternatives, we need to discuss which we want to pursue.

MichaelSchulzMETNO commented 3 weeks ago

I was more thinking of some preprocessing of files, before they enter the diagnostics scripts. How wrong do the diagnostics get if the dry coordinate is assumed to be the wet coordinate? And is the bias possibly always nearly the same for different model versions?

YanchunHe commented 3 weeks ago

I see there is another NCL subroutine, pres_hybrid_ccm_se, for the spectral element coordinate. It also requires hyai and hybi. Is this relevant?

Adding another separate script to determine the grid/coordinate type should be straightforward (and save the grid type as a global environment variable so that other scripts can use).

gold2718 commented 3 weeks ago

I see there is another NCL subroutine, pres_hybrid_ccm_se, for the spectral element coordinate. It also requires hyai and hybi. Is this relevant?

That function dates to 2014, several years before the new SE-dycore vertical coordinate was created. Looking at the code (in contributed.ncl), it is just a wrapper for pres_hybrid_ccm after reading the data using ncol instead of lon/lat. Thus, it has the same flaw as pres_hybrid_ccm.

YanchunHe commented 3 weeks ago

That function dates to 2014, several years before the new SE-dycore vertical coordinate was created. Looking at the code (in contributed.ncl), it is just a wrapper for pres_hybrid_ccm after reading the data using ncol instead of lon/lat. Thus, it has the same flaw as pres_hybrid_ccm.

I see, thanks for the explanation!

gold2718 commented 3 weeks ago

@adagj, @matsbn

oyvindseland commented 3 weeks ago

When I test the old diagnostics script with the added p0 and gw I get a limited amount of output, mostly 2D. For some reason the Taylor diagram works despite using U300. A possible explanation may be that this script use the wind at the grid edges instead of grid center. If so the (FV) vertical definitions are retained.

Example on nird: datalake/NS2345K/oyvinds/NorESM2.5diag/diag (tar file) (same as above)

I have also used a different ncl tool for the aerosol diagnostics, and model vs model intercomparison. NorESM/components/cam/tools/diagnostics/ncl/ModIvsModII

This works with the adjustments defined above but as Steve wrote the vertical plots are are not correctly interpolated. I still think they can be used for a first glance. The 2D plots should be fine at any rate. /datalake/NS2345K/oyvinds/NorESM2.5diag/aerosolandclouds

Main file ModIvsModII.htm

This specific set compares NF2000oslo_alpha_ne30pg3_ne30pg3_mtn14_20240430 year 1-5 with NF1850norbc_aer2014_f19_f19_rel206_20231103/yr1-30/ so 2014 aerosols and release 2.0.6 NorESM2-LM

gold2718 commented 3 weeks ago

I was more thinking of some preprocessing of files, before they enter the diagnostics scripts.

The issue is that the data in the history files is defined on the model pressure levels, that is, the pressure field which is also on the history file (PMID). These levels are not the same as the dycore vertical grid. Currently, the diagnostics ignores the pressure field and assumes the simple Lagrangian hybrid vertical coordinate. I do not know how to recompute all the history fields to be on that grid -- in the model, the dycore does that.

How wrong do the diagnostics get if the dry coordinate is assumed to be the wet coordinate? And is the bias possibly always nearly the same for different model versions?

I do not have any idea how I would estimate that. I am also not sure what "different model versions" is referring to.

oyvindseland commented 3 weeks ago

But pmid varies on a model level. So there has to be some equation if you want to find the values at a certain pressure? If it is not the standard hybrid coordinate equation what is it?

gold2718 commented 3 weeks ago

But pmid varies on a model level. So there has to be some equation if you want to find the values at a certain pressure? If it is not the standard hybrid coordinate equation what is it?

It is not a single equation. The model uses a few different algorithms to interpolate a field to a non-model pressure level:

Linear interpolation: This is the most common approach -- simply do a linear interpolation (in pressure) of the field between the field values above and below the desired output level. This is used in the radiation code as well as most of cam_diagnostics.F90.
Log interpolation: This is a linear interpolation of the field using the log of the pressure above and below the desired output level. This approach is used for the geopotential diagnostic fields in cam_diagnostics.F90.
Extrapolation below lowest model level: CAM uses a couple of ECMWF algorithms to extrapolate below the lowest model level (different formulations are used for extrapolation of geopotential and temperature fields).
Above model top: CAM does not extrapolate above the model top, it merely returns the field value for the topmost model layer.

oyvindseland commented 3 weeks ago

Thank you. For me the most important is to be able to compare CAM6 and CAM7 (NorESM2 and NorESM2.5) on the same vertical grid. If it is pressure, terrain following coordinates or height is less important.

oyvindseland commented 2 weeks ago

One question about the vertical coordinate. I presume that the internal calculation of column burdens in CAM, e.g. total water in the column is calculated on the native SE vertical grid, and not from FV? Radiation and precipitation inside the model is calculated from a loop over the tendencies in the vertical so TOM or surface does not need height information directly, so these parameters will not be impacted.

gold2718 commented 2 weeks ago

I'm not sure I understand the question. Are you talking about diagnostics such as TGCLDCWP? Those are just sums of the values at all vertical levels. Each level represents the amount in a "pressure level" where the value in the PMID variable is the pressure in the middle of that level. Nothing in the physics is computed on the SE native grid.

oyvindseland commented 2 weeks ago

Yes TGCLDLWP is one example. To get a vertically integrated value you also need the pressure (or height) at the level boundaries not only mid point?

gold2718 commented 2 weeks ago

Yes TGCLDLWP is one example. To get a vertically integrated value you also need the pressure (or height) at the level boundaries not only mid point?

No, just add the amounts, e.g. (from cloud_diagnostics.F90):

    do k=1,pver
       tgicewp(:ncol)  = tgicewp(:ncol) + gicewp(:ncol,k)
       tgliqwp(:ncol)  = tgliqwp(:ncol) + gliqwp(:ncol,k)
    end do

Perhaps the confusion is how to compute the amounts to be summed.:

amount(i,k) = mixing_ratio(i,k) * state%pdel(i,k) / g

where state%pdel is the 'thickness' of the level in Pa and g is the gravitational acceleration. This gives the amount in kg m-2.

oyvindseland commented 2 weeks ago

OK. So state%pdel is a variable.

Then it works fine

NorESMhub / CAM