COSIMA / access-om2

ACCESS-OM2 global ocean - sea ice coupled model configurations.
Support ERA5 forcing #242

Open aekiss opened 3 years ago

aekiss commented 3 years ago

This issue is a continuation of an email discussion on making configurations that support ERA5 forcing, e.g. to assess the impact of the forcing dataset on the sea ice simulation.

ERA5 is available on NCI at /g/data/rt52: https://opus.nci.org.au/display/ERA5/ERA5+Community+Home ~[edit: use /g/data/ik11/inputs/ERA5 instead]~

Replacing JRA55-do with ERA5 would require

and changes to

and possibly more, e.g.

Our configurations currently support only JRA55-do forcing (https://github.com/COSIMA/access-om2/tree/master/control), but we have some old, unsupported CORE configs that may be useful as a reference for the changes required: https://github.com/COSIMA/1deg_core_nyf, https://github.com/COSIMA/025deg_core2_nyf and https://github.com/COSIMA/025deg_core_nyf

aekiss commented 2 years ago

Assuming ERA5 is closer to obs, I think these JRA55-do SAT biases go a long way toward explaining the pattern of SIC biases in ACCESS-OM2. [edit: ERA5 is biased warm relative to obs in the Weddell Sea, so the JRA55-do warm bias there is worse than we thought]

aekiss commented 2 years ago

Edit: /g/data/ik11/inputs/ERA5 is no longer necessary

I've put a symlinked copy of ERA5 in /g/data/ik11/inputs/ERA5.

This links to files from /g/data/rt52/era5/single-levels/reanalysis/ unless they are badly chunked, in which case it links to the rechunked files in /g/data/uc0/era5_tmp/single-levels/reanalysis.

So we won't hit any badly-chunked files if we use /g/data/ik11/inputs/ERA5 for the atmosphere input paths in config.yaml.

You must be a member of group rt52 as well as ik11 to access these.

aekiss commented 2 years ago

ERA5 has excessive shortwave radiation south of the polar front, apparently due to poor representation of clouds - see Marc Mallet's talk at ICSHMO. It would be interesting to see if JRA55-do is also biased.

aekiss commented 2 years ago

The badly-chunked files are being fixed by NCI: https://opus.nci.org.au/display/ERA5/Known+Issues

aekiss commented 2 years ago

The chunking issue has now been resolved by NCI (https://opus.nci.org.au/display/ERA5/Known+Issues), so there's no need to use /g/data/ik11/inputs/ERA5

aidanheerdegen commented 2 years ago

This is why it is bad practice to push directly to master. Should be done through a PR to check it passes ok.

aekiss commented 2 years ago

I've put a 1deg ERA5 repo based on Nic's config here https://github.com/COSIMA/1deg_era5_iaf Testing is underway - expect many rough edges.

rmholmes commented 2 years ago

Thanks @aekiss. I've got it running after fixing two issues:

  1. I don't think we can run with a time step larger than 1 hour / 3600 s, since that's the time resolution of ERA-5.
  2. I couldn't run from the beginning of 1979-01-01 as it threw an error saying forcing couldn't be found, so I'm starting from 1980-01-01.

Here's hoping the output looks ok.

rmholmes commented 2 years ago

Next issue: 1980 is a leap year and it errors looking for *_19800201-19800228 ERA-5 atmosphere files rather than the existing *_19800201-19800229 files.

I note that the JRA-55 files are all grouped in years so this won't have come up. This will probably need some changes to libaccessom2 to deal with end_day properly in atmosphere/forcing.json.

For now I'm continuing with a 1981-1983 run.
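For reference, a minimal Python sketch of a calendar-aware way to build the monthly ERA-5 file names. It only illustrates the leap-year end-day logic; it is not how libaccessom2 or forcing.json resolve file names, and the helper name `era5_monthly_filename` is hypothetical. The file name pattern is copied from the NCI paths used in this thread.

```python
import calendar


def era5_monthly_filename(var: str, year: int, month: int) -> str:
    """Build an ERA5-style monthly file name with a calendar-aware end day.

    Hypothetical helper for illustration only.
    """
    # monthrange returns (weekday of the 1st, number of days in the month),
    # so February 1980 gives 29 and February 1981 gives 28.
    last_day = calendar.monthrange(year, month)[1]
    start = f"{year:04d}{month:02d}01"
    end = f"{year:04d}{month:02d}{last_day:02d}"
    return f"{var}_era5_oper_sfc_{start}-{end}.nc"


print(era5_monthly_filename("10u", 1980, 2))
print(era5_monthly_filename("10u", 1981, 2))
```

The first call returns the `*_19800201-19800229` name that exists on disk, rather than the `*_19800201-19800228` name the model was looking for.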

aekiss commented 2 years ago

Just a cross-reference for our future selves: We've discovered a problem with the remapping files. This is being addressed here: https://github.com/COSIMA/esmgrids/issues/4#issuecomment-1094521709 and via PR https://github.com/COSIMA/access-om2/pull/258

aekiss commented 2 years ago

Some ERA5 files are missing data for hours 0-6 on 1 Jan 1979, so we need to start the model runs after that, and 1 Jan 1980 is probably neatest. NCI tell me this is an upstream problem, present in the data on ECMWF’s downloading page https://cds.climate.copernicus.eu/cdsapp#!/dataset/reanalysis-era5-single-levels?tab=form

The affected files are:

/g/data/rt52/era5/single-levels/reanalysis/10fg/1979/10fg_era5_oper_sfc_19790101-19790131.nc
/g/data/rt52/era5/single-levels/reanalysis/cbh/1979/cbh_era5_oper_sfc_19790101-19790131.nc
/g/data/rt52/era5/single-levels/reanalysis/cin/1979/cin_era5_oper_sfc_19790101-19790131.nc
/g/data/rt52/era5/single-levels/reanalysis/crr/1979/crr_era5_oper_sfc_19790101-19790131.nc
/g/data/rt52/era5/single-levels/reanalysis/csfr/1979/csfr_era5_oper_sfc_19790101-19790131.nc
/g/data/rt52/era5/single-levels/reanalysis/dctb/1979/dctb_era5_oper_sfc_19790101-19790131.nc
/g/data/rt52/era5/single-levels/reanalysis/deg0l/1979/deg0l_era5_oper_sfc_19790101-19790131.nc
/g/data/rt52/era5/single-levels/reanalysis/dndza/1979/dndza_era5_oper_sfc_19790101-19790131.nc
/g/data/rt52/era5/single-levels/reanalysis/dndzn/1979/dndzn_era5_oper_sfc_19790101-19790131.nc
/g/data/rt52/era5/single-levels/reanalysis/i10fg/1979/i10fg_era5_oper_sfc_19790101-19790131.nc
/g/data/rt52/era5/single-levels/reanalysis/ilspf/1979/ilspf_era5_oper_sfc_19790101-19790131.nc
/g/data/rt52/era5/single-levels/reanalysis/kx/1979/kx_era5_oper_sfc_19790101-19790131.nc
/g/data/rt52/era5/single-levels/reanalysis/lsrr/1979/lsrr_era5_oper_sfc_19790101-19790131.nc
/g/data/rt52/era5/single-levels/reanalysis/lssfr/1979/lssfr_era5_oper_sfc_19790101-19790131.nc
/g/data/rt52/era5/single-levels/reanalysis/mbld/1979/mbld_era5_oper_sfc_19790101-19790131.nc
/g/data/rt52/era5/single-levels/reanalysis/mcpr/1979/mcpr_era5_oper_sfc_19790101-19790131.nc
/g/data/rt52/era5/single-levels/reanalysis/mcsr/1979/mcsr_era5_oper_sfc_19790101-19790131.nc
/g/data/rt52/era5/single-levels/reanalysis/megwss/1979/megwss_era5_oper_sfc_19790101-19790131.nc
/g/data/rt52/era5/single-levels/reanalysis/mer/1979/mer_era5_oper_sfc_19790101-19790131.nc
/g/data/rt52/era5/single-levels/reanalysis/metss/1979/metss_era5_oper_sfc_19790101-19790131.nc
/g/data/rt52/era5/single-levels/reanalysis/mgwd/1979/mgwd_era5_oper_sfc_19790101-19790131.nc
/g/data/rt52/era5/single-levels/reanalysis/mlspf/1979/mlspf_era5_oper_sfc_19790101-19790131.nc
/g/data/rt52/era5/single-levels/reanalysis/mlspr/1979/mlspr_era5_oper_sfc_19790101-19790131.nc
/g/data/rt52/era5/single-levels/reanalysis/mlssr/1979/mlssr_era5_oper_sfc_19790101-19790131.nc
/g/data/rt52/era5/single-levels/reanalysis/mn2t/1979/mn2t_era5_oper_sfc_19790101-19790131.nc
/g/data/rt52/era5/single-levels/reanalysis/mngwss/1979/mngwss_era5_oper_sfc_19790101-19790131.nc
/g/data/rt52/era5/single-levels/reanalysis/mntpr/1979/mntpr_era5_oper_sfc_19790101-19790131.nc
/g/data/rt52/era5/single-levels/reanalysis/mntss/1979/mntss_era5_oper_sfc_19790101-19790131.nc
/g/data/rt52/era5/single-levels/reanalysis/mper/1979/mper_era5_oper_sfc_19790101-19790131.nc
/g/data/rt52/era5/single-levels/reanalysis/mror/1979/mror_era5_oper_sfc_19790101-19790131.nc
/g/data/rt52/era5/single-levels/reanalysis/msdrswrf/1979/msdrswrf_era5_oper_sfc_19790101-19790131.nc
/g/data/rt52/era5/single-levels/reanalysis/msdrswrfcs/1979/msdrswrfcs_era5_oper_sfc_19790101-19790131.nc
/g/data/rt52/era5/single-levels/reanalysis/msdwlwrf/1979/msdwlwrf_era5_oper_sfc_19790101-19790131.nc
/g/data/rt52/era5/single-levels/reanalysis/msdwlwrfcs/1979/msdwlwrfcs_era5_oper_sfc_19790101-19790131.nc
/g/data/rt52/era5/single-levels/reanalysis/msdwswrf/1979/msdwswrf_era5_oper_sfc_19790101-19790131.nc
/g/data/rt52/era5/single-levels/reanalysis/msdwswrfcs/1979/msdwswrfcs_era5_oper_sfc_19790101-19790131.nc
/g/data/rt52/era5/single-levels/reanalysis/msdwuvrf/1979/msdwuvrf_era5_oper_sfc_19790101-19790131.nc
/g/data/rt52/era5/single-levels/reanalysis/mser/1979/mser_era5_oper_sfc_19790101-19790131.nc
/g/data/rt52/era5/single-levels/reanalysis/mslhf/1979/mslhf_era5_oper_sfc_19790101-19790131.nc
/g/data/rt52/era5/single-levels/reanalysis/msmr/1979/msmr_era5_oper_sfc_19790101-19790131.nc
/g/data/rt52/era5/single-levels/reanalysis/msnlwrf/1979/msnlwrf_era5_oper_sfc_19790101-19790131.nc
/g/data/rt52/era5/single-levels/reanalysis/msnlwrfcs/1979/msnlwrfcs_era5_oper_sfc_19790101-19790131.nc
/g/data/rt52/era5/single-levels/reanalysis/msnswrf/1979/msnswrf_era5_oper_sfc_19790101-19790131.nc
/g/data/rt52/era5/single-levels/reanalysis/msnswrfcs/1979/msnswrfcs_era5_oper_sfc_19790101-19790131.nc
/g/data/rt52/era5/single-levels/reanalysis/msr/1979/msr_era5_oper_sfc_19790101-19790131.nc
/g/data/rt52/era5/single-levels/reanalysis/msror/1979/msror_era5_oper_sfc_19790101-19790131.nc
/g/data/rt52/era5/single-levels/reanalysis/msshf/1979/msshf_era5_oper_sfc_19790101-19790131.nc
/g/data/rt52/era5/single-levels/reanalysis/mssror/1979/mssror_era5_oper_sfc_19790101-19790131.nc
/g/data/rt52/era5/single-levels/reanalysis/mtdwswrf/1979/mtdwswrf_era5_oper_sfc_19790101-19790131.nc
/g/data/rt52/era5/single-levels/reanalysis/mtnlwrf/1979/mtnlwrf_era5_oper_sfc_19790101-19790131.nc
/g/data/rt52/era5/single-levels/reanalysis/mtnlwrfcs/1979/mtnlwrfcs_era5_oper_sfc_19790101-19790131.nc
/g/data/rt52/era5/single-levels/reanalysis/mtnswrf/1979/mtnswrf_era5_oper_sfc_19790101-19790131.nc
/g/data/rt52/era5/single-levels/reanalysis/mtnswrfcs/1979/mtnswrfcs_era5_oper_sfc_19790101-19790131.nc
/g/data/rt52/era5/single-levels/reanalysis/mtpr/1979/mtpr_era5_oper_sfc_19790101-19790131.nc
/g/data/rt52/era5/single-levels/reanalysis/mvimd/1979/mvimd_era5_oper_sfc_19790101-19790131.nc
/g/data/rt52/era5/single-levels/reanalysis/mx2t/1979/mx2t_era5_oper_sfc_19790101-19790131.nc
/g/data/rt52/era5/single-levels/reanalysis/mxtpr/1979/mxtpr_era5_oper_sfc_19790101-19790131.nc
/g/data/rt52/era5/single-levels/reanalysis/ptype/1979/ptype_era5_oper_sfc_19790101-19790131.nc
/g/data/rt52/era5/single-levels/reanalysis/tcslw/1979/tcslw_era5_oper_sfc_19790101-19790131.nc
/g/data/rt52/era5/single-levels/reanalysis/totalx/1979/totalx_era5_oper_sfc_19790101-19790131.nc
/g/data/rt52/era5/single-levels/reanalysis/tplb/1979/tplb_era5_oper_sfc_19790101-19790131.nc
/g/data/rt52/era5/single-levels/reanalysis/tplt/1979/tplt_era5_oper_sfc_19790101-19790131.nc
/g/data/rt52/era5/single-levels/reanalysis/zust/1979/zust_era5_oper_sfc_19790101-19790131.nc

rmholmes commented 2 years ago

As mentioned in https://github.com/COSIMA/esmgrids/issues/4 the remapping weights for the ERA-5 files are now fixed. My first test run with these fixed remapping weights ran from 1981-01 to 1982-10-31 and then crashed with temperatures out of range. Some initial analysis compared to an equivalent JRA-55 run is located here. It is looking reasonable, except that in regions of strong river run-off the SSS is getting way too salty (it's saturated below, I'm getting differences up to 10 g/kg): Capture

So there may be a problem with the run-off included from JRA-55? I haven't had time to investigate yet.

These locations don't seem to correspond to the temperature out of range locations - so there may be multiple issues. Dropping the time step from 3600s to 1800s doesn't seem to help.

I also made a start on generating RYF ERA-5 files (as I think this could be easier for testing) here

I am off on leave for 2 weeks but will revisit this when I get back.

rmholmes commented 2 years ago

UPDATE: I've successfully generated some RYF9091 ERA-5 files using this repo. They are located at /g/data/e14/rmh561/access-om2/input/ERA-5/RYF. If anyone wants to test them while I'm away please go ahead.

rmholmes commented 2 years ago

I have now successfully run 6 years of an ERA-5 RYF9091 simulation. See /home/561/rmh561/access-om2/1deg_era5_ryf/ (and equivalent jra55 at /home/561/rmh561/access-om2/1deg_jra55_ryf/).

It runs very slowly (about 10 times slower than JRA-55 and about 2 times slower than the ERA-5 IAF run). Perhaps this is because my new RYF files are a single file for the whole year, instead of separate files for each month, and the chunking is bad? The slow-down is definitely associated with reading in the atmospheric data (looking at work/atmosphere/log/matamm... during a run). But I have not investigated any further.

The runoff is still the main obvious problem with the solution. Comparing mror from ERA-5 (R below) to friver from JRA-55 (L below), which both supposedly have the same units: Runoff_Comparison

So I don't think mror is the right ERA-5 variable to use. The magnitude is also way too small, explaining my salinity differences shown above.

So either we look for another ERA-5 variable that is appropriate (one might not exist?), or we use JRA-55 runoff. Thoughts @aekiss?

rmholmes commented 2 years ago

Notes from TWG meeting discussion 11/05/2022:

  1. Yes, we will revert to using the JRA-55 friver run-off instead of mror.
  2. The chunking on my RYF files is very bad for time. I will look into altering this chunking so that the time chunks are equal to or smaller than the cache size.

rmholmes commented 2 years ago

@aidanheerdegen rewrote the make_ryf script to work efficiently with ERA-5 by improving the chunking and also removing compression (see https://github.com/aidanheerdegen/make_ryf/tree/ERA5). I now have the following 1 year run-times:

So ERA-5 is slower but this probably won't persist through to higher resolution.
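To illustrate why the time chunking matters, here is a rough Python sketch of read amplification. It is not taken from make_ryf or libaccessom2. It assumes that "cache size" means the number of consecutive time records the reader fetches in one go, that each chunk spans the full horizontal field, and that reads start on a chunk boundary. The chunk lengths in the loop are hypothetical, not the actual values in any of the files.

```python
import math


def read_amplification(time_chunk: int, records_per_read: int) -> float:
    """Ratio of time records pulled from disk to records actually requested.

    Assumes a chunk-aligned read of `records_per_read` consecutive records
    from a variable whose chunks each hold `time_chunk` full-field records.
    """
    chunks_touched = math.ceil(records_per_read / time_chunk)
    return chunks_touched * time_chunk / records_per_read


# Hypothetical example: the reader fetches 24 hourly records per read.
for time_chunk in (1, 8, 24, 744):
    factor = read_amplification(time_chunk, 24)
    print(f"time chunk = {time_chunk:4d} records -> amplification x{factor:g}")
```

Under these assumptions, time chunks no longer than the read size (and dividing it evenly) give no wasted reads, whereas a month-long chunk of hourly data forces about 31 times more data off disk than is used per read.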

I've run ERA-5 RYF for 10 years and it seems to be stable. The IAF starting in 1980 crashes on 1982-11-01 with temperature out of range. This is the same date as my older run. Perhaps there is a storm on this date causing an issue?

Reverting to JRA-55 friver from mror removes the large salinity biases as expected. Both runs are now looking reasonable to me with a cursory look. See https://github.com/rmholmes/cosima-scripts/blob/master/ERA-5/ERA-5_Initial_Analysis-RYF.ipynb and https://github.com/rmholmes/cosima-scripts/blob/master/ERA-5/ERA-5_Initial_Analysis-IAF.ipynb. E.g. global average temp is pretty stable: Capture

Circulation metrics are remarkably (almost suspiciously...) consistent. E.g. Drake passage: Capture

SST and SSS look ok: Capture

ERA-5 is warmer than JRA-55 in summer in the polar regions. There are some sea-ice differences as well (see bottom SI area and volume plots in scripts) - which someone with more experience in that area should look into.

For those with interest, the run outputs are in /g/data/e14/rmh561/access-om2/archive/, (1deg_jra55_ryf, 1deg_jra55_iaf, 1deg_era5_ryf, 1deg_era5_iaf), with a database at /g/data/e14/rmh561/access-om2/archive/databases/cc_database_era5.

Since the runs are looking pretty good, I'm going to run another 90 years of the jra55_ryf and era5_ryf to see how they go. Otherwise, I'll leave this alone for a week or so, apart from the following note on the 1/4-degree.

rmholmes commented 2 years ago

I tried putting together a 025deg_era5_ryf configuration (https://github.com/rmholmes/025deg_jra55_ryf/tree/ERA5). However, I ran into an OASIS error with the remapping files. I think this could be because Nic generated a new rmp_jrar_to_cict_CONSERV.nc for the 1-degree (in /g/data/ik11/inputs/access-om2/input_20210915/common_1deg_era5/), but not for the 1/4-degree. Since we're using JRA-55 run-off I assumed I could just use the older JRA-55 rmp_jrar_to_cict_CONSERV.nc, but that doesn't appear to be the case, as I get the error:

MCT::m_SparseMatrixPlus:: FATAL--length of vector y different from row count of sMat.Length of y = 108000 Number of rows in sMat = 1555200

(Unless this error is coming from my new ERA-5 forcing remapping file /g/data/e14/rmh561/access-om2/input/ERA-5/remap/prod/ERA5_MOM025_patch.nc, but I don't see why, since I used the same process to generate it as for the 1-degree.)
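As a quick sanity check on the two sizes in that error message, the sketch below matches them against horizontal grid sizes. The grid shapes in `KNOWN_GRIDS` are quoted from memory and should be treated as assumptions to verify against the actual grid files; `match_grid` is a hypothetical helper.

```python
# Horizontal grid shapes (nx, ny). These are assumptions quoted from memory,
# not read from any file: check them against the real grid definitions.
KNOWN_GRIDS = {
    "ACCESS-OM2 1deg ocean/ice": (360, 300),
    "ACCESS-OM2 0.25deg ocean/ice": (1440, 1080),
    "ERA5 0.25deg atmosphere": (1440, 721),
}


def match_grid(n_points: int) -> list:
    """Return the names of the known grids whose point count equals n_points."""
    return [name for name, (nx, ny) in KNOWN_GRIDS.items() if nx * ny == n_points]


print("length of y    :", match_grid(108000))
print("rows in sMat   :", match_grid(1555200))
```

If those shapes are right, 108000 is the 1-degree ocean/ice grid and 1555200 is the 1/4-degree one, which would suggest that a 1-degree-sized field or file is being combined with 1/4-degree remapping weights somewhere in the configuration. That is only one reading of the message and has not been checked against the config.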

StephenGriffies commented 2 years ago

Sorry for jumping in late on this very long thread.

When running with JRA55-do, we generally use the bulk formula from NCAR, as per the Tsujino et al papers. What about when running with ERA? I believe they require different bulk formula from ECMWF. Also, I believe the NCAR and ERA bulk formulas expect winds at different height. Have all these points been handled and documented?

Later in 2022 we hope to be comparing ERA and JRA, so any summary documentation of the technical points would be great.

rmholmes commented 2 years ago

@StephenGriffies I believe we are using the same Large and Yeager (2004) bulk formula for both at this point. Winds are at 10m for both. For these initial tests we have also not shifted the ERA-5 temp from 2m to 10m (see @aekiss's notes above on this). I'm just doing some initial tests to make sure we can get things running and that the results don't look too crazy. These other issues are still to be addressed.

ofa001 commented 2 years ago

I have a student (more in the atmosphere-ice field) who has some interesting results with respect to JRA-55 vs ERA5 fluxes, perhaps similar to the plots that aekiss posted on 4 Feb on this thread. I have suggested that at some point he should present his results to COSIMA, and based on Ryan's preliminary comments it could be relevant for interpreting the differences in the runs. He is busy writing up at the moment, but I will see if we can get him to give a talk in the not too distant future.

rmholmes commented 2 years ago

I am out of time to work on this further, and will not get back to it until November.

For reference, my latest configurations are on github at https://github.com/rmholmes/1deg_era5_ryf, https://github.com/rmholmes/025deg_era5_ryf and https://github.com/rmholmes/1deg_era5_iaf. Output/restarts are in /g/data/e14/rmh561/access-om2/archive/ with same folder names (and equivalent jra55 runs). Latest analysis is at https://github.com/rmholmes/cosima-scripts/tree/master/ERA-5. Some notes on progress are at https://github.com/rmholmes/1deg_era5_iaf/blob/master/Notes.org

The outstanding issues:

aekiss commented 2 years ago

Further to my comment above, I've just learned from @StephenGriffies that the ERA40-based DRAKKAR Forcing Set (DFS) versions 3 and 4 (Brodeau et al. 2010) have been superseded by DFS5, which is based on ERA-Interim: https://www.drakkar-ocean.eu/publications/reports/report_DFS5v3_April2016.pdf. Not sure if there are any efforts to do the same with ERA5. [Edit: plans are underway for an ERA5-based model forcing dataset]

aekiss commented 2 years ago

ERA5 support in libaccessom2 has been moved out of master into the 242-era5-support branch until this issue is fixed https://github.com/COSIMA/libaccessom2/issues/75

rmholmes commented 1 year ago

I am working again on the ERA-5 configs. First task is to try to get the IAF simulations working.

My 1deg_era5_iaf config starting in 1980 crashes within the first time step of November 1982 due to temperature out of range. Every time-step tracer diagnostics shows that the large temperatures develop immediately as the forcing swaps from October to November (ERA-5 has monthly forcing files). Runs I did earlier this year starting from other years also always crashed on the first day of a month. So it seems that the problem occurs when the forcing swaps from one file to the next. Furthermore, there is a warning immediately before the crash of large zonal wind stresses (-11 N m⁻²).

I'm wondering whether the crash could have something to do with the netcdf packing? In the ERA-5 files the add_offset and scale_factor values differ each month. In particular, for the 1982 10m wind velocity I have monthly add_offset values of:

u10:add_offset = -1.10877391482281 ;
u10:add_offset = -1.1215841955099 ;
u10:add_offset = 0.0125949346145859 ;
u10:add_offset = -1.53913651675738 ;
u10:add_offset = -2.75968769557738 ;
u10:add_offset = -2.13865433273629 ;
u10:add_offset = -1.76729693885812 ;
u10:add_offset = -3.29618059715136 ;
u10:add_offset = -16.5649987171096 ;
u10:add_offset = -57.4015155174217 ;
u10:add_offset = -2.02439977535769 ;
u10:add_offset = 0.954945670431101 ;

The jump from October to November 1982 is particularly large. Ncviewing the files shows this jump (for a random point in the Pacific): ncview_output_u10

I presume ncview just takes add_offset and scale_factor from the first month, so is obviously wrong. Xarray on the other hand does this unpacking fine. I had assumed that ACCESS-OM2 does too, but the largeness of the jump going from October to November corresponding exactly with when I get the crash is suspicious.

It may also be possible that the add_offset factor changes are instead a symptom of something strange in the wind stress data (but I can't see anything weird in ncview).

Anyone have any thoughts? Is it possible that there could be an issue in libaccessom2 with the netcdf unpacking? Or am I barking up the wrong tree? The fact that the crashes always happen on the first time step of the month is mightily suspicious to me.
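For anyone following along, netCDF packing stores each value as an integer and recovers it as `unpacked = packed * scale_factor + add_offset`. The sketch below shows what would happen if a value packed with one month's parameters were unpacked with the previous month's offset. The two offsets are the October and November 1982 `u10:add_offset` values listed above (assuming the list runs January to December); the scale factor and the 5 m/s wind are hypothetical, and this is only an illustration of the arithmetic, not a diagnosis of what libaccessom2 does.

```python
def pack(value: float, scale_factor: float, add_offset: float) -> int:
    """Pack a physical value into the stored integer (netCDF convention)."""
    return round((value - add_offset) / scale_factor)


def unpack(packed: int, scale_factor: float, add_offset: float) -> float:
    """Recover the physical value from the stored integer."""
    return packed * scale_factor + add_offset


OCT_OFFSET = -57.4015155174217   # u10 add_offset, October 1982 (from the list above)
NOV_OFFSET = -2.02439977535769   # u10 add_offset, November 1982 (from the list above)
SCALE = 0.003                    # hypothetical scale_factor, for illustration only

true_u10 = 5.0                   # hypothetical 10 m zonal wind in m/s
stored = pack(true_u10, SCALE, NOV_OFFSET)

good = unpack(stored, SCALE, NOV_OFFSET)   # correct: November parameters
stale = unpack(stored, SCALE, OCT_OFFSET)  # wrong: October offset reused

print(f"correct unpacking : {good:8.3f} m/s")
print(f"stale offset      : {stale:8.3f} m/s")
print(f"error             : {stale - good:8.3f} m/s")
```

With a stale offset the error is simply the difference between the two offsets, about -55 m/s here, regardless of the true wind. A spurious strongly negative zonal wind of that size would be consistent in sign with the large negative zonal wind stress warning, although that alone does not establish the cause.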

AndyHoggANU commented 1 year ago

I think it’s worth testing. You could do a simple test by reformatting the offending files?

aekiss commented 1 year ago

Hm, intriguing... Have you looked at daily surface stress output from the model to see if this also has these jumps?

~There's no mention of scale_factor or add_offset in the libaccessom2 code, which makes me suspect they're handled internally by the netCDF library, so presumably done correctly?~ Correction: they are in libutil/src/util.F90 on the 242-era5-support branch. If scale_factor or add_offset weren't handled at all we'd have much bigger problems, as the underlying raw data is 16-bit integer.

If the stress also shows these jumps I'll dig into the code a bit deeper to see whether it's only reading the scale_factor and add_offset metadata once.

rmholmes commented 1 year ago

I haven't output daily surface stress (and as the run crashes I can't look at the output for the current run). However, I did find some similar behaviour earlier in the run. For example, there's a big jump in daily minimum SST going from March to April 1980 (the model kept going through this):

MinTemp

Here, similarly, we have a big change in add_offset and scale_factor in, say, 10m winds (which ncview shows but xarray deals with fine):

10u_MarApr1980

I'm wondering whether this is only an issue for the time steps in between the last forcing time in March 1980 and the first forcing time in April 1980 (you can see the recovery in the min-daily-SST above over 5 days or so). The model has to interpolate the forcing in between these times - is it possible it's using the wrong unpacking values during this step? But for the 1-degree model the forcing time step and model time step are both one hour - so no interpolation should be needed.

Reformatting the files as a test sounds like a good idea, but I'll have to do this for every file for one particular month (e.g. April 1980).

rmholmes commented 1 year ago

It's actually a pretty massive perturbation, I'm surprised the model survives it. See global KE and SST scalars:

totalKE_and_global_surface_temp

rmholmes commented 1 year ago

Final plot: I do think this is a burst of super strong zonal wind. Comparing daily average surface zonal velocity for the last day of March and first day of April:

usurf

The resulting perturbations generate a bunch of waves that radiate away and dissipate over the next 5 days or so.

aekiss commented 1 year ago

OMG! it's amazing that it runs at all

rmholmes commented 1 year ago

Yep, here's daily min zonal wind stress from the model output, showing a global burst of almost -5 N m⁻² wind stress, just on the 1st of April (days either side are fine): Daily_min_tau_x

Given I've narrowed it down to 10u on the 1st of April, I'll have a go at changing the netcdf packing in the April 1980 10u file.

aekiss commented 1 year ago

so are 10v and all the other inputs unaffected?

rmholmes commented 1 year ago

I think I can confirm that the issue is the netcdf packing. I ran exactly the same 4-month simulation (Jan-May 1980) where I replaced the single 10u ERA-5 forcing file for March 1980 (the one with a particularly anomalous add_offset of -32.1) with a version that uses the same add_offset and scale_factor as in the April 1980 file. I did this using:

import xarray as xr

# Original March 1980 10u file on NCI, and the repacked copy to write
file_in = '/g/data/rt52/era5/single-levels/reanalysis/10u/1980/10u_era5_oper_sfc_19800301-19800331.nc'
file_out = '/g/data/e14/rmh561/access-om2/input/ERA-5/IAF/10u/1980/10u_era5_oper_sfc_19800301-19800331.nc'
DS = xr.open_dataset(file_in)  # xarray applies scale_factor/add_offset on read
# Repack u10 as int16 using the packing parameters from the April 1980 file
scale = 0.000966930321007164 # Apr 1980 value
offset = -0.761652898754254 # Apr 1980 value
encoding = {'u10': {'scale_factor': scale, 'add_offset': offset, 'dtype': 'int16'}}
DS.to_netcdf(file_out, encoding=encoding)

This removes the big burst in zonal wind stress, giving a reasonable total KE plot:

KE_tot

Note: There is still a jump in tau_x, surface_pot_temp etc. between March and April, it's just much smaller (but why should there be one now? I've set the add_offset and scale_factor values to be the same!).

> so are 10v and all the other inputs unaffected?

Every file for every variable has a different scale_factor and add_offset. So if the error is with netcdf packing then it all has to be affected (thus - there are small shocks in everything at every month transition?). I think that the difference is much larger for 10u in these two months.
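As background on why every file ends up with its own values: packing parameters are normally derived from the minimum and maximum of the data in that file. The sketch below uses the common convention for signed n-bit packing, `scale_factor = (max - min) / (2^n - 1)` and `add_offset = min + 2^(n-1) * scale_factor`. I have not checked which formula was used to produce the ERA-5 files on NCI, so treat this as an assumption; the wind ranges are hypothetical.

```python
def packing_params(vmin: float, vmax: float, nbits: int = 16):
    """Return (scale_factor, add_offset) mapping [vmin, vmax] onto the full
    range of a signed nbits integer (common netCDF packing convention)."""
    scale_factor = (vmax - vmin) / (2**nbits - 1)
    add_offset = vmin + 2**(nbits - 1) * scale_factor
    return scale_factor, add_offset


# Hypothetical monthly ranges of 10 m zonal wind in m/s
for label, (vmin, vmax) in {"month A": (-25.0, 25.0), "month B": (-40.0, 22.0)}.items():
    scale, offset = packing_params(vmin, vmax)
    print(f"{label}: scale_factor = {scale:.6g}, add_offset = {offset:.6g}")
```

Under this convention add_offset sits close to the midpoint of the file's data range, so one extreme value in a month shifts both parameters, and a reader has to pick up the new pair every time it moves to the next file.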

I'm not sure I have the experience to jump into the libaccessom2 code and figure out what is going on...

AndyHoggANU commented 1 year ago

I think this is something we should raise with NCI, perhaps through Yiling or Andrew.

I have heard from speaking to some of the atmospheric folks that the ERA5 data stored on NCI is differently processed from the data you can download from the main ECMWF repository. That also happened with the badly chunked data, I think. Perhaps that is the problem here? Is it worth downloading a few native ERA5 files to gather information before going to NCI? In short, if this is a problem with the data files it is better to work with NCI than to alter the code. If it's the same in ERA5 then we will have to do something to the way we load netcdf files in libaccessom2 ...

rmholmes commented 1 year ago

> I think this is something we should raise with NCI, perhaps through Yiling or Andrew.

I agree, I think this should be raised with NCI once we have a better idea of what is going on.

However, I still feel like it's pointing at a potential bug in our system. Fundamentally the netcdf packing shouldn't affect our simulations (except to the XXth decimal place). The fact that it only comes up in the transition between months is supportive of a bug. Who knows if this is also affecting JRA-55 files in the transition from year to year, but not to the point of us noticing!

aekiss commented 1 year ago

If they open ok in xarray then that points to a bug in libaccessom2.

aekiss commented 1 year ago

An ERA-5 based model forcing dataset is being planned, to replace JRA55-do.

From Alistair:

As you probably know, JRA55-do will no longer be updated starting sometime this year. Moving forward, as was just announced by Gustavo Marques at the DRAKKAR meeting, there is a plan to develop a new ocean forcing dataset based on ERA-5. NCAR will collaborate with GFDL and ECMWF to produce this product and NCAR is currently looking for some funds to start the effort. ERA-5 meets all the criteria for a new upstream product and ECMWF has just completed a back-extension to 1940. ECMWF plans to routinely update ERA-5 until after ERA-6 is made available, which is scheduled for 2026. It is not clear how long it will take to produce the adjustments and new ocean-forcing dataset but there is a sense of urgency to minimize the gap once JRA55-do is discontinued.

rmholmes commented 1 year ago

Interesting. Seems like it would be worth waiting for them to do this properly given our limited resources for this problem.

StephenGriffies commented 1 year ago

Your experiences, particularly creating a repeating year, will be of great use when this project starts in earnest. We are currently trying to garner funds to support someone at NCAR to do the heavy lifting. We will be in touch, likely through CLIVAR OMDP (@adele157 is on that panel).

rmholmes commented 1 year ago

Attached summary slides from COSIMA meeting discussion today 23_02_16_ERA-5_status.pdf

StephenGriffies commented 1 year ago

It would be nice to have this document expanded a bit to be like Farneti's MOM5-JRA55-do document that he just put together for this webpage

https://mom-ocean.github.io/docs/

That way, those outside COSIMA can benefit. This documentation of ERA5 is particularly important given the nascent efforts to build ERA55-do.

access-hive-bot commented 1 year ago

[editing this bot comment]: The minutes from today's COSIMA meeting where we discussed ERA-5 in more detail can be found here:

https://forum.access-hive.org.au/t/cosima-meeting-minutes-2023/407/3

In particular, note the interesting talk by Zhaohui on temperature biases in reanalyses (including ERA-5 and JRA-55) over Antarctica.

rmholmes commented 1 year ago

@StephenGriffies these runs are still a work in progress but yes, I agree that they should be expanded once we have a better idea what is going on.

aekiss commented 1 year ago

ERA5 SAT is biased warm in the Weddell Sea, according to King et al., JGR 2022.

This bias is small at temperatures close to 0°C but reaches 5-10°C at -40°C.

If this is the case, the SAT adjustments between JRA55 and JRA55-do are even more excessive than we thought in the Weddell Sea, since JRA55-do is warmer than ERA5 there.

ofa001 commented 1 year ago

Yes @aekiss, I am aware of John King's work; I was talking to him before he published that paper. Zhaohui Wang also looked at some of the AWI buoy data in the next chapter of his thesis, but some of it was included in the ERA5 reanalysis, so it wasn't independent data. We did look for some, but we couldn't find any for the 2018 year he focussed his detailed WRF simulation on.

aekiss commented 1 year ago

I have Zhaohui to thank for sending me that reference :-)

rmholmes commented 1 year ago

A quick update for this thread: We (actually @russfiedler) have found the issue with the IAF forcing/netcdf packing (https://github.com/COSIMA/libaccessom2/issues/78). There is something wrong with the timing of when data is read into the cache that was interacting with the netcdf packing (scale/offset) in ERA-5 to cause very large wind stress bursts at the beginning of certain months. There is an underlying issue here which also likely affects the JRA55 runs (although likely not to a large degree - TBC).

In any case, knowing what the issue is I've put in a quick fix for the ERA-5 netcdf packing issues and so now I can successfully run forward with the IAF simulations. I've run from 1959 through to 1979 where I run into an issue with missing ERA-5 data that I've alerted NCI to. Hopefully once this gets fixed we'll have something to look at.

rmholmes commented 1 year ago

Initial analysis of 1-degree 1980-2019 and 1959-1979 runs (compared to equivalent JRA-55 runs) are here: https://github.com/rmholmes/cosima-scripts/blob/master/ERA-5/ERA-5_Initial_Analysis-IAF.ipynb It's all looking good from this cursory analysis. A short summary:

It would be great to get more eyes on this, in particular on the sea ice. In principle, there is nothing standing in the way of equivalent runs being set up at 1/4-degree (and even 1/10-degree), in addition to the RYF runs I've already done.

Still waiting on NCI to fix the issues with the 1979 missing data.

One issue: If simulations are needed up to present day (which I believe many in the sea ice community are interested in) then something needs to be done about the run-off. We're using JRA55 run-off which is only available up until end of 2018 (for JRA55 v1.4) or, I believe, 07/2020 for JRA55 v1.5.

aekiss commented 1 year ago

Great, thanks @rmholmes

> We're using JRA55 run-off which is only available up until end of 2018 (for JRA55 v1.4) or, I believe, 07/2020 for JRA55 v1.5

JRA55-do v1.5.0.1 runs from 1 Jan 2020 to 4 days behind the present day, and I'm keeping a mirror fairly up-to-date here: /g/data/ik11/inputs/JRA-55/JRA55-do-1-5-0-1

Data for the final 30 days are preliminary, and replaced day-by-day by final, higher-quality data.

rmholmes commented 1 year ago

Ah ok thanks @aekiss. That's not an issue then (at least until JRA-55 stops being updated). Following ERA-5 simulations should use JRA55 v1.5 run-off to solve this.