Closed aekiss closed 1 year ago
It's a long shot, but the Orinoco outflow (white) peaks about when the crash occurs. We might need to set a regional runoff cap in `atmosphere/atm.nml`. But it's less than the Amazon runoff (red), so it might not be a problem. Plot below is from `/g/data/qv56/replicas/input4MIPs/CMIP6/OMIP/MRI/MRI-JRA55-do-1-5-0/land/day/friver/gr/v20200916/friver_input4MIPs_atmosphericState_OMIP_MRI-JRA55-do-1-5-0_gr_19840101-19841231.nc`.
Thanks @aekiss.
Runoff is a good candidate, as going to JRA55-do v1.5 runoff is the only thing that has changed in the forcing compared to the successful `1deg_era5_iaf` simulation (using v1.4 runoff) I've done before. I just plotted some maxima of `friver` and `licalvf` and nothing looks crazy. But it's worth trying. Do you have an example of a runoff cap?
We use runoff caps in the 0.1° configs: https://github.com/COSIMA/01deg_jra55_iaf/blob/master/atmosphere/atm.nml See documentation in code: https://github.com/COSIMA/libaccessom2/blob/d750b4bfdc58c59490985c682c1b4c56cc1016b1/atm/src/runoff.F90#L24-L35
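As a rough illustration of what a regional cap does, here is a minimal Python sketch. It is not the Fortran in libaccessom2: the function name, the index box and the cap value are all hypothetical, and the numbers are made up.

```python
def cap_runoff(runoff, cap, i_start, i_end, j_start, j_end):
    """Return a copy of runoff[j][i] with values inside an index box limited to cap.

    Hypothetical helper for illustration only; box bounds are inclusive, 0-based.
    """
    capped = [row[:] for row in runoff]  # copy so the input field is left untouched
    for j in range(j_start, j_end + 1):
        for i in range(i_start, i_end + 1):
            capped[j][i] = min(capped[j][i], cap)
    return capped


# Toy 3x4 runoff field (kg/m^2/s); values are invented, not taken from JRA55-do.
runoff = [
    [0.00, 0.01, 0.00, 0.00],
    [0.00, 0.09, 0.05, 0.00],
    [0.00, 0.00, 0.02, 0.20],
]
capped = cap_runoff(runoff, cap=0.03, i_start=1, i_end=2, j_start=1, j_end=2)
```

Only the two large values inside the box are reduced; the 0.20 outside the box is unchanged. This sketch simply clips, so it does not conserve total runoff. Whether and how the real code handles the excess should be checked in the linked `runoff.F90`.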
Is it reasonable to just turn off runoff altogether as a quick test to see if this is really what is causing the problem?
The relevant restart tiles from the day before are `/home/561/rmh561/access-om2/025deg_era5_iaf/archive/restart005/ocean/*.0015`. They seem OK in this location, so it may be a red herring; e.g. see these plots from `ocean_sbc.res.nc.0015`. Note that `sea_lev` is high in the outflow region rather than low.
> Is it reasonable to just turn off runoff altogether as a quick test to see if this is really what is causing the problem?

Yeah, I guess we could try that. Or try JRA55-do v1.4 runoff. Or try `1deg_era5_iaf` with JRA55-do v1.5 runoff. But did the runoff change between these versions of JRA55-do?
This is not the only difference between the 1° and 0.25° configs, as the higher resolution means that runoff is more concentrated, CFL values are different, etc.
There's no mention of runoff differences between JRA55-do v1.4 and v1.5 here https://climate.mri-jma.go.jp/pub/ocean/JRA55-do/ or here https://climate.mri-jma.go.jp/pub/ocean/JRA55-do/docs/v1_5-manual/User_manual_jra55_do_v1_5.pdf
There's checkerboarding suggestive of an overly long barotropic timestep in `archive/restart005/ocean/ocean_barotropic.res.nc.0015` (with `ice_ocean_timestep = 300` and `barotropic_split = 80`). But since it still crashes with a much shorter timestep, this is probably a red herring.
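For a rough feel of the numbers, the barotropic Courant number can be estimated as below. This is a back-of-envelope sketch that assumes the barotropic sub-step is simply the baroclinic step divided by `barotropic_split`, and that the ocean baroclinic step equals `ice_ocean_timestep`; MOM5's actual time-stepping scheme may differ by a constant factor. The depth and grid spacing are illustrative guesses, not values from the config.

```python
import math


def barotropic_cfl(dt_baroclinic, barotropic_split, depth, dx, g=9.8):
    """Courant number of external gravity waves on the barotropic sub-step.

    Assumes dt_barotropic = dt_baroclinic / barotropic_split (a simplification).
    """
    dt_barotropic = dt_baroclinic / barotropic_split
    wave_speed = math.sqrt(g * depth)  # shallow-water gravity wave speed, m/s
    return wave_speed * dt_barotropic / dx


# 300 s and 80 are the values quoted above; 5500 m and 10 km are assumed
# here as a deep, high-latitude cell of a 0.25 degree grid.
cfl = barotropic_cfl(300.0, 80, depth=5500.0, dx=10_000.0)
print(f"barotropic Courant number ~ {cfl:.3f}")
```

Under these assumptions the Courant number is well below 1, which fits the conclusion that the barotropic timestep is probably not the cause.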
The `Free surface penetrating rock` error is triggered here: https://github.com/mom-ocean/MOM5/blob/9b8ec93/src/mom5/ocean_core/ocean_thickness.F90#L3380-L3398
It's weird that we don't also get an `Error from ocean_thickness: Surface undulations too negative; model unstable` message specifying the location of the offending grid point; that would be helpful information to have.
I now know where this is coming from. There is a massive spike in the meridional wind from the ERA5 data just south of Papua New Guinea that starts at `1984-08-11T15:00:00` and lasts until `1984-08-11T21:00:00` (in the file `/g/data/rt52/era5/single-levels/reanalysis/10v/1984/10v_era5_oper_sfc_19840801-19840831.nc`), reaching about 130 m/s. It's obvious in the following image, where it causes some clear radiating waves in the wind field.
I guess the 1-degree model survived it, but the 1/4-degree can't.
This must be an upstream ERA-5 problem. @aekiss any suggestions here? Maybe using the scaling system to scale down the winds for this 6 hour period? Do you have an example you can point me at to do this?
Ah, well spotted @rmholmes! Looks like a bad observation slipped through the quality control and was incorporated into the reanalysis.
Yes, scaling would be the way to go - here's what I did last time we had a problem like this - I scaled down the winds using a spatiotemporal Gaussian to ensure smoothness in space and time: https://github.com/COSIMA/access-om2/wiki/Tutorials#Scaling-the-forcing-fields https://github.com/aekiss/notebooks/blob/master/make-jra55-scaling.ipynb
In your case the other fields (not just wind) might be worth checking/fixing, as they will be dynamically linked to the bad wind via the reanalysis.
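The idea of a spatiotemporal Gaussian scaling can be sketched as follows. This is not the code in the linked notebook: the function name, its parameters, and the centre location and widths used in the example are all hypothetical, and the sketch ignores longitude wrap-around and the convergence of meridians.

```python
import math


def gaussian_scale(lon, lat, hours, lon0, lat0, hours0,
                   sigma_deg, sigma_hours, min_scale):
    """Scaling factor that dips smoothly to min_scale at (lon0, lat0, hours0)
    and tends to 1 far away in space or time.
    """
    dist2 = ((lon - lon0) / sigma_deg) ** 2 + ((lat - lat0) / sigma_deg) ** 2
    time2 = ((hours - hours0) / sigma_hours) ** 2
    return 1.0 - (1.0 - min_scale) * math.exp(-0.5 * (dist2 + time2))


# Hypothetical bad point: centre, widths and minimum factor are made up.
params = dict(lon0=145.0, lat0=-10.0, hours0=18.0,
              sigma_deg=3.0, sigma_hours=4.0, min_scale=0.2)

at_centre = gaussian_scale(145.0, -10.0, 18.0, **params)  # strongest damping
far_away = gaussian_scale(200.0, 30.0, 18.0, **params)    # essentially 1

# The same factor multiplies both wind components, so direction is preserved.
u10, v10 = 5.0, 130.0
u_scaled, v_scaled = u10 * at_centre, v10 * at_centre
```

Because the factor varies smoothly, it avoids introducing sharp edges in the forcing, at the cost of also damping nearby good data.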
Thanks @aekiss. I'll give it a burn.
Weird, it doesn't appear in `u10`:
This fix worked fine so closing this issue (my scaling file is at https://github.com/rmholmes/cosima-scripts/blob/master/ERA-5/025deg_era5_iaf_v10_blowup_scaling.ipynb). I'll continue running this hopefully up until near real time, and will report back.
I found another bad point, this time at `1986-07-24T09:00:00`, near the Ross Sea. It causes another blow-up:
I'll do the same thing.
Wow, that's a doozy! Looks like they should tweak the QC in data ingestion...
Yeah. This one's funny because it actually completely disappears in the hour following the one I'm showing there.
Now I know what to look for hopefully I can get it through to the end of the cycle!
Thanks, it's really valuable having these landmines mapped out before we run at higher resolution or with ACCESS-OM3.
Found a third one, at `1992-11-13T19:00:00`, in a similar location to the first one:
Thanks @rmholmes. The points you're finding correspond to the known issues in table 2 here https://confluence.ecmwf.int/display/CKB/ERA5%3A+large+10m+winds
There are lots of spurious values, but evidently most aren't bad enough to trip up the model. The worst ones are up to 300m/s (Mach 0.87)!
> A few times per year, the analysed low level winds, eg the 10m winds, become unrealistically large in a particular location, which varies amongst a few apparently preferred locations. The largest values seen so far are about 300 ms-1. This problem occurs towards the end of the data assimilation windows (9-21 UTC and 21-9 UTC) because of an instability in the analysis method.
I guess that explains why the Ross Sea doozy of 1986-07-24T09:00:00 suddenly vanishes (new assimilation cycle)?
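A quick way to check this against the bad points found so far is to compute how far each timestamp is from the end of its assimilation window. The helper below is a hypothetical sketch; it takes the windows as 09-21 UTC and 21-09 UTC, as in the ECMWF note quoted above, and treats a time of exactly 09 or 21 UTC as the end of the window that closes there.

```python
from datetime import datetime


def hours_to_window_end(timestamp):
    """Hours from a UTC timestamp to the end of its 12-hour assimilation window.

    Window ends are assumed to fall at 09 and 21 UTC.
    """
    hour = timestamp.hour + timestamp.minute / 60
    return (9 - hour) % 12


# Timestamps reported in this thread.
events = {
    "1984 spike start": datetime(1984, 8, 11, 15),
    "1984 spike end": datetime(1984, 8, 11, 21),
    "1986 Ross Sea": datetime(1986, 7, 24, 9),
    "1992 third point": datetime(1992, 11, 13, 19),
}
remaining = {name: hours_to_window_end(t) for name, t in events.items()}
for name, hours in remaining.items():
    print(f"{name}: {hours:g} h before the window closes")
```

All of these fall in the second half of a window, and the 1986 point sits exactly on a window boundary, which fits it vanishing an hour later.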
> From 19 February 2020 onwards, the ERA5 system has examined the 10m wind components and if the magnitude of either component exceeds 50 ms-1, then the analysed parameters are replaced with the "4v" parameters.
Hopefully the 50 m/s cutoff after 19 February 2020 will help with model stability. A cutoff seems an odd approach though, especially if applied to only one component and one grid point: it will mess up convergence and curl. The method we're using tries to minimise this problem by scaling both components by a factor that is smooth in space and time. That seems reasonable for toning down an otherwise-reasonable storm, but maybe it's worse in the case of an isolated bad point because it also scales data that is mostly OK (other than gravity waves).
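The concern about treating one component in isolation can be seen in a toy example. The sketch below compares clipping each component independently with scaling both by a common factor. The clipping here is only a stand-in for "a cutoff"; it is not ERA5's actual procedure, which replaces the analysed parameters rather than clipping them. The wind values are made up.

```python
import math


def cap_components(u, v, limit):
    """Clip each wind component independently to [-limit, limit]."""
    def clip(x):
        return max(-limit, min(limit, x))
    return clip(u), clip(v)


def scale_both(u, v, factor):
    """Multiply both wind components by the same factor."""
    return u * factor, v * factor


def direction_deg(u, v):
    """Direction of the wind vector, in degrees anticlockwise from east."""
    return math.degrees(math.atan2(v, u))


u, v = 20.0, 130.0                         # invented spike in v only
capped = cap_components(u, v, 50.0)        # only v is changed
scaled = scale_both(u, v, 50.0 / 130.0)    # v reduced to about 50 m/s, u reduced too

original_dir = direction_deg(u, v)
capped_dir = direction_deg(*capped)
scaled_dir = direction_deg(*scaled)
```

Clipping rotates the vector by more than 10 degrees at this point, while common scaling leaves the direction unchanged. A rotation at a single grid point is what would distort the local curl and convergence.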
Thanks for finding that @aekiss. Makes sense that someone has found these before. I'll continue on as I'm doing, checking these tables if I run into any other problems.
For now, this seems a better approach than replacing with the "4v" parameters?
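To look for further bad points before the model hits them, one option is to scan the 10 m wind components for values above a threshold. The sketch below uses plain Python lists in place of real ERA5 fields; the function name and the toy data are hypothetical, and the 50 m/s default simply mirrors the threshold in the ECMWF note. In practice the fields would be read from the netCDF files with a suitable library.

```python
def find_wind_spikes(times, u10, v10, threshold=50.0):
    """Return (time, peak) for each time step whose largest |u10| or |v10|
    exceeds threshold (m/s).

    times: list of labels; u10, v10: one flat list of grid values per time.
    """
    hits = []
    for t, u_field, v_field in zip(times, u10, v10):
        peak = max(max(abs(x) for x in u_field),
                   max(abs(x) for x in v_field))
        if peak > threshold:
            hits.append((t, peak))
    return hits


# Toy data: three time steps on a four-point "grid"; values are invented.
times = ["1984-08-11T14:00", "1984-08-11T15:00", "1984-08-11T16:00"]
u10 = [[3.0, -4.0, 2.0, 1.0], [3.0, -5.0, 2.0, 1.0], [2.0, -4.0, 2.0, 1.0]]
v10 = [[6.0, 8.0, -7.0, 5.0], [6.0, 130.0, -7.0, 5.0], [6.0, 12.0, -7.0, 5.0]]

hits = find_wind_spikes(times, u10, v10)
```

A lower threshold would catch more of the spurious values, at the risk of also flagging genuine storms.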
I am continually impressed by what the model can cope with without blowing up.
@rmholmes is getting a `Free surface penetrating rock` error just after 1984-08-11T19:00 in a 0.25° config forced by ERA5 and JRA55-do v1.5 runoff. This does not occur with the 1° version of this config.
Errors like this have been resolved in the all-JRA55-do configs by reducing the timestep (e.g. from 540s to 360s to fix a crash just after 1988-09-27T06:00 in ACCESS-OM2-01 IAF), but Ryan has tried reducing dt from 1200s to 100s to no avail.