Open rmholmes opened 2 years ago
Yep. The code is quite different though, so it's not obvious what that test should be (until we understand better where the bug is coming from - in which case we probably wouldn't need the test anymore)!
@rmholmes Do you get a zero from this debugging log? https://github.com/COSIMA/libaccessom2/blob/242-era5-support/libforcing/src/forcing_field.F90#L165-L169
The last four lines of `work/atmosphere/log/matm...` (before the assert is triggered and the job stops) are:
{ "cur_exp-datetime" : "1980-01-31T22:00:00" }
{ "cur_forcing-datetime" : "1980-01-31T22:00:00" }
cur_runtime_in_seconds 2671200
{ "forcing_field_update-file" : "INPUT/1980/msdwswrf_era5_oper_sfc_19800101-19800131.nc" }
{ "forcing_field_update-index" : 744 }
So looks reasonable.
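A quick sanity check of those numbers, as a sketch only. It assumes the run started at 1980-01-01T00:00:00 and that the forcing is hourly; neither is stated in the log itself.

```python
from datetime import datetime

# Assumption: the run started at the beginning of January 1980.
run_start = datetime(1980, 1, 1)
# Taken from the "cur_exp-datetime" log line.
cur_exp = datetime(1980, 1, 31, 22)

# Elapsed model time, to compare with cur_runtime_in_seconds in the log.
runtime_seconds = int((cur_exp - run_start).total_seconds())

# Assumption: hourly forcing, so a 31-day month holds 31 * 24 records.
records_in_january = 31 * 24

print(runtime_seconds, records_in_january)
```

Under those assumptions the elapsed time agrees with the logged `cur_runtime_in_seconds`, and the logged index equals the number of hourly records in the January file.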
I'm utterly baffled by what's happening here: https://github.com/COSIMA/libaccessom2/blob/242-era5-support/libcouple/src/accessom2.F90#L528-L530 versus https://github.com/COSIMA/libaccessom2/blob/242-era5-support/libcouple/src/accessom2.F90#L542-L544
Why would you set the forcing date back to the start but the run date to the end? And why the `>=` rather than `>` in the second (and maybe the first) block?
Is the first one in order to deal with the RYF forcing (which loops over a specific period of forcing dates)? I agree that it would make more sense if this was `>` rather than `>=` (although it shouldn't make a difference).
The second one does not make sense to me. If the experiment date is larger than the end date then the experiment should already have ended. However, I don't feel that this could be responsible for our error, given that we're nowhere near the end of the run when the issue with `read_data` occurs.
Yes, the first one makes it repeat the forcing dataset, however long it might be (RYF, RDF, IAF or whatever). I guess it assumes the first and final forcing times can be identified with one another, hence `>=` not `>`.
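To make the `>=` versus `>` difference concrete, here is a minimal Python sketch of the wrap-around as I understand it from the discussion. It is not a transcription of the Fortran in `accessom2.F90`; the function name and arguments are hypothetical.

```python
from datetime import datetime, timedelta

def next_forcing_date(cur, step, start, end, inclusive=True):
    """Advance the forcing date by one step, wrapping back to `start`.

    inclusive=True mimics a `>=` test (the end time is identified with
    the start time); inclusive=False mimics a `>` test.
    """
    nxt = cur + step
    wrapped = (nxt >= end) if inclusive else (nxt > end)
    return start if wrapped else nxt

# Hypothetical one-year forcing period with hourly steps.
start = datetime(1980, 1, 1)
end = datetime(1981, 1, 1)
hour = timedelta(hours=1)
last = datetime(1980, 12, 31, 23)

with_ge = next_forcing_date(last, hour, start, end, inclusive=True)
with_gt = next_forcing_date(last, hour, start, end, inclusive=False)
print(with_ge, with_gt)
```

With `>=` the step after the last hour lands directly on `start`; with `>` the forcing date sits on `end` for one step before wrapping, which only matters if the dataset has no record at `end`.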
The 2nd one is more mysterious. I guess it's possible to have `self%exp_cur_date > self%run_end_date` if the timestep isn't an integer fraction of a day. In most cases setting `self%exp_cur_date = self%run_end_date` will have no effect (the run terminates either way) but if the experiment date is a leap day but the forcing date is not, `self%exp_cur_date` will be decremented by a day so the model runs for another day:
https://github.com/COSIMA/libaccessom2/blame/17f27949fd3ee554b1a66eb343d1130d7f2632d8/libcouple/src/accessom2.F90#L562
(this was added to resolve https://github.com/COSIMA/access-om2/issues/149)
But I agree with Ryan, I don't think this would cause our error.
@rmholmes is this the fix you used for your latest test run?
Can you provide a link to a commit with your fixed libaccessom2 code so we know what to merge once we're happy with it? Ta!
Yep, it's on this branch: https://github.com/rmholmes/libaccessom2/tree/78-ERA5-netcdf-packing, this specific commit: https://github.com/rmholmes/libaccessom2/commit/bfa2062c1c6004f6d04e39042168b39fb474013a
Great, thanks. Do we understand this code well enough that we're sure this fix doesn't introduce other problems?
There is no evidence of any issues arising, but no, I can't say for sure. I don't understand the code well enough to understand what is going wrong.
However, I think it's clear that what the fix does is prevent the `scale_factor` and `add_offset` being applied to the data twice - which was previously resulting in completely crazy values (for just one forcing time step). I speculate that the forcing data for that same forcing time step (in between the two months) may still be incorrect. I.e. it could be applying a copy of the forcing from the previous forcing time step again. However, it seems to me that the impact that this kind of error could have on the simulation is very minor.
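For illustration, a small sketch of what applying the packing attributes twice does to a value. The numbers are made up (not taken from the ERA-5 files) and `unpack` is a hypothetical helper, not a libaccessom2 routine.

```python
def unpack(value, scale_factor, add_offset):
    """Standard netCDF unpacking: physical = packed * scale_factor + add_offset."""
    return value * scale_factor + add_offset

# Hypothetical packing attributes with a large negative offset.
scale_factor = 0.001
add_offset = -20.0
packed = 23100  # hypothetical stored integer

once = unpack(packed, scale_factor, add_offset)
# Bug scenario: the already unpacked value is unpacked a second time.
twice = unpack(once, scale_factor, add_offset)

print(once, twice)
```

Because `scale_factor` is small, the doubly unpacked value collapses to roughly `add_offset`, so a file with an unusual `add_offset` would produce an unusual forcing value for that step.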
As described on the ERA-5 forcing issue I think libaccessom2 may have an issue dealing with netcdf unpacking across file boundaries. I'll summarize the problem here.
The problem occurs when transitioning between two months (the ERA-5 forcing is stored in monthly files), best demonstrated by plotting daily minimum wind stress at 92W, 0N from a `1deg_era5_iaf` run spanning 1980-01-01 to 1980-05-01:
There is a large burst of negative wind stress in the first day of April in the "raw" run (this causes all sorts of crazy stuff...). The `add_offset` netcdf packing value in the ERA-5 10m zonal winds file is particularly anomalous for March of this year (listed below per month of the files in /g/data/rt52/era5/single-levels/reanalysis/10u/1980/):
If I change the netcdf packing values in the single March 1980 10m winds file (using the below python) and rerun, then I remove the burst of wind stress ("Altered packing" run above). This confirms to me that it is a packing issue.
Yes, the packing in the ERA-5 files is weird. But in any case, `libaccessom2` should be able to deal with the variable packing. Xarray in python can, as shown by this plot of the time series of 10m zonal wind at the same point from the original file:
I've had a quick look through the code and am none the wiser. As @aekiss said, the netcdf unpacking seems to be handled by the netcdf library, so I don't understand how there can be a problem. Clearly it only affects the times between months when an interpolation has to be done. The rest of the month is fine.
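As an illustration of why the month boundary is the sensitive spot, here is a sketch of time interpolation between the last record of one file and the first record of the next, where each file has its own packing. This is not a diagnosis of the actual bug; the class, the values and the "wrong" scenario are all hypothetical.

```python
from dataclasses import dataclass

@dataclass
class PackedRecord:
    """One packed value plus the packing attributes of the file it came from."""
    value: int
    scale_factor: float
    add_offset: float

    def unpack(self):
        return self.value * self.scale_factor + self.add_offset

def interpolate(a, b, weight):
    """Linear interpolation between physical values a and b, weight in [0, 1]."""
    return (1.0 - weight) * a + weight * b

# Hypothetical records either side of a month boundary.
march_last = PackedRecord(23000, 0.001, -20.0)  # unpacks to about 3.0
april_first = PackedRecord(4000, 0.0005, 2.0)   # unpacks to about 4.0

# Each record is unpacked with its own file's attributes before interpolating.
correct = interpolate(march_last.unpack(), april_first.unpack(), 0.5)

# Hypothetical failure: the March value is unpacked with April's attributes.
mixed = march_last.value * april_first.scale_factor + april_first.add_offset
wrong = interpolate(mixed, april_first.unpack(), 0.5)

print(correct, wrong)
```

Within a month both records share the same attributes, so this kind of mix-up could not show up there, which fits the rest of the month being fine.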