Closed AndyHoggANU closed 5 years ago
Yes, in this case the issue is the use of the incorrect calendar, gregorian, rather than proleptic_gregorian, with time units of "days since 0001-01-01".
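To make the size of the error concrete, here is a minimal standard-library sketch of why the two calendars disagree by two days for an epoch of 0001-01-01. It assumes the gregorian calendar means the mixed Julian/Gregorian calendar, so the epoch is a Julian-calendar date. The helper names (julian_to_jdn, decode_proleptic, decode_mixed) are made up for illustration and are not part of splitvar or xarray.

```python
from datetime import date

def julian_to_jdn(year, month, day):
    """Julian Day Number of a date given in the Julian calendar."""
    a = (14 - month) // 12
    y = year + 4800 - a
    m = month + 12 * a - 3
    return day + (153 * m + 2) // 5 + 365 * y + y // 4 - 32083

# Python's date type is proleptic Gregorian: ordinal 1 is 0001-01-01,
# which is Julian Day Number 1721426.
ORDINAL_TO_JDN = 1721425

def decode_proleptic(days_since_epoch):
    """Decode 'days since 0001-01-01' with a proleptic Gregorian epoch."""
    return date.fromordinal(days_since_epoch + 1)

def decode_mixed(days_since_epoch):
    """Decode the same count with a Julian-calendar epoch.

    Only valid for results after the 1582 switch to the Gregorian calendar.
    """
    jdn = julian_to_jdn(1, 1, 1) + days_since_epoch
    return date.fromordinal(jdn - ORDINAL_TO_JDN)

# Day count a proleptic Gregorian model would write for 2198-01-01.
n = date(2198, 1, 1).toordinal() - 1

print(decode_proleptic(n))  # 2198-01-01
print(decode_mixed(n))      # 2197-12-30
```

The same day count lands on 2197-12-30 under the mixed calendar, which matches the shifted time bounds reported in this thread.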
I have confirmed it works as it should if the data is read in without decoding the times, the calendar is changed to proleptic_gregorian, and the times are then decoded.
In a way this is a problem with the files themselves that needs to be changed. How many are affected like this?
Ok, I've added a --calendar option to splitvar:
https://github.com/coecms/splitvar/commit/d6960f7b7e4b96e68da74b315cddf2ce6e718b65
I can either add the option to the README scripts and push, or you can add it and try it out. If the latter, add the following option
--calendar proleptic_gregorian
and give it a burl (I have updated splitvar in conda/analysis3-unstable).
Hmmm ... the plot thickens. This works for the 3D variables. If you compare
darray = xr.open_dataset('/g/data/ua8/cosima-tmp/publish/access-om2-025/ocean/salt/salt_access-om2-025_219801_219901.nc')
darray.time_bounds
which gives
<xarray.DataArray 'time_bounds' (time: 1, nv: 2)>
array([['2198-01-01T00:00:00.000000000', '2199-01-01T00:00:00.000000000']],
dtype='datetime64[ns]')
Coordinates:
* time (time) datetime64[ns] 2198-07-02T12:00:00
* nv (nv) float64 1.0 2.0
Attributes:
long_name: time axis boundaries
to
darray = xr.open_dataset('/g/data/ua8/cosima-tmp/publish/access-om2-025-old/ocean/salt/salt_access-om2-025_219712_219812.nc')
darray.time_bounds
which gives
<xarray.DataArray 'time_bounds' (time: 1, nv: 2)>
array([['2197-12-30T00:00:00.000000000', '2198-12-30T00:00:00.000000000']],
dtype='datetime64[ns]')
Coordinates:
* time (time) datetime64[ns] 2198-06-30T12:00:00
* nv (nv) float64 1.0 2.0
Attributes:
long_name: time axis boundaries
you will see that the time bounds have been fixed.
Confusingly, it doesn't seem to work with ocean_month.nc:
darray = xr.open_dataset('/g/data/ua8/cosima-tmp/publish/access-om2-025/ocean/mld/mld_access-om2-025_219801_219901.nc')
darray.time_bounds
gives
<xarray.DataArray 'time_bounds' (time: 12, nv: 2)>
array([['2256-12-30T00:00:00.000000000', '2257-01-30T00:00:00.000000000'],
['2257-01-30T00:00:00.000000000', '2257-02-27T00:00:00.000000000'],
['2257-02-27T00:00:00.000000000', '2257-03-30T00:00:00.000000000'],
['2257-03-30T00:00:00.000000000', '2257-04-29T00:00:00.000000000'],
['2257-04-29T00:00:00.000000000', '2257-05-30T00:00:00.000000000'],
['2257-05-30T00:00:00.000000000', '2257-06-29T00:00:00.000000000'],
['2257-06-29T00:00:00.000000000', '2257-07-30T00:00:00.000000000'],
['2257-07-30T00:00:00.000000000', '2257-08-30T00:00:00.000000000'],
['2257-08-30T00:00:00.000000000', '2257-09-29T00:00:00.000000000'],
['2257-09-29T00:00:00.000000000', '2257-10-30T00:00:00.000000000'],
['2257-10-30T00:00:00.000000000', '2257-11-29T00:00:00.000000000'],
['2257-11-29T00:00:00.000000000', '2257-12-30T00:00:00.000000000']],
dtype='datetime64[ns]')
Coordinates:
* time (time) datetime64[ns] 2257-01-14T12:00:00 ... 2257-12-14T12:00:00
* nv (nv) float64 1.0 2.0
Attributes:
long_name: time axis boundaries
which still has the two-day offset??
I have pushed my latest code for you to see or confirm you can reproduce...
A couple of points:
How does mld_access-om2-025_219801_219901.nc give dates like 2256-12-30T00:00:00.000000000?
I can't reproduce. Tried this:
splitvar -cp -d title -d grid_type -d grid_tile -a ocean_grid.nc -o $OUTPATH --model-type ${SUBMODEL} --simname ${MODEL} --calendar proleptic_gregorian -v sea_level ${COSIMADIR}/${MODEL}/${EXPT}/output1[2-5]?/${SUBMODEL}/ocean_month.nc
got this:
>>> ds = xr.open_dataset('datadir/access-om2-025/ocean/sea-level/sea-level_access-om2-025_219801_219901.nc')
>>> ds.time_bounds
<xarray.DataArray 'time_bounds' (time: 12, nv: 2)>
array([['2198-01-01T00:00:00.000000000', '2198-02-01T00:00:00.000000000'],
['2198-02-01T00:00:00.000000000', '2198-03-01T00:00:00.000000000'],
['2198-03-01T00:00:00.000000000', '2198-04-01T00:00:00.000000000'],
['2198-04-01T00:00:00.000000000', '2198-05-01T00:00:00.000000000'],
['2198-05-01T00:00:00.000000000', '2198-06-01T00:00:00.000000000'],
['2198-06-01T00:00:00.000000000', '2198-07-01T00:00:00.000000000'],
['2198-07-01T00:00:00.000000000', '2198-08-01T00:00:00.000000000'],
['2198-08-01T00:00:00.000000000', '2198-09-01T00:00:00.000000000'],
['2198-09-01T00:00:00.000000000', '2198-10-01T00:00:00.000000000'],
['2198-10-01T00:00:00.000000000', '2198-11-01T00:00:00.000000000'],
['2198-11-01T00:00:00.000000000', '2198-12-01T00:00:00.000000000'],
['2198-12-01T00:00:00.000000000', '2199-01-01T00:00:00.000000000']],
dtype='datetime64[ns]')
Coordinates:
* time (time) datetime64[ns] 2198-01-16T12:00:00 ... 2198-12-16T12:00:00
* nv (nv) float64 1.0 2.0
Attributes:
long_name: time axis boundaries
>>>
You are right - I must have read in the wrong file somehow. Sorry. It is now perfect.
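A quick way to tell a fixed file from a shifted one is to check that every monthly time bound sits at midnight on the first of a month. The sketch below uses only the standard library and works on timestamp strings as xarray prints them. The function name off_month_start is hypothetical, and the check assumes monthly bounds such as those in ocean_month.nc.

```python
from datetime import datetime

def off_month_start(bounds):
    """Return the timestamps that are not at 00:00:00 on the first of a month."""
    bad = []
    for stamp in bounds:
        # Drop the fractional seconds so fromisoformat accepts the string.
        t = datetime.fromisoformat(stamp[:19])
        if (t.day, t.hour, t.minute, t.second) != (1, 0, 0, 0):
            bad.append(stamp)
    return bad

# Values copied from the outputs quoted in this thread.
fixed_bounds = ["2198-01-01T00:00:00.000000000", "2198-02-01T00:00:00.000000000"]
shifted_bounds = ["2256-12-30T00:00:00.000000000", "2257-01-30T00:00:00.000000000"]

print(off_month_start(fixed_bounds))    # []
print(off_month_start(shifted_bounds))  # both shifted timestamps
```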
I have noticed that splitvar saves dates in the ocean files with an offset of 2 days. I think. To see this, load a data file, any file, like this:
The time array on this looks OK, it is:
array([['2256-12-30T00:00:00.000000000', '2257-01-30T00:00:00.000000000'],
['2257-01-30T00:00:00.000000000', '2257-02-27T00:00:00.000000000'],
['2257-02-27T00:00:00.000000000', '2257-03-30T00:00:00.000000000'],
['2257-03-30T00:00:00.000000000', '2257-04-29T00:00:00.000000000'],
['2257-04-29T00:00:00.000000000', '2257-05-30T00:00:00.000000000'],
['2257-05-30T00:00:00.000000000', '2257-06-29T00:00:00.000000000'],
['2257-06-29T00:00:00.000000000', '2257-07-30T00:00:00.000000000'],
['2257-07-30T00:00:00.000000000', '2257-08-30T00:00:00.000000000'],
['2257-08-30T00:00:00.000000000', '2257-09-29T00:00:00.000000000'],
['2257-09-29T00:00:00.000000000', '2257-10-30T00:00:00.000000000'],
['2257-10-30T00:00:00.000000000', '2257-11-29T00:00:00.000000000'],
['2257-11-29T00:00:00.000000000', '2257-12-30T00:00:00.000000000']],
dtype='datetime64[ns]')