Closed mmazloff closed 6 years ago
Could you post the output of
ncdump -h
on the netcdf file that comes out of python?
On Mar 13, 2018, at 5:11 PM, Matt Mazloff notifications@github.com wrote:
Hello
I am writing variables to netcdf and finding that the time is being read in Matlab as the iteration number instead of time in seconds.
Here is what I do
from xmitgcm import open_mdsdataset grid_dir = 'BIN2/' data_dir = 'OUTPUT/' ds = open_mdsdataset(data_dir, grid_dir, iters='all', delta_t=3600, ignore_unknown_vars=False, prefix=['diag_state'], ref_date="2008-1-1 0:0:0",geometry='sphericalpolar')
print(ds)
Dimensions: (XC: 1080, XG: 1080, YC: 315, YG: 315, Z: 52, Zl: 52, Zp1: 53, Zu: 52, time: 60) Coordinates: iter (time) int64 dask.array time (time) datetime64[ns] 2008-01-31T10:00:00 2008-03-01T20:00:00 ... XC (XC) float32 0.166667 0.5 0.833333 1.16667 1.5 1.83333 2.16667 ... YC (YC) float32 -77.9652 -77.8956 -77.8255 -77.755 -77.6841 ... ... THETA (time, Z, YC, XC) float32 dask.array ds['THETA'].to_netcdf('Theta.nc') Note time is correct here: ds.time array(['2008-01-31T10:00:00.000000000', '2008-03-01T20:00:00.000000000', '2008-04-01T07:00:00.000000000', '2008-05-01T17:00:00.000000000', '2008-06-01T03:00:00.000000000', '2008-07-01T13:00:00.000000000', '2008-07-31T23:00:00.000000000', '2008-08-31T10:00:00.000000000', '2008-09-30T20:00:00.000000000', '2008-10-31T06:00:00.000000000', ... But matlab shows ncdisp('Theta.nc') Source: /data/SOSE/SOSE/SO3/ITER105/bsose_i105_2008to2012_3day_Theta.nc Format: netcdf4 Global Attributes: _NCProperties = 'version=1|netcdflibversion=4.4.1|hdf5libversion=1.8.17' Dimensions: time = 609 XC = 1080 YC = 315 Z = 52 Variables: iter Size: 609x1 Dimensions: time Datatype: int64 Attributes: long_name = 'model timestep number' standard_name = 'timestep' time Size: 609x1 Dimensions: time Datatype: int64 Attributes: long_name = 'Time' standard_name = 'time' axis = 'T' units = 'days since 2008-01-04 00:00:00' calendar = 'proleptic_gregorian' and when I read this time gives iter: t = ncread('Theta.nc','time')' t = Columns 1 through 8 0 730 1461 2191 .... Any advice? Thanks Matt — You are receiving this because you are subscribed to this thread. Reply to this email directly, view it on GitHub, or mute the thread.
ncdump -h Theta.nc netcdf Theta { dimensions: time = 60 ; XC = 1080 ; YC = 315 ; Z = 52 ; variables: int64 iter(time) ; iter:long_name = "model timestep number" ; iter:standard_name = "timestep" ; int64 time(time) ; time:long_name = "Time" ; time:standard_name = "time" ; time:axis = "T" ; time:units = "hours since 2008-01-31 10:00:00" ; time:calendar = "proleptic_gregorian" ; float XC(XC) ; XC:_FillValue = NaNf ; XC:coordinate = "YC XC" ; XC:units = "degrees_east" ; XC:standard_name = "longitude" ; XC:long_name = "longitude" ; XC:axis = "X" ; float YC(YC) ; YC:_FillValue = NaNf ; YC:coordinate = "YC XC" ; YC:units = "degrees_north" ; YC:standard_name = "latitude" ; YC:long_name = "latitude" ; YC:axis = "Y" ; float Z(Z) ; Z:_FillValue = NaNf ; Z:units = "m" ; Z:positive = "down" ; Z:standard_name = "depth" ; Z:long_name = "vertical coordinate of cell center" ; Z:axis = "Z" ; float Depth(YC, XC) ; Depth:_FillValue = NaNf ; Depth:coordinate = "XC YC" ; Depth:units = "m" ; Depth:standard_name = "ocean_depth" ; Depth:long_name = "ocean depth" ; float rA(YC, XC) ; rA:_FillValue = NaNf ; rA:coordinate = "YC XC" ; rA:units = "m2" ; rA:standard_name = "cell_area" ; rA:long_name = "cell area" ; float drF(Z) ; drF:_FillValue = NaNf ; drF:units = "m" ; drF:long_name = "cell z size" ; drF:standard_name = "cell_z_size" ; float hFacC(Z, YC, XC) ; hFacC:_FillValue = NaNf ; hFacC:long_name = "vertical fraction of open cell" ; hFacC:standard_name = "cell_vertical_fraction" ; float THETA(time, Z, YC, XC) ; THETA:_FillValue = NaNf ; THETA:units = "degC" ; THETA:long_name = "Potential Temperature" ; THETA:standard_name = "THETA" ; THETA:coordinates = "hFacC Depth rA iter drF" ;
// global attributes: :_NCProperties = "version=1|netcdflibversion=4.4.1|hdf5libversion=1.8.17" ; }
And I'll note ncdump -v time Theta.nc time = 0, 730, 1461, 2191, 2921, 3651, 4381, 5112, 5842, 6572, 7302, 8032, 8763, 9493, 10223, 10953, 11683, 12414, 13144, 13874, 14604, 15334, 16065, 16795, 17525, 18255, 18985, 19716, 20446, 21176, 21906, 22636, 23367, 24097, 24827, 25557, 26287, 27018, 27748, 28478, 29208, 29938, 30669, 31399, 32129, 32859, 33589, 34320, 35050, 35780, 36510, 37240, 37971, 38701, 39431, 40161, 40891, 41622, 42352, 43082 ;
So it really is writing this....
A clue! If I only read one record it does write time as time!
ds = open_mdsdataset(data_dir, grid_dir, iters='all', delta_t=3600, ignore_unknown_vars=False, prefix=['diag_state'], ref_date="2008-1-1 0:0:0",geometry='sphericalpolar') ds['THETA'].to_netcdf('/data/mmazloff/TEST.nc') time = 0, 730, 1461, 2191, 2921, 3651, 4381, 5112, 5842, 6572, 7302, 8032, 8763, 9493, 10223, 10953, 11683, 12414, 13144, 13874, 14604, 15334, 16065, 16795, 17525, 18255, 18985, 19716, 20446, 21176, 21906, 22636, 23367, 24097, 24827, 25557, 26287, 27018, 27748, 28478, 29208, 29938, 30669, 31399, 32129, 32859, 33589, 34320, 35050, 35780, 36510, 37240, 37971, 38701, 39431, 40161, 40891, 41622, 42352, 43082 ;
BUT
ds = open_mdsdataset(data_dir, grid_dir, iters=1460, delta_t=3600, ignore_unknown_vars=False, prefix=['diag_state'], ref_date="2008-1-1 0:0:0",geometry='sphericalpolar') ds['THETA'].to_netcdf('/data/mmazloff/TEST.nc') time = 5256000 ;
So iters='all' fails but iters=1460 works
Matt, sorry for the slow response. FYI, this is fundamentally an xarray issue, not an xmitgcm issue.
The key clue is in the metadata attributes. Xarray, via netcdf4-python, is writing the time variable as
int64 time(time) ;
time:long_name = "Time" ;
time:standard_name = "time" ;
time:axis = "T" ;
time:units = "hours since 2008-01-31 10:00:00" ;
time:calendar = "proleptic_gregorian" ;
Note the value of time:units
.
This is a perfectly valid time encoding according to CF conventions. The numbers are correct, the units are just not what you wish (seconds). If you were to-reopen the netcdf file using xarray, it would see the correct time.
Xarray's choices about encoding are described in the encoding documentation:
By default, xarray uses the ‘proleptic_gregorian’ calendar and units of the smallest time difference between values, with a reference time of the first time value.
In your case this is evidently hours.
The problem is that you are opening it in MATLAB but neglecting the units attribute.
So you have two choices
If you want to have different units for time, you can explicitly set the encoding when you call to_netcdf
. For example:
encoding = {'time': {'units': 'seconds since 2008-01-01'}}
ds['Theta'].to_netcdf('Theta.nc', encoding=encoding)
Excellent - this: encoding = {'time': {'units': 'seconds since 2008-01-01'}} ds['Theta'].to_netcdf('Theta.nc', encoding=encoding) worked! Now its giving seconds using both ncdump and matlab's ncread
Thanks! Matt
Glad we could sort it out.
You don't have to sign your github comments. We can see who you are! 😝
Hello
I am writing variables to netcdf and finding that the time is being read in Matlab as the iteration number instead of time in seconds.
Here is what I do
Note time is correct here:
But matlab shows ncdisp('Theta.nc') Source: Theta.nc Format: netcdf4 Global Attributes: _NCProperties = 'version=1|netcdflibversion=4.4.1|hdf5libversion=1.8.17' Dimensions: time = 609 XC = 1080 YC = 315 Z = 52 Variables: iter Size: 609x1 Dimensions: time Datatype: int64 Attributes: long_name = 'model timestep number' standard_name = 'timestep' time Size: 609x1 Dimensions: time Datatype: int64 Attributes: long_name = 'Time' standard_name = 'time' axis = 'T' units = 'days since 2008-01-04 00:00:00' calendar = 'proleptic_gregorian'
and when I read this time gives iter:
t = ncread('Theta.nc','time')'
t =
Columns 1 through 8
Any advice?
Thanks Matt