Closed zxdawn closed 2 years ago
Note that I can't reproduce it using this example:
I could be wrong, but it appears that when you introduce a _FillValue
in your DataArray, you end up with the same outcome:
In [53]: import numpy as np
...: import xarray as xr
...:
...: da = xr.DataArray(np.array([1,2,4294967295], dtype='uint')).rename('test_array')
In [56]: da.encoding['_FillValue'] = 4294967295
In [62]: da.to_netcdf("test.nc", engine='netcdf4')
In [63]: !ncdump -h test.nc
netcdf test {
dimensions:
    dim_0 = 3 ;
variables:
    uint64 test_array(dim_0) ;
        test_array:_FillValue = 4294967295ULL ;
data:
    test_array = 1, 2, _ ;
}
In [64]: from netCDF4 import Dataset
...: d = Dataset("test.nc")
In [65]: d
Out[65]:
<class 'netCDF4._netCDF4.Dataset'>
root group (NETCDF4 data model, file format HDF5):
dimensions(sizes): dim_0(3)
variables(dimensions): uint64 test_array(dim_0)
groups:
In [66]: xr.open_dataset('test.nc')
Out[66]:
<xarray.Dataset>
Dimensions: (dim_0: 3)
Dimensions without coordinates: dim_0
Data variables:
test_array (dim_0) float64 ...
In [67]: xr.open_dataset('test.nc').test_array
Out[67]:
<xarray.DataArray 'test_array' (dim_0: 3)>
array([ 1., 2., nan])
Dimensions without coordinates: dim_0
Notice that xarray is using np.NaN as a sentinel value for the missing / fill_values. Because np.NaN is a float, this forces the entire array of integers to become floating-point numbers...
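To make the promotion concrete, here is a minimal NumPy-only sketch of the same effect. It does not reproduce xarray's actual decoding code; the names `raw`, `fill_value`, `masked` and `kept` are made up for illustration, and the array simply mirrors the values from the example above.

```python
import numpy as np

# Hypothetical stand-in for the variable on disk: uint64 data plus its fill value.
raw = np.array([1, 2, 4294967295], dtype="uint64")
fill_value = np.uint64(4294967295)

# Replacing the fill value with NaN: the result needs a dtype that can hold
# both the integers and NaN, so the whole array is converted to float64.
masked = np.where(raw == fill_value, np.nan, raw)
print(masked.dtype)  # float64

# Keeping the integers: carry the mask alongside the data instead of
# writing NaN into it, so the dtype stays uint64.
kept = np.ma.masked_equal(raw, fill_value)
print(kept.dtype)  # uint64
```

If the integer dtype has to survive reading the file, xr.open_dataset accepts mask_and_scale=False, which skips the fill-value masking and leaves 4294967295 in the data as an ordinary value.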
Ha, thanks. It makes sense now. Shall we close this?
Great! I'm closing this for the time being...
What happened:
The uint data type variables are read as float64 instead of the correct uint type.
Minimal Complete Verifiable Example:
Anything else we need to know?:
The sample data is attached here. The output of ncdump -h test_save.nc:
Note that I can't reproduce it using this example:
Environment:
Output of xr.show_versions()
INSTALLED VERSIONS
------------------
commit: None
python: 3.9.7 | packaged by conda-forge | (default, Sep 29 2021, 19:20:46) [GCC 9.4.0]
python-bits: 64
OS: Linux
OS-release: 5.11.0-40-generic
machine: x86_64
processor: x86_64
byteorder: little
LC_ALL: None
LANG: en_US.UTF-8
LOCALE: ('en_US', 'UTF-8')
libhdf5: 1.12.1
libnetcdf: 4.8.1

xarray: 0.20.1
pandas: 1.3.4
numpy: 1.20.3
scipy: 1.7.3
netCDF4: 1.5.8
pydap: None
h5netcdf: None
h5py: 3.6.0
Nio: None
zarr: 2.10.3
cftime: 1.5.1.1
nc_time_axis: None
PseudoNetCDF: None
rasterio: 1.2.10
cfgrib: None
iris: None
bottleneck: None
dask: 2021.11.2
distributed: 2021.11.2
matplotlib: 3.5.0
cartopy: 0.20.1
seaborn: None
numbagg: None
fsspec: 2021.11.1
cupy: None
pint: 0.18
sparse: None
setuptools: 59.4.0
pip: 21.3.1
conda: 4.11.0
pytest: None
IPython: 7.30.0
sphinx: None