Open abunimeh opened 1 year ago
@abunimeh - Thanks for opening this issue. Can you expand on the feature a bit more? What API would you like to see? `ds.to_netcdf(..., track_order=False)`?
I suspect this will need to be treated like `invalid_netcdf`, as it will only apply to the `h5netcdf` backend.
_Note: it would be nice if we had `backend_kwargs` on `to_netcdf`, since the options that scipy/netcdf4/h5netcdf support are increasingly different._
First, I totally agree with @jhamman on having `backend_kwargs` on `to_netcdf`.
For this particular use case: netcdf-c/netCDF4-python create HDF5 files (`NETCDF4` format) with track order enabled, as required by the format specification, see https://docs.unidata.ucar.edu/netcdf-c/current/file_format_specifications.html#creation_order.
`h5netcdf` has used `track_order=True` as its default since version 1.1.0. There have been (and still are, https://github.com/HDFGroup/hdf5/issues/1388) some corner-case issues upstream which `netcdf-c` can somehow circumvent, but `h5netcdf` can't. Nevertheless, to be compliant with `netcdf-c`, `track_order=True` is the default for `h5netcdf`.
@abunimeh As a workaround until this is sorted out, you could create the file (or sub-group) using `h5py`/`h5netcdf` with `track_order=False`. If a file (root group) or a sub-group in a file is created with `track_order=False`, the setting is persistent, because it is fixed at group-define time. Then you can use `to_netcdf` as usual with `mode="a"` to append.

    import xarray as xr
    import h5netcdf
    from time import sleep

    ds = xr.Dataset(data_vars=dict(hello=(["x"], [1., 1., 1., 1., 1.])))
    track_order = False
    group = "/track"

    # Create the file without order tracking; add the sub-group unless
    # the root group ("/") is the target.
    with h5netcdf.File("sample1.nc", "a", track_order=track_order) as f1:
        if group.split("/")[-1]:
            f1.create_group(group)
    # Append the dataset to the already defined group.
    ds.to_netcdf("sample1.nc", mode="a", engine="h5netcdf", group=group)

    # Write the second file a few seconds later.
    sleep(5)

    with h5netcdf.File("sample2.nc", "a", track_order=track_order) as f2:
        if group.split("/")[-1]:
            f2.create_group(group)
    ds.to_netcdf("sample2.nc", mode="a", engine="h5netcdf", group=group)

Update: Use `mode="a"` everywhere.
Update 2: Caveat: You will not be able to append to this file with netcdf-c/netCDF4-python ever again.
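
To check whether the two files written by the workaround really are byte-identical, their checksums can be compared. This is only a minimal sketch using the standard library: the helper names `md5sum` and `same_bytes` are made up for illustration, and the file names are the ones from the snippet above. Whether the digests match depends on the `track_order` setting actually taking effect, so treat the result as a check and not as a guarantee.

```python
import hashlib
from pathlib import Path


def md5sum(path, chunk_size=1 << 20):
    """Return the hex MD5 digest of a file, reading it in chunks."""
    digest = hashlib.md5()
    with open(path, "rb") as f:
        # iter() with a sentinel stops once read() returns empty bytes.
        for chunk in iter(lambda: f.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def same_bytes(path_a, path_b):
    """True if both files have the same MD5 digest."""
    return md5sum(path_a) == md5sum(path_b)


if __name__ == "__main__":
    # File names assumed from the workaround snippet; skipped if absent.
    files = ["sample1.nc", "sample2.nc"]
    if all(Path(p).exists() for p in files):
        for p in files:
            print(p, md5sum(p))
        print("identical:", same_bytes(*files))
```

Running this once with `track_order = False` and once with `track_order = True` in the workaround makes the difference reported in the issue visible without any HDF5 tooling.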
Thanks @kmuehlbauer for explaining this.
@jhamman Yes, I was hoping that I could pass `ds.to_netcdf(..., track_order=False)` when the engine is `h5netcdf`.
It would be nice to enhance `backend_kwargs`.
Is your feature request related to a problem?
When using `h5netcdf` as a backend, writing the exact same content to two different files results in a different md5 checksum for each of the two otherwise identical files. See https://github.com/h5netcdf/h5netcdf/issues/211
Describe the solution you'd like
When saving an nc file, allow `track_order=False` to be passed as an argument.
Describe alternatives you've considered
Using the `netcdf4` engine.
Additional context
No response