Open ivirshup opened 2 years ago
@ivirshup Thanks for trying this out! I replicated the segfault on my computer (Intel MacBook). The segfault happens as soon as netCDF4 tries to read the zarr file, so while the zarr file itself looks okay, the nczarr implementation breaks when attempting to read it.
After some experimentation, the issue comes from creating Variables that are not tied to Dimensions. netCDF4 allows this use case, and I use these dimensionless variables to store the array dimensions as 1-element arrays. netCDF4+HDF5 seems to handle this fine, so I'm not sure why zarr chokes.
This choice to use dimensionless variables isn't necessary. I could easily store the dimension sizes in the metadata. I'll update the Writer and Reader so zarr becomes usable.
Update
The above doesn't apply anymore. The whole library has been revamped based on the binsparse v1.0 spec discussion.
During that work, I attempted to make it work with nczarr. However, I ran into a bug that is a showstopper. Until that gets fixed, I don't think I can make any progress with zarr support.
Hey @jim22k!
This is more of a comment than an actual issue or request, but just letting you know you almost have zarr support already.
The netcdf4 C library has a zarr implementation (which they are interested in splitting out). This library can write to that trivially. It reads fine as a zarr store, but I get a segmentation fault if I try to read it with netcdf4...
I did have to remove the `zlib` kwarg from this line to write it: https://github.com/jim22k/sscdf/blob/14167c8b6ede9a44d2f61996709062e6e1fc14d3/sscdf.py#L218
Example:
But it segfaults if I try to read it with netcdf. I have not tried to debug this.