christine-e-smit opened 4 months ago
The geotiff-like zarr store does not appear to work, but this is not terribly surprising. Both nco and the python netCDF4 library use the NetCDF-C library to open zarr. I was unable to open this zarr store with the python netCDF4 library (https://github.com/zarr-developers/geozarr-spec/issues/39), so there's a good chance the issue is in the NetCDF-C library.
Using the zarr store without compression, I was able to use ncks to subset it into a new zarr store:
ncks -d latitude,0.,10. "file:///YOUR/PATH/zarr_no_compression.zarr#mode=nczarr,zarr" "file:///YOUR/PATH/subset.zarr#mode=nczarr,zarr"
I was then able to open this store with xarray:
In [1]: import xarray as xr
In [2]: ds = xr.open_zarr('subset.zarr')
<ipython-input-2-89d3eedb3e59>:1: RuntimeWarning: Failed to open Zarr store with consolidated metadata, but successfully read with non-consolidated metadata. This is typically much slower for opening a dataset. To silence this warning, consider:
1. Consolidating metadata in this existing store with zarr.consolidate_metadata().
2. Explicitly setting consolidated=False, to avoid trying to read consolidate metadata, or
3. Explicitly setting consolidated=True, to raise an error in this case instead of falling back to try reading non-consolidated metadata.
ds = xr.open_zarr('subset.zarr')
In [3]: ds
Out[3]:
<xarray.Dataset>
Dimensions: (latitude: 40, nv: 2, longitude: 1440, time: 2)
Coordinates:
* latitude (latitude) float32 0.125 0.375 0.625 ... 9.375 9.625 9.875
* longitude (longitude) float32 -179.9 -179.6 -179.4 ... 179.6 179.9
* time (time) datetime64[ns] 2000-01-01 2000-01-02
Dimensions without coordinates: nv
Data variables:
latitude_bounds (latitude, nv) float32 dask.array<chunksize=(40, 2), meta=np.ndarray>
longitude_bounds (longitude, nv) float32 dask.array<chunksize=(1440, 2), meta=np.ndarray>
time_bounds (time, nv) datetime64[ns] dask.array<chunksize=(2, 2), meta=np.ndarray>
variable (latitude, longitude, time) float32 dask.array<chunksize=(40, 720, 1), meta=np.ndarray>
Attributes:
history: Thu Feb 8 12:09:18 2024: ncks -d latitude,0.,10. file:///Users...
NCO: netCDF Operators version 5.1.9 (Homepage = http://nco.sf.net, C...
In [4]: ds['latitude']
Out[4]:
<xarray.DataArray 'latitude' (latitude: 40)>
array([0.125, 0.375, 0.625, 0.875, 1.125, 1.375, 1.625, 1.875, 2.125, 2.375,
2.625, 2.875, 3.125, 3.375, 3.625, 3.875, 4.125, 4.375, 4.625, 4.875,
5.125, 5.375, 5.625, 5.875, 6.125, 6.375, 6.625, 6.875, 7.125, 7.375,
7.625, 7.875, 8.125, 8.375, 8.625, 8.875, 9.125, 9.375, 9.625, 9.875],
dtype=float32)
Coordinates:
* latitude (latitude) float32 0.125 0.375 0.625 0.875 ... 9.375 9.625 9.875
Attributes:
bounds: latitude_bnds
standard_name: latitude
units: degrees_north
I tried using ncks to read something from s3 and ran into this error:
> ncks -m "s3://us-west-2.opendata.source.coop/zarr/geozarr-tests/zarr_no_compression.zarr"
ncks: ERROR file "s3://us-west-2.opendata.source.coop/zarr/geozarr-tests/zarr_no_compression.zarr" not found. It does not exist on the local filesystem, nor does it match remote filename patterns (e.g., http://foo or foo.bar.edu:file).
ncks: HINT file-not-found errors usually arise from filename typos, incorrect paths, missing files, or capricious gods. Please verify spelling and location of requested file. If the file resides on a High Performance Storage System (HPSS) accessible via the 'hsi' command, then add the --hpss option and re-try command.
(nco) gs6102m1csmit:~/Projects/zarr_nyc_2024/nco % ncks -m "s3://us-west-2.opendata.source.coop/zarr/geozarr-tests/zarr_no_compression.zarr#mode=nczarr,zarr"
ncks: INFO nco_fl_mk_lcl() failed to nc_open() this Zarr-scheme file even though NCZarr is enabled. HINT: Check that filename adheres to this syntax: scheme://host:port/path?query#fragment and that filename exists. NB: s3 scheme requires that netCDF be configured with --enable-nczarr-s3 option.
HINT: As of 20230321, a known problem is that NCO (and ncdump) have trouble reading compressed NCZarr datasets. This can manifest as error code -137, "NetCDF: NCZarr error". If the next line reports that error, the error may be due to this issue, i.e., to a codec issue uncompressing the dataset:
Translation into English with nc_strerror(-128) is "NetCDF: Attempt to use feature that was not turned on when netCDF was built."
ncks: ERROR file "s3://us-west-2.opendata.source.coop/zarr/geozarr-tests/zarr_no_compression.zarr#mode=nczarr,zarr" not found. It does not exist on the local filesystem, nor does it match remote filename patterns (e.g., http://foo or foo.bar.edu:file).
ncks: HINT file-not-found errors usually arise from filename typos, incorrect paths, missing files, or capricious gods. Please verify spelling and location of requested file. If the file resides on a High Performance Storage System (HPSS) accessible via the 'hsi' command, then add the --hpss option and re-try command.
Using ncks version 5.1.9, which is part of the NCO tools, I was able to list a zarr store.
Data used: https://github.com/zarr-developers/geozarr-spec/issues/36
To reproduce: