zarr-developers / geozarr-spec

This document aims to provide a geospatial extension to the Zarr specification. Zarr specifies a protocol and format used for storing Zarr arrays, while the present extension defines conventions and recommendations for storing multidimensional georeferenced grids of geospatial observations (including rasters).

Panoply interoperability #38

Open christine-e-smit opened 4 months ago

christine-e-smit commented 4 months ago

Using Panoply version 5.2.9, I am unable to open a zarr store.

Data used: https://github.com/zarr-developers/geozarr-spec/issues/36

Panoply doesn't recognize this as a data type it knows how to open.

rschmunk commented 4 months ago

There are a couple different issues here:

1) Panoply is having trouble opening any zipped dataset, whether it's a zarr directory store or a vanilla netCDF file.

2) If I use a devo copy of Panoply that can select and open a zarr directory, then it runs into trouble with the example "GLDAS-NOAH025-3H" data at #36. It will give me a list of the directory contents and show me their attributes, but it cannot plot any of the variables therein because it seems that a compressor is used that is not available in the Unidata/netcdf-java library.

christine-e-smit commented 4 months ago

@rschmunk - Thanks for taking a look.

  1. It's definitely a bit weird to open zipped datasets. I gave it a try because I couldn't select a folder. Being able to select a folder makes much more sense.
  2. Charlie Zender said the same thing about nco, which uses the NetCDF-C library: https://sourceforge.net/p/nco/discussion/9829/thread/fcb404db8b/#c56d/3c73. https://github.com/zarr-developers/geozarr-spec/issues/36 uses the blosc compressor, which is zarr's default. I wonder if Unidata would consider adding this compressor since I'm guessing a huge fraction of zarr stores out there will be incompatible otherwise.
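
For anyone who wants to check which compressor a given store relies on before trying it in a netCDF-based tool, here is a minimal sketch. It assumes a Zarr v2 directory store, where each array keeps its metadata in a `.zarray` JSON file with a `compressor` entry. The function name `list_compressors` is hypothetical and used only for illustration; it needs nothing beyond the Python standard library.

```python
import json
from pathlib import Path


def list_compressors(store_root):
    """Map each array in a Zarr v2 directory store to its compressor id.

    Assumes the Zarr v2 layout: every array directory holds a `.zarray`
    JSON file whose "compressor" entry is either null or an object with
    an "id" field (for example "blosc" or "zlib").
    """
    root = Path(store_root)
    compressors = {}
    for meta_path in sorted(root.rglob(".zarray")):
        meta = json.loads(meta_path.read_text())
        compressor = meta.get("compressor")
        # Path of the array relative to the store root, e.g. "Tair" or "group/Tair".
        array_name = meta_path.parent.relative_to(root).as_posix()
        # None means the chunks are stored uncompressed.
        compressors[array_name] = compressor["id"] if compressor else None
    return compressors
```

Calling `list_compressors("path/to/store")` on an unzipped copy of the store returns a dictionary of array names to compressor ids, which makes it easy to see whether any array uses a compressor that a given reader lacks.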

rschmunk commented 4 months ago

@christine-e-smit, I'm keeping the open-zarr-as-folder option for devo use only in Panoply until zarr capability in netCDF-Java advances to somewhere past its current optional build status. Besides the problem with missing compressors, there's also the fact that NJ won't "enhance" the data and discover the lon-lat coordinate system because it only supports "pure Zarr v2". It sounds like this geozarr discussion, NCZARR, and I don't know what else are possibly intended to address that.

Regarding the lack of compressors, there's probably an issue of there being too many things that the NJ developers are asked to add to the library, so some of them sit in the category of waiting for an outside developer who needs it to step forward and contribute. blosc support might be one of these, but see e.g. Unidata/netcdf-java#889.

rschmunk commented 4 months ago

I submitted Unidata/netCDF-Java#1307 to address the first problem in Panoply, but there remains some trouble with opening zipped data when the optional zarr code is in the build. And that doesn't begin to address the lack of compressors or useful metadata.

rschmunk commented 2 months ago

Some relatively minor issues with the netCDF-Java library's ability to work with zarr have been resolved following e.g. Unidata/netcdf-java#1325 and Unidata/netcdf-java#1319. But IMO, there's no more to be done to make an NJ-dependent app such as Panoply better able to work with zarr data stores until larger issues such as compressor availability are addressed.

So at this point I'd mark this particular issue as resolved for being out-of-scope. 🙂