xpublish-community / xpublish

Publish Xarray Datasets via a REST API.
https://xpublish.readthedocs.io
Apache License 2.0

Convert zarr dependencies to utils, update numpy chunk encoding #259

Closed: mpiannucci closed this 5 months ago

mpiannucci commented 5 months ago

We have been working on a subset router (https://github.com/asascience-open/xreds/blob/fa3aa81e398c280cef34fd6e0846880df0bb2aef/xreds/plugins/subset_plugin.py#L138) that introduces a nested dataset router. The core zarr plugin did not work with it because the zmetadata and zvariable dependencies rely on xpublish's global get_dataset dependency. This PR moves those functions to utils and removes the Depends functionality. If there is a better way to handle this, I am all ears. I was not sure why they were dependencies in the first place, so this approach may be incorrect.
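
To illustrate the idea, here is a minimal, framework-free sketch of why a plain utility composes with a nested router where a globally resolved dependency does not. It is not the xpublish or xreds code: `DATASETS`, `build_metadata`, and `subset_metadata` are hypothetical names, plain dicts stand in for Xarray datasets, and the `get_dataset` shown here is only a stand-in for the real dependency.

```python
from typing import Dict

# Stand-in for a dataset: a plain dict of variable name -> values (assumption).
Dataset = Dict[str, list]

# Hypothetical registry of top-level datasets.
DATASETS: Dict[str, Dataset] = {
    "ocean": {"temp": [1, 2, 3, 4], "salt": [5, 6, 7, 8]},
}


def get_dataset(dataset_id: str) -> Dataset:
    """Stand-in for a global dataset dependency: it only knows top-level datasets."""
    return DATASETS[dataset_id]


def build_metadata(dataset: Dataset) -> dict:
    """Plain utility: describes whichever dataset the caller passes in."""
    return {name: {"shape": (len(values),)} for name, values in dataset.items()}


def subset_metadata(dataset_id: str, start: int, stop: int) -> dict:
    """Stand-in for a nested route: derive a subset, then reuse the utility."""
    parent = get_dataset(dataset_id)
    subset = {name: values[start:stop] for name, values in parent.items()}
    # The utility receives the derived dataset explicitly, so it never
    # falls back to the top-level lookup.
    return build_metadata(subset)
```

Because `build_metadata` takes the dataset as an argument, the nested route can hand it a derived dataset. If the metadata function resolved its dataset through the global lookup, it would always describe the parent.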

This also includes a patch for numpy arrays when using the zarr router (https://github.com/xpublish-community/xpublish/issues/207). In some cases (especially kerchunk-concatenated datasets) a dataset may contain a mix of numpy and dask arrays, and the numpy arrays may carry encoding information even if the encoding is reset beforehand. This PR forces the chunk encoding to match the array shape when the underlying array is not a dask array.
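
The rule described above can be sketched roughly as follows. This is not the code in this PR: `resolve_chunk_encoding` is a hypothetical helper, the array is represented only by its shape and an optional dask-style chunks tuple, and taking the first block size per dimension for dask arrays is an assumption made for the sketch.

```python
from typing import Optional, Sequence, Tuple


def resolve_chunk_encoding(
    shape: Tuple[int, ...],
    encoding: dict,
    dask_chunks: Optional[Sequence[Sequence[int]]] = None,
) -> dict:
    """Return a copy of `encoding` whose 'chunks' entry is consistent with the array.

    Hypothetical helper. `dask_chunks` mimics the per-dimension block sizes
    of a dask array and is None for an in-memory (numpy-backed) array.
    """
    resolved = dict(encoding)  # never mutate the caller's encoding
    if dask_chunks is None:
        # Not a dask array: ignore any stale 'chunks' entry and treat the
        # whole array as a single chunk that matches its shape.
        resolved["chunks"] = tuple(shape)
    else:
        # Dask array (assumption): use the first block size on each dimension.
        resolved["chunks"] = tuple(dim_chunks[0] for dim_chunks in dask_chunks)
    return resolved


# A numpy-backed variable that kept a stale chunk encoding from its source.
stale = {"chunks": (1, 10), "dtype": "float32"}
numpy_encoding = resolve_chunk_encoding((4, 10), stale)
dask_encoding = resolve_chunk_encoding((4, 10), stale, dask_chunks=((2, 2), (10,)))
```

Here `numpy_encoding["chunks"]` becomes `(4, 10)` regardless of the stale value, while the dask-backed case keeps its own block sizes.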

mpiannucci commented 5 months ago

pre-commit.ci autofix