intake / intake-xarray

Intake plugin for xarray
https://intake-xarray.readthedocs.io/
BSD 2-Clause "Simplified" License

Reading a list of zarr stores with intake-xarray #110

Closed sebastienlanglois closed 2 years ago

sebastienlanglois commented 2 years ago

Hi everyone and thank you for this great library!

Use case

I have a use case in which I would like to pass a list of zarr stores located in S3 to xr.open_mfdataset([ ]..., engine='zarr'). As far as I understand, this is not yet possible with intake-xarray because the ZarrSource class assumes a single path as urlpath. In early 2021, there were pull requests to address this issue, but they were abandoned because xarray was going to provide the logic for this use case. I believe that this has now been added in xarray.
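For reference, the call described above might look like the following minimal sketch. The bucket paths, the `anon` option, and the helper names `zarr_open_kwargs` and `open_stores` are hypothetical and only there for illustration; the sketch assumes xarray, zarr, dask and s3fs are installed.

```python
from typing import Any, Dict, List, Optional


def zarr_open_kwargs(storage_options: Optional[Dict[str, Any]] = None) -> Dict[str, Any]:
    """Build keyword arguments for reading zarr stores through xarray."""
    kwargs: Dict[str, Any] = {"engine": "zarr"}
    if storage_options:
        # Passed on by xarray's zarr backend to fsspec (e.g. S3 credentials).
        kwargs["backend_kwargs"] = {"storage_options": storage_options}
    return kwargs


def open_stores(paths: List[str], storage_options: Optional[Dict[str, Any]] = None):
    """Lazily open several zarr stores as one dataset (hypothetical helper)."""
    import xarray as xr  # imported here so this module loads without xarray

    return xr.open_mfdataset(paths, **zarr_open_kwargs(storage_options))


# Hypothetical store locations, for illustration only.
EXAMPLE_PATHS = ["s3://my-bucket/store_a.zarr", "s3://my-bucket/store_b.zarr"]
# ds = open_stores(EXAMPLE_PATHS, storage_options={"anon": True})
```

The final call is left commented out because it needs network access and real stores.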

How I addressed the use case

I was able to successfully address my use case by :

Question

I was wondering if we could rewrite the ZarrSource class along the lines of NetCDFSource, but with the fsspec logic removed from intake-xarray, since xarray already takes care of it. This would allow intake-xarray to read one or multiple zarr stores and would address the issue of xr.open_zarr being deprecated. I'm not familiar with the inner workings of intake, so there may be a better way of doing this, but if this approach sounds interesting, I'm available to submit a PR with the required tests.

Thanks!

martindurant commented 2 years ago

I think you are quite right, and the upstream changes in xarray mean that we can simplify the Intake driver. I'm not certain whether there's any extra functionality we provide here that is not now available directly in xarray. I would be happy to see a PR which defers to xarray's open_[mf]dataset.
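A driver that defers to xarray's open_[mf]dataset could be sketched roughly as follows. This is only an illustration under stated assumptions, not the intake-xarray implementation: `ZarrSourceSketch` and `needs_multifile` are hypothetical names, and the class deliberately does not inherit from any intake base class.

```python
from typing import Any, Dict, List, Optional, Union

UrlPath = Union[str, List[str]]


def needs_multifile(urlpath: UrlPath) -> bool:
    """True when urlpath is a list or a glob pattern, i.e. several stores."""
    return isinstance(urlpath, list) or "*" in urlpath


class ZarrSourceSketch:
    """Hypothetical stand-in for a source class; not the intake-xarray API."""

    def __init__(self, urlpath: UrlPath,
                 storage_options: Optional[Dict[str, Any]] = None,
                 **xarray_kwargs: Any) -> None:
        self.urlpath = urlpath
        self.storage_options = storage_options or {}
        self.xarray_kwargs = xarray_kwargs

    def open_kwargs(self) -> Dict[str, Any]:
        """Merge user kwargs with the zarr engine and any storage options."""
        kwargs = dict(self.xarray_kwargs)
        kwargs.setdefault("engine", "zarr")
        if self.storage_options:
            backend = dict(kwargs.get("backend_kwargs", {}))
            backend.setdefault("storage_options", self.storage_options)
            kwargs["backend_kwargs"] = backend
        return kwargs

    def to_dask(self):
        """Open lazily, letting xarray handle the paths and the filesystem."""
        import xarray as xr  # imported lazily; xarray does the real work

        kwargs = self.open_kwargs()
        if needs_multifile(self.urlpath):
            return xr.open_mfdataset(self.urlpath, **kwargs)
        kwargs.setdefault("chunks", {})  # keep the single-store case lazy too
        return xr.open_dataset(self.urlpath, **kwargs)
```

The only decision the class makes itself is whether one store or several were requested; everything else is passed through to xarray.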