pydata / xarray

N-D labeled arrays and datasets in Python
https://xarray.dev
Apache License 2.0

Automatically fall back to fsspec if s3/gcs/adfs path is provided in ``open_mfdataset`` #9723

Closed phofl closed 19 hours ago

phofl commented 2 weeks ago

Is your feature request related to a problem?

Currently, passing a list of s3 (or any other cloud storage) paths to open_mfdataset raises an error, which is not great UX. Users have to open a file handle for every file themselves and pass those in.
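
For illustration, the manual workaround described above might look roughly like the sketch below. The function names `open_remote_handles` and `open_remote_mfdataset` are hypothetical, and `engine="h5netcdf"` is an assumption (one engine that can read file-like objects); this is not code from xarray itself.

```python
def open_remote_handles(urls, **storage_options):
    """Open one binary, file-like handle per remote URL using fsspec."""
    # Imported lazily; assumes fsspec and the matching backend (e.g. s3fs) are installed.
    import fsspec

    # fsspec.open returns an OpenFile; calling .open() yields the actual file-like object.
    return [fsspec.open(url, mode="rb", **storage_options).open() for url in urls]


def open_remote_mfdataset(urls, storage_options=None, **kwargs):
    """Hypothetical wrapper: open the handles, then hand them to open_mfdataset."""
    import xarray as xr

    handles = open_remote_handles(urls, **(storage_options or {}))
    # engine="h5netcdf" is an assumption, not something the issue prescribes.
    return xr.open_mfdataset(handles, engine="h5netcdf", **kwargs)
```

A call would then be something like `open_remote_mfdataset(["s3://my-bucket/a.nc", "s3://my-bucket/b.nc"], storage_options={"anon": True})`, where the bucket name and options are placeholders. The handles stay open until the caller closes them, which is part of why this is awkward for users.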

Describe the solution you'd like

I'd like a solution similar to what open_zarr is doing (in _normalize_store_arg_v2). We could probably factor this out into a helper function and call it in both places. Happy to put up a PR for this.

cc @dcherian FYI

Describe alternatives you've considered

No response

Additional context

No response

dcherian commented 2 weeks ago

We discussed this at the meeting today. There was broad support for the idea, but also a request to limit the implementation to relatively simple, low-level fsspec APIs, since we might want to swap out fsspec for an alternative in the future.

Can you send in a PR please?

Also, it turns out zarr now does the resolving on its own, so we might be able to delete xarray's fsspec code path.

phofl commented 2 weeks ago

Thanks!

Would isolating things in a single helper function be suitable for this? That would keep the scope limited to a single function that you could swap out whenever needed.
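
As a rough sketch of what such a helper could look like (not the implementation that ended up in the PR): the names `is_remote_uri`, `normalize_paths`, and the `opener` parameter are hypothetical here, and the URI regex is a simplifying assumption.

```python
import re

# Assumption: anything that starts with "<scheme>://" counts as a remote URI.
_REMOTE_URI = re.compile(r"^[a-zA-Z][a-zA-Z0-9+.\-]*://")


def is_remote_uri(path) -> bool:
    """Return True for strings such as 's3://bucket/key' or 'gcs://bucket/key'."""
    return isinstance(path, str) and _REMOTE_URI.match(path) is not None


def _open_with_fsspec(url, **storage_options):
    """The only place that touches fsspec, so it can be swapped out later."""
    import fsspec  # lazy import: only needed when a remote URI is actually passed

    return fsspec.open(url, mode="rb", **storage_options).open()


def normalize_paths(paths, opener=_open_with_fsspec, **storage_options):
    """Replace remote URIs with open file handles; leave local paths untouched."""
    return [
        opener(p, **storage_options) if is_remote_uri(p) else p
        for p in paths
    ]
```

Keeping the fsspec call behind a single `opener` means an alternative backend only needs to replace that one function. A real implementation would also need to decide how to treat `http://` URLs, which this sketch simply sends through the same opener.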

dcherian commented 2 weeks ago

Yes that would be great. Thanks!

jrbourbeau commented 3 days ago

I opened up https://github.com/pydata/xarray/pull/9797 for this

dcherian commented 19 hours ago

Closed by #9797