intake / intake-esm

An intake plugin for parsing an Earth System Model (ESM) catalog and loading assets into xarray datasets.
https://intake-esm.readthedocs.io
Apache License 2.0

New Fsspec breaks local file serialization #635

Closed aulemahal closed 8 months ago

aulemahal commented 8 months ago


Description

Serializing a catalog to a local filesystem is broken with fsspec 2023.10.0.

What I Did

import intake
url = "https://storage.googleapis.com/cmip6/pangeo-cmip6.json"
cat = intake.open_esm_datastore(url)

cat.serialize('mycopy', catalog_type='file')

and I got:

FileNotFoundError: [Errno 2] No such file or directory: "/home/meself/example/path/('file', 'local'):///home/meself/example/path/mycopy.csv"

The culprit is here: https://github.com/intake/intake-esm/blob/bfdfb5123d1df3d2bcf9b42493d81607e83e547b/intake_esm/cat.py#L185

In the recent fsspec release, the local filesystem accepts both "file" and "local" as protocols. Thus, mapper.fs.protocol is now a tuple rather than a string, and formatting it directly into a path produces the "('file', 'local')://" prefix seen in the error above. I'm not sure whether this was always the case, but the git blame shows that the type annotations reflecting this were added 5 months ago (see https://github.com/fsspec/filesystem_spec/blame/e20f626b87b5bb87d223495a56aefd768272a7ca/fsspec/spec.py#L107).

I see a mapper.fs.unstrip_protocol method that does what our f-string tries to do. I can push a PR with this fix.
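
To illustrate the failure mode without needing fsspec installed, here is a minimal standalone sketch. The `build_url` helper is hypothetical (it is neither the intake-esm code nor fsspec's implementation); it only shows why interpolating a tuple protocol into an f-string gives the bad path, and one way to normalize the protocol first. In the real fix I would expect to rely on mapper.fs.unstrip_protocol instead.

```python
# Standalone illustration; no fsspec import needed.
# `protocol` mimics what the local filesystem reports in fsspec 2023.10.0.
protocol = ("file", "local")
path = "/home/meself/example/path/mycopy.csv"

# What a plain f-string does when the protocol is a tuple:
# the tuple's repr ends up inside the URL.
broken = f"{protocol}://{path}"


def build_url(protocol, path):
    """Hypothetical helper: accept a str or a tuple of protocol names.

    Assumption: when several protocols are listed, the first one is used.
    """
    first = protocol if isinstance(protocol, str) else protocol[0]
    return f"{first}://{path}"


fixed = build_url(protocol, path)
print(broken)
print(fixed)
```

The helper also accepts a plain string protocol, so it would behave the same for filesystems that still report a single protocol name.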

Version information: output of intake_esm.show_versions()

INSTALLED VERSIONS
------------------
cftime: 1.6.2
dask: 2023.3.0
fastprogress: 1.0.3
fsspec: 2023.10.0
gcsfs: 2023.10.0
intake: 0.7.0
intake_esm: 2022.9.18.post7
netCDF4: 1.6.0
pandas: 2.1.1
requests: 2.31.0
s3fs: 2023.10.0
xarray: 2023.10.1
zarr: 2.16.1

aulemahal commented 8 months ago

I should have added that this is the cause of the latest nightly CI failures.