intake / intake-esm

An intake plugin for parsing an Earth System Model (ESM) catalog and loading assets into xarray datasets.
https://intake-esm.readthedocs.io
Apache License 2.0

Intake_esm feature: Support for Kerchunk (reference) on s3 file system #603

Open dwest77a opened 1 year ago

dwest77a commented 1 year ago

Related Issue: An intake_esm catalog hosted on an S3 file system, whose entries point to Kerchunk reference files hosted on the same file system, cannot be opened by the existing intake_esm code.

Solution: A small change to line 53 onwards in intake_esm/source.py:

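if data_format == 'reference':
    if urlpath.startswith('s3://'):
        # Reference file lives on S3: open it through s3fs so the S3 options in kwargs are applied
        import s3fs
        xarray_open_kwargs['backend_kwargs']['storage_options']['fo'] = s3fs.S3FileSystem(**kwargs).open(urlpath, **kwargs)
    else:
        # Any other location: pass the path string through unchanged
        xarray_open_kwargs['backend_kwargs']['storage_options']['fo'] = urlpath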

Alternatives: There may be other options, or existing intake_esm features, that I'm not taking advantage of. If anyone has any insight, let me know.

The change above relates to a later section of the code: fsspec.get_mapper('reference://', fo=ref ...). For a Kerchunk file on S3, ref needs to be a file object opened through S3FileSystem rather than the string path.
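One possible alternative, sketched here as an illustration and not tested against intake_esm: fsspec's reference file system also accepts `target_options` (how to open the reference file itself) and `remote_protocol` / `remote_options` (how to read the data chunks it points to), so the path string can stay in `fo`. The helper name `reference_storage_options` and the example bucket are hypothetical, and the sketch assumes the data chunks sit on the same S3 store as the reference file.

```python
from typing import Any, Optional


def reference_storage_options(urlpath: str, s3_options: Optional[dict] = None) -> dict:
    """Build storage options for fsspec's 'reference' file system.

    Hypothetical helper for illustration; not part of intake_esm.
    """
    # 'fo' stays a plain string; fsspec opens it using target_options.
    options: dict = {"fo": urlpath}
    if urlpath.startswith("s3://"):
        # Options used to open the reference (Kerchunk) file itself.
        options["target_options"] = dict(s3_options or {})
        # Options used to read the chunks the references point to.
        # Assumption: the chunks are on the same S3 store.
        options["remote_protocol"] = "s3"
        options["remote_options"] = dict(s3_options or {})
    return options


# Example with a placeholder bucket and anonymous access.
opts = reference_storage_options("s3://example-bucket/ref.json", {"anon": True})
print(opts)

# The result would then be passed on in the usual Kerchunk pattern, e.g.
#   xr.open_dataset("reference://", engine="zarr",
#                   backend_kwargs={"consolidated": False, "storage_options": opts})
```

Whether this fits the authenticated use case depends on whether the credentials can be expressed as plain options; passing an already opened file object, as in the change above, avoids that question.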

agstephens commented 1 year ago

Hi @andersy005, hope you are well. Does this fix look okay to you? It would help us with providing access to Kerchunk files over S3 (with authz). Thanks