intake / intake-esm

An intake plugin for parsing an Earth System Model (ESM) catalog and loading assets into xarray datasets.
https://intake-esm.readthedocs.io
Apache License 2.0

Allow collection to be on file system while catalog is in the cloud #351

Open wachsylon opened 3 years ago

wachsylon commented 3 years ago

When opening an esm_collection JSON file from the local file system:

intake.open_esm_datastore('/home/dkrz/k204210/intake-esm/esm-collections/dkrz_era5_disk_grb_fromcloud.json')

with a URL as the value of the catalog_file entry:

cat /home/dkrz/k204210/intake-esm/esm-collections/dkrz_era5_disk_grb_fromcloud.json
...
  "catalog_file": "https://swift.dkrz.de/v1/dkrz_a44962e3ba914c309a7421573a6949a6/intake-esm/dkrz_era5_disk_grb.csv.gz",

the call fails with:

Unable to find: /home/dkrz/k204210/intake-esm/esm-collections/https:/swift.dkrz.de/v1/dkrz_a44962e3ba914c309a7421573a6949a6/intake-esm/dkrz_era5_disk_grb.csv.gz
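
The path in the error (note the single slash after https:) is consistent with the URL being treated as a relative path, joined onto the collection's directory, and then normalized. The snippet below is only a sketch that reproduces the same string with the standard library; it is an assumption about the cause, not a copy of what intake-esm does internally.

```python
import posixpath

# Values taken from the report above.
collection_dir = "/home/dkrz/k204210/intake-esm/esm-collections"
catalog_file = (
    "https://swift.dkrz.de/v1/dkrz_a44962e3ba914c309a7421573a6949a6"
    "/intake-esm/dkrz_era5_disk_grb.csv.gz"
)

# join() treats the URL as a relative path component;
# normpath() then collapses the "//" after "https:" into a single "/".
mangled = posixpath.normpath(posixpath.join(collection_dir, catalog_file))
print(mangled)
```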

I think there is no reason why this should not work.

andersy005 commented 3 years ago

@wachsylon, I concur with you. This is a bug in the _fetch_catalog function, and we should relax some of the assumptions made in this function:

https://github.com/intake/intake-esm/blob/3092400e7797975ba22795e880385514529ba9ce/intake_esm/utils.py#L64

I am happy to look into it, but if you have time and are interested in working on it, please feel free to submit a PR :)
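
For illustration, one way the assumption could be relaxed is to resolve catalog_file against the collection's location only when it is a relative path, and to pass URLs and absolute paths through unchanged. The helper below, resolve_catalog_file, is a hypothetical name and a minimal sketch, not the existing _fetch_catalog code; it assumes POSIX-style paths (a Windows drive letter such as C: would be misread as a URL scheme).

```python
import posixpath
from urllib.parse import urlparse


def resolve_catalog_file(collection_path: str, catalog_file: str) -> str:
    """Hypothetical helper: decide where the catalog CSV should be read from.

    collection_path: location of the esm_collection JSON (local path or URL).
    catalog_file:    value of the "catalog_file" entry in that JSON.
    """
    # A URL (https://, s3://, ...) stands on its own, wherever the JSON lives.
    if urlparse(catalog_file).scheme:
        return catalog_file
    # An absolute local path also needs no resolution.
    if posixpath.isabs(catalog_file):
        return catalog_file
    # Only a relative entry is resolved against the collection's directory.
    return posixpath.join(posixpath.dirname(collection_path), catalog_file)


local_json = (
    "/home/dkrz/k204210/intake-esm/esm-collections/"
    "dkrz_era5_disk_grb_fromcloud.json"
)
remote_csv = (
    "https://swift.dkrz.de/v1/dkrz_a44962e3ba914c309a7421573a6949a6"
    "/intake-esm/dkrz_era5_disk_grb.csv.gz"
)
print(resolve_catalog_file(local_json, remote_csv))
```

With this approach the combination reported here (local JSON, remote CSV) would return the URL untouched, while a bare file name would still be looked up next to the JSON file.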