NCAR / intake-esm-datastore

Intake-esm Datastore
Apache License 2.0

Relative path in JSON files? #75

Open mnlevy1981 opened 4 years ago

mnlevy1981 commented 4 years ago

Now that intake-esm supports giving the csv.gz location as a path relative to the JSON file, I think every JSON file in catalogs/ that uses a full path should switch to a relative one.

$ cd /glade/collections/cmip/catalog/intake-esm-datastore/catalogs
$ grep catalog_file *.json
campaign-cesm2-cmip6-timeseries.json:  "catalog_file": "/glade/collections/cmip/catalog/intake-esm-datastore/catalogs/campaign-cesm2-cmip6-timeseries.csv.gz",
glade-cesm1-cmip5-timeseries.json:  "catalog_file": "/glade/collections/cmip/catalog/intake-esm-datastore/catalogs/glade-cesm1-cmip5-timeseries.csv.gz",
glade-cesm1-le.json:  "catalog_file": "/glade/collections/cmip/catalog/intake-esm-datastore/catalogs/glade-cesm1-le.csv.gz",
glade-cmip5.json:  "catalog_file": "/glade/collections/cmip/catalog/intake-esm-datastore/catalogs/glade-cmip5.csv.gz",
glade-cmip6.json:  "catalog_file": "/glade/collections/cmip/catalog/intake-esm-datastore/catalogs/glade-cmip6.csv.gz",
mistral-cmip5.json:  "catalog_file": "/home/mpim/m300524/intake-esm-datastore/catalogs/mistral-cmip5.csv.gz",
mistral-cmip6.json:  "catalog_file": "/home/mpim/m300524/intake-esm-datastore/catalogs/mistral-cmip6.csv.gz",
mistral-miklip.json:  "catalog_file": "/home/mpim/m300524/intake-esm-datastore/catalogs/mistral-miklip.csv.gz",
mistral-MPI-GE.json:  "catalog_file": "/home/mpim/m300524/intake-esm-datastore/catalogs/mistral-MPI-GE.csv.gz",
pangeo-cmip6.json:  "catalog_file": "https://storage.googleapis.com/cmip6/pangeo-cmip6.csv",
stratus-cesm1-le.json:  "catalog_file": "stratus-cesm1-le.csv",

I plan on changing

campaign-cesm2-cmip6-timeseries.json
glade-cesm1-cmip5-timeseries.json
glade-cesm1-le.json

when I next update those catalogs; stratus-cesm1-le.json already uses a relative path. I definitely think glade-cmip5.json and glade-cmip6.json should follow suit, though I am less familiar with mistral.
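For anyone making the same change by hand, here is a minimal sketch of the rewrite in Python. It is not part of intake-esm or this repository: the helper name `relativize_catalog_file` is hypothetical, and it assumes each JSON file has a top-level "catalog_file" key, as in the grep output above. URLs (like the pangeo entry) and paths that are already relative are left untouched.

```python
import json
import os
from pathlib import Path


def relativize_catalog_file(json_path):
    """Rewrite an absolute local "catalog_file" entry relative to the JSON file.

    Hypothetical helper, not part of intake-esm.
    Returns the new value, or None if the entry was left untouched.
    """
    json_path = Path(json_path)
    spec = json.loads(json_path.read_text())
    target = spec.get("catalog_file", "")

    # Leave URLs (e.g. https://...) and already-relative paths alone.
    if "://" in target or not os.path.isabs(target):
        return None

    # Express the csv(.gz) location relative to the directory holding the JSON.
    relative = os.path.relpath(target, start=os.path.abspath(json_path.parent))
    spec["catalog_file"] = relative
    json_path.write_text(json.dumps(spec, indent=2) + "\n")
    return relative
```

Run against a file such as glade-cmip6.json sitting next to its csv.gz, this would replace the full /glade/... path with just the csv.gz file name. Note that it rewrites the JSON with two-space indentation, which may differ from the formatting of the existing files.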

mnlevy1981 commented 4 years ago

Note: one big reason for suggesting this is that I think users (especially on glade) should copy the json and csv.gz into a subdirectory of whatever project is using them, so we don't inadvertently break their workflow by updating these catalogs. A user pointing to /glade/collections/cmip/catalog/intake-esm-datastore/catalogs/campaign-cesm2-cmip6-timeseries.json will be in for an unpleasant surprise when fixes for #64 (potentially breaking their searches) and/or #74 (potentially changing the data in the catalog) are made.
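The copy step could be sketched as follows. This is only an illustration under stated assumptions: `snapshot_catalog` is a hypothetical helper, the JSON is assumed to hold a relative "catalog_file" entry, and that entry is assumed to be resolved against the directory containing the JSON.

```python
import json
import shutil
from pathlib import Path


def snapshot_catalog(json_path, dest_dir):
    """Copy a catalog JSON and the csv(.gz) it points to into dest_dir.

    Hypothetical helper; only handles a relative "catalog_file" entry.
    Returns the path of the copied JSON.
    """
    json_path = Path(json_path)
    dest_dir = Path(dest_dir)
    catalog_file = json.loads(json_path.read_text())["catalog_file"]

    # A full path or URL in the copy would still point at the shared catalog.
    if "://" in catalog_file or Path(catalog_file).is_absolute():
        raise ValueError(f"catalog_file is not relative: {catalog_file}")

    src_csv = json_path.parent / catalog_file
    dest_csv = dest_dir / catalog_file

    # Keep the same relative layout so the copied JSON needs no edits.
    dest_dir.mkdir(parents=True, exist_ok=True)
    dest_csv.parent.mkdir(parents=True, exist_ok=True)
    shutil.copy2(json_path, dest_dir / json_path.name)
    shutil.copy2(src_csv, dest_csv)
    return dest_dir / json_path.name
```

A project would then open its own copy of the JSON rather than the one under /glade/collections, so later updates to the shared catalogs do not change what the project sees. With a full path in the JSON, the copied file would keep pointing at the shared csv.gz, which is why the sketch refuses that case.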