Add intake catalog

cisaacstern commented 3 years ago

Opening this as a draft PR. As of this comment, I've just copied the relevant files from https://github.com/ocean-eddy-cpt/cpt-data. More to follow shortly.

cisaacstern commented 3 years ago

@roxyboy, the first catalog draft is ready for your review. To try it out:

git clone https://github.com/cisaacstern/swot_adac_ogcms.git
cd swot_adac_ogcms
git checkout intake-catalog

Then check whether intake_demo.ipynb works for you, and whether you can load other catalog entries using the example in the notebook. This is my first time using Intake and it's very cool; it really streamlines loading the data.

If this all checks out, we can merge into main and then I'll add the additional datasets to the catalog as they come online.

roxyboy commented 3 years ago

Thanks for working on this @cisaacstern ! I tried it out and got the following error:

GIGATL parameters and their allowable args are:
---------------------------------------------------------------------------
ValueError                                Traceback (most recent call last)
<ipython-input-3-bbea6543c079> in <module>
      1 for entry in entries:
      2     print(f"{entry} parameters and their allowable args are:")
----> 3     description = cat[entry].describe()
      4     params = description["user_parameters"]
      5     if len(params) != 0:

/mnt/meom/workdir/uchidat/miniconda3/envs/fsuport/lib/python3.8/site-packages/intake/catalog/base.py in __getitem__(self, key)
    392             if e.container == 'catalog':
    393                 return e(name=key)
--> 394             return e()
    395         if isinstance(key, str) and '.' in key:
    396             key = key.split('.')

/mnt/meom/workdir/uchidat/miniconda3/envs/fsuport/lib/python3.8/site-packages/intake/catalog/entry.py in __call__(self, persist, **kwargs)
     75             raise ValueError('Persist value (%s) not understood' % persist)
     76         persist = persist or self._pmode
---> 77         s = self.get(**kwargs)
     78         if persist != 'never' and isinstance(s, PersistMixin) and s.has_been_persisted:
     79             from ..container.persist import store

/mnt/meom/workdir/uchidat/miniconda3/envs/fsuport/lib/python3.8/site-packages/intake/catalog/local.py in get(self, **user_parameters)
    282             return self._default_source
    283 
--> 284         plugin, open_args = self._create_open_args(user_parameters)
    285         data_source = plugin(**open_args)
    286         data_source.catalog_object = self._catalog

/mnt/meom/workdir/uchidat/miniconda3/envs/fsuport/lib/python3.8/site-packages/intake/catalog/local.py in _create_open_args(self, user_parameters)
    256 
    257         if len(self._plugin) == 0:
--> 258             raise ValueError('No plugins loaded for this entry: %s\n'
    259                              'A listing of installable plugins can be found '
    260                              'at https://intake.readthedocs.io/en/latest/plugin'

ValueError: No plugins loaded for this entry: zarr
A listing of installable plugins can be found at https://intake.readthedocs.io/en/latest/plugin-directory.html .

Does intake need to be a specific version?

cisaacstern commented 3 years ago

My mistake, @roxyboy. Intake needs the intake-xarray plugin to open the Zarrs. I just pushed this as the first line to the notebook, which should resolve the problem:

!pip install intake intake-xarray

Does it work now? And is it intuitive for you to open other datasets aside from the first one I've demoed? I can adjust the interface if not. Thanks!
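
For anyone who hits the same "No plugins loaded for this entry: zarr" error, here is a minimal sketch of a pre-flight check that the required packages are importable before opening the catalog. The helper name `missing_packages` is hypothetical, and it assumes the intake-xarray distribution is imported under the module name `intake_xarray`.

```python
import importlib.util

# Packages the catalog relies on, per the comment above.
# Assumption: the intake-xarray distribution installs the module "intake_xarray".
REQUIRED = ("intake", "intake_xarray")


def missing_packages(names=REQUIRED):
    """Return the top-level module names in `names` that cannot be found."""
    return [name for name in names if importlib.util.find_spec(name) is None]


missing = missing_packages()
if missing:
    # Mirrors the fix above: install whatever is absent before opening the catalog.
    print("Missing packages:", ", ".join(missing))
else:
    print("intake and intake-xarray are importable.")
```

Running this in the notebook's environment before the first catalog cell should show whether the `pip install` step is still needed.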

roxyboy commented 3 years ago

Yep, works like magic now :) For INALT60 and FESOM, I'm getting:

INALT60 parameters and their allowable args are:
    Not implemented.

FESOM parameters and their allowable args are:
    Not implemented.

but does this mean that the data isn't there (yet)?
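
For context, the output above looks like what the notebook loop seen in the earlier traceback would print for an entry with no user parameters. Below is a rough sketch of that logic, not the notebook's actual code: the function name `format_parameters`, the `"Not implemented."` branch, and the `name`/`allowed` keys of each parameter are assumptions, and the dictionaries are hypothetical stand-ins for what `cat[entry].describe()` returns.

```python
def format_parameters(entry, description):
    """Format one entry's user parameters, given a describe()-style dict."""
    lines = [f"{entry} parameters and their allowable args are:"]
    params = description.get("user_parameters", [])
    if not params:
        # Assumed fallback for entries that declare no user parameters.
        lines.append("    Not implemented.")
    for param in params:
        # Assumption: each parameter is a dict with "name" and "allowed" keys.
        lines.append(f"    {param['name']}: {param.get('allowed')}")
    return "\n".join(lines)


# Hypothetical descriptions; the real ones come from the catalog itself.
with_params = {"user_parameters": [{"name": "param_a", "allowed": ["x", "y"]}]}
without_params = {"user_parameters": []}

print(format_parameters("EXAMPLE", with_params))
print(format_parameters("FESOM", without_params))
```

Under these assumptions, an entry that prints "Not implemented." is a placeholder with no selectable parameters, not a failed load.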

cisaacstern commented 3 years ago

> works like magic

Awesome! Intake is so cool.

> but does this mean that the data isn't there (yet)?

Yep, FESOM (and the eNATL60 interior) are currently blocked by https://github.com/pangeo-forge/pangeo-forge-recipes/issues/93, which @rabernat is working on this week. INALT60 isn't in the catalog because we're waiting on the password-protected bucket, per the provider's request.

I'll merge the current draft of the catalog into main now, and push updates to it as the above-referenced datasets (and other swot_adac datasets) come online.

pangeo-data / swot_adac_ogcms

Add intake catalog #1