pangeo-forge / pangeo-forge-recipes

Python library for building Pangeo Forge recipes.
https://pangeo-forge.readthedocs.io/
Apache License 2.0

Documenting usage of datasets produced by pangeo-forge #72

Open TomAugspurger opened 3 years ago

TomAugspurger commented 3 years ago

What policy should pangeo-forge have on documenting how to use a dataset? The conda-forge analogy would say that you just point to the upstream's documentation. While the upstream documentation will certainly be useful for understanding the data, it won't necessarily help with using the ARCO dataset.

At a minimum, it'd be helpful to have an example showing how to load the dataset into the preferred container. This is analogous to conda-forge's conda install -c conda-forge name-of-package.

>>> import intake, intake_stac  # assuming we're using intake as our recommended API
>>> my_dataset = intake.open_stac_catalog("/path/to/pangeo-catalog.json")[collection].to_dask()
>>> my_dataset
<xarray.Dataset>
...

Perhaps we cut it off there? Or perhaps we recommend / require that recipes come with an example in pangeo-gallery? It now occurs to me that pangeo-gallery is federated, so we could have a gallery of notebooks in the recipe repository, and then register them with pangeo-gallery.

rabernat commented 3 years ago

Totally agree that we should provide usage hints.

To me, this is tightly coupled with cataloging (#25). All the datasets that get processed by forge will be ingested into a catalog. That catalog should contain enough information about the format that we can generate example code for opening the data. Ideally this would include examples in many languages, not just Python.
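
As a rough sketch of what "generate example code from the catalog" could look like: the snippet below assumes a catalog entry is a plain dict with hypothetical "format" and "url" fields (the real catalog schema is not decided here), and fills in a per-format template. The names TEMPLATES and usage_snippet are made up for illustration.

```python
# Minimal sketch: turn a catalog entry into a usage hint.
# The entry layout ("format", "url") is an assumption, not a real schema.

# One template per storage format. The generated code uses xarray,
# but this generator itself does not need xarray installed.
TEMPLATES = {
    "zarr": (
        "import xarray as xr\n"
        "ds = xr.open_zarr({url!r})\n"
    ),
    "netcdf": (
        "import xarray as xr\n"
        "ds = xr.open_dataset({url!r})\n"
    ),
}


def usage_snippet(entry: dict) -> str:
    """Return example code for opening the dataset described by `entry`."""
    fmt = entry["format"]
    if fmt not in TEMPLATES:
        raise ValueError(f"no usage template for format {fmt!r}")
    return TEMPLATES[fmt].format(url=entry["url"])


# Hypothetical entry; the bucket and dataset names are placeholders.
entry = {
    "id": "example-dataset",
    "format": "zarr",
    "url": "gs://example-bucket/example.zarr",
}
print(usage_snippet(entry))
```

Supporting other languages would presumably mean keying the templates on (format, language) rather than on format alone.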

Once we solve the catalog problem, then the feature you describe would probably live in https://github.com/pangeo-forge/pangeo-forge-vue-website.