pangeo-forge / cmip6-pipeline

Pipeline for cloud-based CMIP6 data ingestion
Apache License 2.0

Refactor cmip6 in the cloud codebase #19

Closed dgergel closed 2 years ago

dgergel commented 3 years ago

This PR is an initial refactor of the CMIP6-in-the-cloud codebase, which was originally developed as a Jupyter notebook-based user request system. The refactor turns it into a set of functions that can be used in a pangeo-forge CMIP6 recipe. Functions are organized by purpose: cataloging functions are in catalog.py, ESGF search functions are in esgf.py, and general utilities are in utils.py.

@naomi-henderson, it would be great if you could look over this and check if any of these functions should be deprecated and replaced with newer ones that you've updated since this refactor.

cc @cisaacstern

naomi-henderson commented 3 years ago

@dgergel, fantastic! I really like your refactoring of the preliminary workflow. From specifying config.cfg to getting the ESGF_specific.csv dataframe of links, this worked perfectly for the test cases I tried. To me it seems like a good starting point for @cisaacstern to build out the recipe from the URLs.

I don't have any newer versions of these functions; these are essentially what I am using.
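
For readers unfamiliar with the flow mentioned above (a config.cfg request in, a CSV of links out), here is a rough sketch of its general shape. It is not the code in this PR: the `[request]` section, the facet keys, and every function name below are hypothetical, and `placeholder_search` is a stand-in that returns dummy URLs instead of querying ESGF.

```python
import configparser
import csv
import io
import itertools

# Hypothetical config contents; the real config.cfg keys are not shown in this thread.
EXAMPLE_CFG = """
[request]
variable_id = tas, pr
experiment_id = historical
table_id = Amon
"""


def read_request(cfg_text):
    """Parse the (assumed) [request] section into facet -> list of values."""
    parser = configparser.ConfigParser()
    parser.read_string(cfg_text)
    return {
        key: [item.strip() for item in value.split(",")]
        for key, value in parser["request"].items()
    }


def expand_facets(facets):
    """Return one query dict per combination of facet values."""
    keys = list(facets)
    combos = itertools.product(*(facets[key] for key in keys))
    return [dict(zip(keys, combo)) for combo in combos]


def placeholder_search(query):
    """Stand-in for a real ESGF search; builds a dummy URL from the query."""
    return "https://example.invalid/" + "/".join(query[key] for key in sorted(query))


def build_rows(facets):
    """Attach a (dummy) URL to every expanded query."""
    return [{**query, "url": placeholder_search(query)} for query in expand_facets(facets)]


def write_links_csv(rows, handle):
    """Write the rows to an open text handle, one line per dataset link."""
    writer = csv.DictWriter(handle, fieldnames=list(rows[0].keys()))
    writer.writeheader()
    writer.writerows(rows)


facets = read_request(EXAMPLE_CFG)
rows = build_rows(facets)
buffer = io.StringIO()  # in the real workflow this would be a file such as ESGF_specific.csv
write_links_csv(rows, buffer)
```

Keeping the search step behind a single function makes it straightforward to swap the placeholder for a real ESGF query while a recipe consumes the same rows of URLs.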