pangeo-data / pangeo-cmip6-cloud

Documentation for Pangeo CMIP6 data stored in GCP/AWS cloud
https://pangeo-data.github.io/pangeo-cmip6-cloud/

Suggestion for a name change / slight rebranding of the Pangeo CMIP6 cloud holdings #42

Open jbusecke opened 2 years ago

jbusecke commented 2 years ago

I recently read @balaji-gfdl's paper. And if I understand this figure

image

right, what we are providing is in fact an 'ESGF replica Cache'.

I propose that we unify the naming of our 'cloud holdings' to reflect this fact:

"Pangeo CMIP6 Cache" or "Pangeo CMIP6 Cloud Cache"

The “Cache” makes it clear that ESGF is still the main holding, and that we are leaving the data as close to ‘as is’ as possible. It also makes the provenance clearer: we do not provide a DOI; instead, we aim to have a fully trackable way to generate the cache from ESGF data via pangeo-forge (still WIP).

Happy to make a PR and change this where appropriate, but I wanted to float this idea for comments/feedback first.

cc @AparnaRaveendran @rabernat @cisaacstern @agstephens

cisaacstern commented 2 years ago

If this refers to NetCDF holdings on the cloud, then I agree.

Regarding the WIP with Pangeo Forge, it seems that some type of language regarding the analysis-ready cloud-optimized (ARCO) nature of the cache would be an important clarification to provide in the naming scheme.

jbusecke commented 2 years ago

Good point. I actually initially suggested "Pangeo CMIP6 ARCO Cache" to @rabernat, but thought that might be too long. Do you think this would work?

cisaacstern commented 2 years ago

I'll be interested to know what others think, but I wonder if a "cache" implies less transformation than ARCO requires? I recognize the value of aligning with Balaji's typology, but perhaps the Pangeo Forge activities are outside of the categories that framework defines?