pangeo-data / pangeo-cmip6-cloud

Documentation for Pangeo CMIP6 data stored in GCP/AWS cloud
https://pangeo-data.github.io/pangeo-cmip6-cloud/

How to cite these efforts? #43

Open jbusecke opened 2 years ago

jbusecke commented 2 years ago

I just wanted to pin a point that @aradhakrishnanGFDL raised at the meeting today:

It would be good to have a way to cite the efforts that the broader cmip6-cloud group is providing.

We specifically do not want to create DOIs for each dataset, but I personally got some questions about how to cite our specific efforts 'on top' of the dataset DOIs.

For now I would just like to start a discussion here.

jbusecke commented 1 year ago

I am involved in a paper and would like to at least acknowledge these efforts here. Do you all think we could agree on some text that we could add to this documentation in a "how to cite" section along the lines of:

We acknowledge the work of the [Pangeo / ESGF Cloud Data Working Group](https://pangeo-data.github.io/pangeo-cmip6-cloud/) who uploaded and maintained Analysis-Ready Cloud Optimized zarr stores of CMIP6 datasets on public cloud storage hosted by Google and Amazon.

or something similar? Should we mention all institutions? Or list institutions and names on the website?

cc @rabernat @aradhakrishnanGFDL @agstephens

durack1 commented 1 year ago

@jbusecke just FYI, there is the standard "CMIP6: Terms of Use" that you could build on; this has evolved across CMIP3, CMIP5, etc. See https://pcmdi.llnl.gov/CMIP6/TermsOfUse

The meaty part is something like

“We acknowledge the World Climate Research Programme, which, through its Working Group on Coupled Modelling, coordinated and promoted CMIP6. We thank the climate modeling groups for producing and making available their model output, the Earth System Grid Federation (ESGF) for archiving the data and providing access, and the multiple funding agencies who support CMIP6 and ESGF.”

aradhakrishnanGFDL commented 1 year ago

> I am involved in a paper and would like to at least acknowledge these efforts here. Do you all think we could agree on some text that we could add to this documentation in a "how to cite" section along the lines of:

We acknowledge the work of the [Pangeo / ESGF Cloud Data Working Group](https://pangeo-data.github.io/pangeo-cmip6-cloud/) who manage the hosting of the Analysis-Ready Cloud Optimized versions (or copies..) of CMIP6 datasets on public cloud storage hosted by Google and Amazon.

I reworded just a few things, nothing major: I removed "zarr stores", since the group also worked on netCDF, and changed "uploaded and maintained" to "manage the hosting of" the data. Thank you for thinking about this. I am tempted to add ASDI, but then the list may grow to include the Google grant as well. Something to consider if it helps with future grants.

> or something similar? Should we mention all institutions? Or list institutions and names on the website?

We have the info Paul also refers to on our website, so maybe add a reference to that as well so people can cite the origin: https://pangeo-data.github.io/pangeo-cmip6-cloud/licensing_citation.html

> cc @rabernat @aradhakrishnanGFDL @agstephens

durack1 commented 1 year ago

Just another tidbit. We have been attempting to guide "best practice" of data use and citation, and with the DKRZ CMIP6 citation service (http://bit.ly/CMIP6_Citation_Search) up and running across the bulk of the archive, it would be helpful for data users to cite the sources they use. Some guidance (which could do with a refresh) was included at https://pcmdi.llnl.gov/CMIP6/Guide/dataUsers.html#4-terms-of-use-and-citation-requirements

jbusecke commented 1 year ago

Thanks @durack1. I think there are two components here:

  1. Citing the original CMIP6 datasets (which is covered here). I believe we link to the sources you mention. If you think that this section could use additional text we can open a new issue and modify/add wording.
  2. Citing the efforts of this working group in particular. Since this was a multi-year effort of many folks, I believe it is appropriate to cite those efforts in addition to the above + acknowledge the ESGF efforts.

How about this for now, @aradhakrishnanGFDL? (I decided on "copies" rather than "versions", since we do not actually version things separately.)

We acknowledge the work of the [Pangeo / ESGF Cloud Data Working Group](https://pangeo-data.github.io/pangeo-cmip6-cloud/) who manage the ingestion and hosting of the Analysis-Ready Cloud Optimized copies of CMIP6 datasets on public cloud storage hosted by Google and Amazon. 

I also minted a DOI for the new LEAP ingestion feedstock, which we could include. I would love to have a similar acknowledgement for @naomi-henderson's and your work too.

So maybe once we have the opinion paper out (I will work on this soon) we should add that as a citation too and make sure that the author list reflects all of the above?