NCEAS / metacat

Data repository software that helps researchers preserve, share, and discover data
https://knb.ecoinformatics.org/software/metacat
GNU General Public License v2.0

Add a citation for this repo #1480

Closed earnaud closed 4 months ago

earnaud commented 3 years ago

This is a minor feature that would greatly help me (and, I believe, others): add a way to cite the latest release of this tool. For example, a DOI published through Zenodo would do.

mbjones commented 3 years ago

Agreed. We have been meaning to do that and will do so soon. Thanks for the prompt @earnaud

mbjones commented 2 years ago

@taojing2002 let's be sure to issue a DOI for the 2.17.0 release. Let me know if you want to discuss the particulars of codemeta and software heritage.

mbjones commented 10 months ago

@taojing2002 we should definitely do this for the 3.0.0 release. I can help with the DOI if needed. I am retargeting the milestone.

artntek commented 7 months ago

Add lots of metadata, e.g.:

[@mbjones to attach example]

mbjones commented 7 months ago

Here's an example datacite.xml document from a recent PDG software release. This document is a "template" datacite.xml document (with just a shoulder in the DOI field) that can be used with the EZID API to mint a new DOI for a software release. You can use the API directly with curl, or with a client library such as the Python ezid-client-tools (https://github.com/CDLUC3/ezid-client-tools/) or the Java library we maintain (https://github.com/NCEAS/ezid).

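<details><summary>DataCite XML template doc (datacite.xml)</summary>

```xml
<?xml version="1.0"?>
<resource xmlns="http://datacite.org/schema/kernel-4" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xsi:schemaLocation="http://datacite.org/schema/kernel-4 http://schema.datacite.org/meta/kernel-4/metadata.xsd">
  <identifier identifierType="DOI">10.18739/A2</identifier>
  <creators>
    <creator>
      <creatorName>Robyn Thiessen-Bock</creatorName>
      <givenName>Robyn</givenName>
      <familyName>Thiessen-Bock</familyName>
      <nameIdentifier nameIdentifierScheme="ORCID" schemeURI="https://orcid.org/">0000-0002-1615-3963</nameIdentifier>
      <affiliation affiliationIdentifier="https://ror.org/055hrh286" affiliationIdentifierScheme="ROR" schemeURI="https://ror.org/">Arctic Data Center</affiliation>
    </creator>
    <creator>
      <creatorName>Juliet Cohen</creatorName>
      <givenName>Juliet</givenName>
      <familyName>Cohen</familyName>
      <nameIdentifier nameIdentifierScheme="ORCID" schemeURI="https://orcid.org/">0000-0001-8217-4028</nameIdentifier>
      <affiliation affiliationIdentifier="https://ror.org/055hrh286" affiliationIdentifierScheme="ROR" schemeURI="https://ror.org/">Arctic Data Center</affiliation>
    </creator>
    <creator>
      <creatorName>Matthew B. Jones</creatorName>
      <givenName>Matthew B.</givenName>
      <familyName>Jones</familyName>
      <nameIdentifier nameIdentifierScheme="ORCID" schemeURI="https://orcid.org/">0000-0003-0077-4738</nameIdentifier>
      <affiliation affiliationIdentifier="https://ror.org/055hrh286" affiliationIdentifierScheme="ROR" schemeURI="https://ror.org/">Arctic Data Center</affiliation>
    </creator>
    <creator>
      <creatorName>Lauren Walker</creatorName>
      <givenName>Lauren</givenName>
      <familyName>Walker</familyName>
      <nameIdentifier nameIdentifierScheme="ORCID" schemeURI="https://orcid.org/">0000-0003-2192-431X</nameIdentifier>
      <affiliation affiliationIdentifier="https://ror.org/055hrh286" affiliationIdentifierScheme="ROR" schemeURI="https://ror.org/">Arctic Data Center</affiliation>
    </creator>
  </creators>
  <titles>
    <title>Viz-raster: raster data processing for geospatial visualization (version 0.9.1)</title>
  </titles>
  <publisher>Arctic Data Center</publisher>
  <publicationYear>2023</publicationYear>
  <resourceType resourceTypeGeneral="Software">Software</resourceType>
  <dates>
    <date dateType="Created">2023-12-29</date>
  </dates>
  <alternateIdentifiers>
    <alternateIdentifier alternateIdentifierType="SWHID">swh:1:dir:fbddcdbdffb8749af19c32f7706c1325a0db3fc4</alternateIdentifier>
  </alternateIdentifiers>
  <relatedIdentifiers>
    <relatedIdentifier relatedIdentifierType="DOI" relationType="IsNewVersionOf">10.18739/A2CN6Z17T</relatedIdentifier>
  </relatedIdentifiers>
  <version>0.9.1</version>
  <rightsList>
    <rights rightsURI="https://spdx.org/licenses/Apache-2.0.html">Apache-2.0</rights>
  </rightsList>
  <fundingReferences>
    <fundingReference>
      <funderName>National Science Foundation</funderName>
      <funderIdentifier funderIdentifierType="Crossref Funder ID">https://api.crossref.org/funders/100000001</funderIdentifier>
      <awardNumber awardURI="https://www.nsf.gov/awardsearch/showAward?AWD_ID=2042102">2042102</awardNumber>
      <awardTitle>Advancing Arctic research and education through data preservation and reuse at the Arctic Data Center</awardTitle>
    </fundingReference>
    <fundingReference>
      <funderName>National Science Foundation</funderName>
      <funderIdentifier funderIdentifierType="Crossref Funder ID">https://api.crossref.org/funders/100000001</funderIdentifier>
      <awardNumber awardURI="https://www.nsf.gov/awardsearch/showAward?AWD_ID=1927720">1927720</awardNumber>
      <awardTitle>NNA Track 1: Collaborative Research: The Permafrost Discovery Gateway: Navigating the new Arctic tundra through Big Data, artificial intelligence, and cyberinfrastructure</awardTitle>
    </fundingReference>
  </fundingReferences>
</resource>
```

</details>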

With that document, you can create a new DOI with the command:

./ezid3.py sb-nceas:${EZIDPASS} mint doi:10.18739/A2 datacite @datacite.xml
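For anyone calling the EZID API directly rather than through ezid3.py: the request body is plain text in ANVL form, one `name: value` pair per line, with newlines and percent signs percent-encoded (which is why `%0A` shows up in the `view` output). The following is a minimal sketch of building such a body, assuming the escaping rules described in the EZID API documentation. The helper names `anvl_escape` and `build_anvl` and the stand-in XML string are hypothetical, and nothing here contacts the server.

```python
import re


def anvl_escape(text: str, is_name: bool = False) -> str:
    """Percent-encode characters that would break an ANVL line.

    Assumption: '%', CR and LF are escaped everywhere, and ':' is
    additionally escaped in element names.
    """
    pattern = r"[%:\r\n]" if is_name else r"[%\r\n]"
    return re.sub(pattern, lambda m: "%%%02X" % ord(m.group(0)), text)


def build_anvl(metadata: dict) -> str:
    """Serialize a metadata dict as one 'name: value' pair per line."""
    return "\n".join(
        f"{anvl_escape(name, is_name=True)}: {anvl_escape(value)}"
        for name, value in metadata.items()
    )


# Hypothetical stand-in; the real datacite.xml would be read from disk.
datacite_xml = '<?xml version="1.0"?>\n<resource>...</resource>'
body = build_anvl({"_status": "reserved", "datacite": datacite_xml})
print(body)
```

Each metadata element ends up on a single line regardless of how many lines the XML document spans, which is the property the ANVL format needs.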

After the DOI is created, EZID returns it to the client. You can then use that DOI to download all of the metadata that DataCite knows about it. For example:

./ezid3.py sb-nceas:${EZIDPASS} view doi:10.18739/A27W67732 > rasterize-package.xml

produces:

b'success: doi:10.18739/A27W67732'
b'_created: 1706755690'
b'_datacenter: CDL.UCSB'
b'_export: yes'
b'_owner: sb-nceas'
b'_ownergroup: sb-library'
b'_profile: datacite'
b'_shadowedby: ark:/c8739/a27w67732'
b'_status: public'
b'_target: https://github.com/PermafrostDiscoveryGateway/viz-raster/releases/tag/0.9.1'
b'_updated: 1706755783'
b'datacite: <?xml version="1.0"?>%0A<resource xmlns="http://datacite.org/schema/kernel-4" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xsi:schemaLocation="http://datacite.org/schema/kernel-4 http://schema.datacite.org/meta/kernel-4/metadata.xsd"><identifier identifierType="DOI">10.18739/A27W67732</identifier><creators><creator><creatorName>Robyn Thiessen-Bock</creatorName><givenName>Robyn</givenName><familyName>Thiessen-Bock</familyName><nameIdentifier nameIdentifierScheme="ORCID" schemeURI="https://orcid.org/">0000-0002-1615-3963</nameIdentifier><affiliation affiliationIdentifier="https://ror.org/055hrh286" affiliationIdentifierScheme="ROR" schemeURI="https://ror.org/">Arctic Data Center</affiliation></creator><creator><creatorName>Juliet Cohen</creatorName><givenName>Juliet</givenName><familyName>Cohen</familyName><nameIdentifier nameIdentifierScheme="ORCID" schemeURI="https://orcid.org/">0000-0001-8217-4028</nameIdentifier><affiliation affiliationIdentifier="https://ror.org/055hrh286" affiliationIdentifierScheme="ROR" schemeURI="https://ror.org/">Arctic Data Center</affiliation></creator><creator><creatorName>Matthew B. Jones</creatorName><givenName>Matthew B.</givenName><familyName>Jones</familyName><nameIdentifier nameIdentifierScheme="ORCID" schemeURI="https://orcid.org/">0000-0003-0077-4738</nameIdentifier><affiliation affiliationIdentifier="https://ror.org/055hrh286" affiliationIdentifierScheme="ROR" schemeURI="https://ror.org/">Arctic Data Center</affiliation></creator><creator><creatorName>Lauren Walker</creatorName><givenName>Lauren</givenName><familyName>Walker</familyName><nameIdentifier nameIdentifierScheme="ORCID" schemeURI="https://orcid.org/">0000-0003-2192-431X</nameIdentifier><affiliation affiliationIdentifier="https://ror.org/055hrh286" affiliationIdentifierScheme="ROR" schemeURI="https://ror.org/">Arctic Data Center</affiliation></creator></creators><titles><title>Viz-raster: raster data processing for geospatial visualization (version 0.9.1)</title></titles><publisher>Arctic Data Center</publisher><publicationYear>2023</publicationYear><resourceType resourceTypeGeneral="Software">Software</resourceType><dates><date dateType="Created">2023-12-29</date></dates><alternateIdentifiers><alternateIdentifier alternateIdentifierType="SWHID">swh:1:dir:fbddcdbdffb8749af19c32f7706c1325a0db3fc4</alternateIdentifier></alternateIdentifiers><relatedIdentifiers><relatedIdentifier relatedIdentifierType="DOI" relationType="IsNewVersionOf">10.18739/A2CN6Z17T</relatedIdentifier></relatedIdentifiers><version>0.9.1</version><rightsList><rights rightsURI="https://spdx.org/licenses/Apache-2.0.html">Apache-2.0</rights></rightsList><fundingReferences><fundingReference><funderName>National Science Foundation</funderName><funderIdentifier funderIdentifierType="Crossref Funder ID">https://api.crossref.org/funders/100000001</funderIdentifier><awardNumber awardURI="https://www.nsf.gov/awardsearch/showAward?AWD_ID=2042102">2042102</awardNumber><awardTitle>Advancing Arctic research and education through data preservation and reuse at the Arctic Data Center</awardTitle></fundingReference><fundingReference><funderName>National Science Foundation</funderName><funderIdentifier funderIdentifierType="Crossref Funder ID">https://api.crossref.org/funders/100000001</funderIdentifier><awardNumber awardURI="https://www.nsf.gov/awardsearch/showAward?AWD_ID=1927720">1927720</awardNumber><awardTitle>NNA Track 1: Collaborative Research: The Permafrost Discovery Gateway: Navigating the new Arctic tundra through Big Data, artificial intelligence, and cyberinfrastructure</awardTitle></fundingReference></fundingReferences></resource>'

My typical workflow would be to use ezid3.py view to get the current metadata doc, munge it to create a well-formed XML document, edit it to reflect the new release, and then create a new DOI using ezid3.py mint. HTH.
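The "munge" step could look roughly like the sketch below, which uses only the Python standard library. It assumes the saved `view` output has the shape shown above: one field per line, optionally wrapped in a bytes literal (`b'...'`), with newlines percent-encoded. The function name `parse_view_output`, the shortened sample text, and the `0.9.2` version number are hypothetical.

```python
import ast
import xml.etree.ElementTree as ET
from urllib.parse import unquote

DATACITE_NS = "http://datacite.org/schema/kernel-4"


def parse_view_output(text: str) -> dict:
    """Turn saved `ezid3.py view` output into a dict of decoded fields.

    Assumption: each line is a 'name: value' pair, possibly wrapped in a
    Python bytes literal (b'...') as in the output above.
    """
    fields = {}
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        if line.startswith(("b'", 'b"')):
            # Undo the bytes-literal wrapper.
            line = ast.literal_eval(line).decode("utf-8")
        name, sep, value = line.partition(": ")
        if sep:
            # Percent-decoding restores newlines (%0A) and similar characters.
            fields[unquote(name)] = unquote(value)
    return fields


# Shortened, hypothetical sample in the same shape as the real output.
sample = "\n".join([
    "b'success: doi:10.18739/A27W67732'",
    "b'_status: public'",
    "b'datacite: <?xml version=\"1.0\"?>%0A"
    "<resource xmlns=\"http://datacite.org/schema/kernel-4\">"
    "<version>0.9.1</version></resource>'",
])

fields = parse_view_output(sample)
root = ET.fromstring(fields["datacite"])  # raises ParseError if not well-formed

# Edit the document for the new release (hypothetical version number).
ns = {"d": DATACITE_NS}
root.find("d:version", ns).text = "0.9.2"

# Keep the DataCite namespace as the default one when writing back out.
ET.register_namespace("", DATACITE_NS)
print(ET.tostring(root, encoding="unicode"))
```

The printed document is what you would save as the new datacite.xml, after resetting the identifier to the shoulder, before running `ezid3.py mint`.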

doulikecookiedough commented 5 months ago

@mbjones Thank you for the example, it helps a lot! Below is a modified version of the datacite.xml template/document that I would like to eventually generate a DOI for. Can you please take a look when you have a moment and provide feedback?

DataCite XML for Metacat 3.0.0 (datacite.xml)

```xml
MISSING_TOFILL
Jing Tao Jing Tao 0000-0002-1209-5268 Arctic Data Center
Matthew Brooke Matthew Brooke 0000-0002-1472-913X Arctic Data Center
Dou Mok DouMing Mok 0000-0002-6076-8092 Arctic Data Center
Matthew B. Jones Matthew B. Jones 0000-0003-0077-4738 Arctic Data Center
Metacat: Data Preservation and Discovery System (3.0.0)
Arctic Data Center
2024
Software
2024-04-xx
MISSING_TOFILL
3.0.0
GNU General Public License, version 2
Apache-2.0
Jason Hunter
PostgresSQL
National Science Foundation https://api.crossref.org/funders/100000001 2042102 Advancing Arctic research and education through data preservation and reuse at the Arctic Data Center
National Science Foundation https://api.crossref.org/funders/100000001 1546024 Scientia Arctica: A Knowledge Archive for Discovery and Reproducible Science in the Arctic
National Science Foundation https://api.crossref.org/funders/100000001 1443062 Beyond Data Discovery: Shared Services for Community Metadata Improvement
National Science Foundation https://api.crossref.org/funders/100000001 1448821 Making Data Count: Developing a Data Metrics Pilot
National Science Foundation https://api.crossref.org/funders/100000001 0830944 DataNet Full Proposal: DataNetONE (Observation Network for Earth)
National Science Foundation https://api.crossref.org/funders/100000001 0225676 ITR Collaborative Research: Enabling the Science Environment for Ecological Knowledge
National Science Foundation https://api.crossref.org/funders/100000001 9904777 Integrating Marine Ecology Data for Scientific Analysis and Resource Management: A Community Database Prototype
National Science Foundation https://api.crossref.org/funders/100000001 99-80154 KDI: A Knowledge Network for Biocomplexity: Building and Evaluating a Metadata-based Framework for Integrating Heterogeneous Scientific Data
```

Questions & Follow Up:

doulikecookiedough commented 5 months ago

Status

Additional Summary/Notes from Slack:

Matt: basically, you pre-issue a DOI, add that to the README and datacite.xml, then when the software is tagged, you register that tag with Software Heritage and get the permalink URI, then you can send the datacite.xml to EZID and register the URL

Updates to working datacite.xml document:

To Do:

Helpful Link RE: Datacite Fields/Values

doulikecookiedough commented 5 months ago

datacite.xml document is good to proceed with.

doumok@Dous-MacBook-Pro XML % xmllint --schema 'http://schema.datacite.org/meta/kernel-4.5/metadata.xsd' '/Users/doumok/code/testing/XML/datacite.xml' --noout

/Users/doumok/code/testing/XML/datacite.xml validates

- **Missing:** Date of creation/Release date needs to be updated
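Schema validation with xmllint does not catch leftover placeholder text, since a string such as `MISSING_TOFILL_GET_FROM_EZID_WHEN_READY` is still valid element content. As a complementary pre-flight check, here is a minimal standard-library sketch that lists elements whose text still looks unfinished. The marker strings and the function name `find_placeholders` are hypothetical and should be adapted to the template in use.

```python
import xml.etree.ElementTree as ET

# Hypothetical marker strings; adjust to whatever your template uses.
PLACEHOLDER_MARKERS = ("MISSING", "TOFILL", "-xx")


def find_placeholders(xml_text: str) -> list:
    """Return (tag, text) pairs for elements whose text looks unfinished."""
    root = ET.fromstring(xml_text)
    problems = []
    for element in root.iter():
        text = (element.text or "").strip()
        if any(marker in text for marker in PLACEHOLDER_MARKERS):
            # Drop the '{namespace}' prefix for readability.
            tag = element.tag.split("}", 1)[-1]
            problems.append((tag, text))
    return problems


# Shortened, hypothetical draft in the same shape as the document below.
draft = """<resource xmlns="http://datacite.org/schema/kernel-4">
  <identifier identifierType="DOI">MISSING_TOFILL_GET_FROM_EZID_WHEN_READY</identifier>
  <dates><date dateType="Created">2024-04-xx</date></dates>
  <version>3.0.0</version>
</resource>"""

for tag, text in find_placeholders(draft):
    print(f"{tag}: {text}")
```

An empty result means no marker was found; it says nothing about whether the remaining values are correct.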

<details><summary>DataCite XML for Metacat 3.0.0 (datacite.xml)</summary>

```xml
<?xml version="1.0"?>
<resource xmlns="http://datacite.org/schema/kernel-4" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xsi:schemaLocation="http://datacite.org/schema/kernel-4 http://schema.datacite.org/meta/kernel-4/metadata.xsd">
  <identifier identifierType="DOI">MISSING_TOFILL_GET_FROM_EZID_WHEN_READY</identifier>
  <creators>
    <creator>
      <creatorName>Jing Tao</creatorName>
      <givenName>Jing</givenName>
      <familyName>Tao</familyName>
      <nameIdentifier nameIdentifierScheme="ORCID" schemeURI="https://orcid.org/">0000-0002-1209-5268</nameIdentifier>
      <affiliation affiliationIdentifier="https://ror.org/055hrh286" affiliationIdentifierScheme="ROR" schemeURI="https://ror.org/">Arctic Data Center</affiliation>
    </creator>
    <creator>
      <creatorName>Matthew Brooke</creatorName>
      <givenName>Matthew</givenName>
      <familyName>Brooke</familyName>
      <nameIdentifier nameIdentifierScheme="ORCID" schemeURI="https://orcid.org/">0000-0002-1472-913X</nameIdentifier>
      <affiliation affiliationIdentifier="https://ror.org/055hrh286" affiliationIdentifierScheme="ROR" schemeURI="https://ror.org/">Arctic Data Center</affiliation>
    </creator>
    <creator>
      <creatorName>Dou Mok</creatorName>
      <givenName>DouMing</givenName>
      <familyName>Mok</familyName>
      <nameIdentifier nameIdentifierScheme="ORCID" schemeURI="https://orcid.org/">0000-0002-6076-8092</nameIdentifier>
      <affiliation affiliationIdentifier="https://ror.org/055hrh286" affiliationIdentifierScheme="ROR" schemeURI="https://ror.org/">Arctic Data Center</affiliation>
    </creator>
    <creator>
      <creatorName>Matthew B. Jones</creatorName>
      <givenName>Matthew B.</givenName>
      <familyName>Jones</familyName>
      <nameIdentifier nameIdentifierScheme="ORCID" schemeURI="https://orcid.org/">0000-0003-0077-4738</nameIdentifier>
      <affiliation affiliationIdentifier="https://ror.org/055hrh286" affiliationIdentifierScheme="ROR" schemeURI="https://ror.org/">Arctic Data Center</affiliation>
    </creator>
  </creators>
  <titles>
    <title>Metacat: Data Preservation and Discovery System (3.0.0)</title>
  </titles>
  <publisher>Arctic Data Center</publisher>
  <publicationYear>2024</publicationYear>
  <resourceType resourceTypeGeneral="Software">Software</resourceType>
  <dates>
    <date dateType="Created">2024-04-30</date>
  </dates>
  <version>3.0.0</version>
  <rightsList>
    <rights rightsURI="https://www.gnu.org/licenses/old-licenses/gpl-2.0.en.html">GNU General Public License, version 2</rights>
    <rights rightsURI="https://spdx.org/licenses/Apache-2.0.html">Apache-2.0</rights>
    <rights rightsURI="http://www.servlets.com/cos/license.html">Jason Hunter</rights>
    <rights rightsURI="https://www.postgresql.org/about/licence/">PostgreSQL</rights>
  </rightsList>
  <fundingReferences>
    <fundingReference>
      <funderName>National Science Foundation</funderName>
      <funderIdentifier funderIdentifierType="Crossref Funder ID">https://api.crossref.org/funders/100000001</funderIdentifier>
      <awardNumber awardURI="https://www.nsf.gov/awardsearch/showAward?AWD_ID=2042102">2042102</awardNumber>
      <awardTitle>Advancing Arctic research and education through data preservation and reuse at the Arctic Data Center</awardTitle>
    </fundingReference>
    <fundingReference>
      <funderName>The Environmental System Science Data Infrastructure for a Virtual Ecosystem (ESS-DIVE)</funderName>
      <funderIdentifier funderIdentifierType="Other">ESS-Dive</funderIdentifier>
      <awardTitle>ESS-Dive (2020)</awardTitle>
    </fundingReference>
    <fundingReference>
      <funderName>The Environmental System Science Data Infrastructure for a Virtual Ecosystem (ESS-DIVE)</funderName>
      <funderIdentifier funderIdentifierType="Other">ESS-Dive</funderIdentifier>
      <awardTitle>ESS-Dive (2017)</awardTitle>
    </fundingReference>
    <fundingReference>
      <funderName>National Science Foundation</funderName>
      <funderIdentifier funderIdentifierType="Crossref Funder ID">https://api.crossref.org/funders/100000001</funderIdentifier>
      <awardNumber awardURI="https://www.nsf.gov/awardsearch/showAward?AWD_ID=1546024">1546024</awardNumber>
      <awardTitle>Scientia Arctica: A Knowledge Archive for Discovery and Reproducible Science in the Arctic</awardTitle>
    </fundingReference>
    <fundingReference>
      <funderName>National Science Foundation</funderName>
      <funderIdentifier funderIdentifierType="Crossref Funder ID">https://api.crossref.org/funders/100000001</funderIdentifier>
      <awardNumber awardURI="https://www.nsf.gov/awardsearch/showAward?AWD_ID=1448821">1448821</awardNumber>
      <awardTitle>Making Data Count: Developing a Data Metrics Pilot</awardTitle>
    </fundingReference>
    <fundingReference>
      <funderName>National Science Foundation</funderName>
      <funderIdentifier funderIdentifierType="Crossref Funder ID">https://api.crossref.org/funders/100000001</funderIdentifier>
      <awardNumber awardURI="https://www.nsf.gov/awardsearch/showAward?AWD_ID=1443062">1443062</awardNumber>
      <awardTitle>Beyond Data Discovery: Shared Services for Community Metadata Improvement</awardTitle>
    </fundingReference>
    <fundingReference>
      <funderName>National Science Foundation</funderName>
      <funderIdentifier funderIdentifierType="Crossref Funder ID">https://api.crossref.org/funders/100000001</funderIdentifier>
      <awardNumber awardURI="https://www.nsf.gov/awardsearch/showAward?AWD_ID=1430508">1430508</awardNumber>
      <awardTitle>DataONE (Data Observation Network for Earth)</awardTitle>
    </fundingReference>
    <fundingReference>
      <funderName>Mellon Foundation</funderName>
      <funderIdentifier funderIdentifierType="Other">Mellon Foundation</funderIdentifier>
      <awardTitle>Mellon Foundation (2009)</awardTitle>
    </fundingReference>
    <fundingReference>
      <funderName>National Science Foundation</funderName>
      <funderIdentifier funderIdentifierType="Crossref Funder ID">https://api.crossref.org/funders/100000001</funderIdentifier>
      <awardNumber awardURI="https://www.nsf.gov/awardsearch/showAward?AWD_ID=0830944">0830944</awardNumber>
      <awardTitle>DataNet Full Proposal: DataNetONE (Observation Network for Earth)</awardTitle>
    </fundingReference>
    <fundingReference>
      <funderName>Mellon Foundation</funderName>
      <funderIdentifier funderIdentifierType="Other">Mellon Foundation</funderIdentifier>
      <awardTitle>Mellon Foundation (2006)</awardTitle>
    </fundingReference>
    <fundingReference>
      <funderName>National Science Foundation</funderName>
      <funderIdentifier funderIdentifierType="Crossref Funder ID">https://api.crossref.org/funders/100000001</funderIdentifier>
      <awardNumber awardURI="https://www.nsf.gov/awardsearch/showAward?AWD_ID=0225676">0225676</awardNumber>
      <awardTitle>ITR Collaborative Research: Enabling the Science Environment for Ecological Knowledge</awardTitle>
    </fundingReference>
    <fundingReference>
      <funderName>National Science Foundation</funderName>
      <funderIdentifier funderIdentifierType="Crossref Funder ID">https://api.crossref.org/funders/100000001</funderIdentifier>
      <awardNumber awardURI="https://www.nsf.gov/awardsearch/showAward?AWD_ID=9904777">9904777</awardNumber>
      <awardTitle>Integrating Marine Ecology Data for Scientific Analysis and Resource Management: A Community Database Prototype</awardTitle>
    </fundingReference>
    <fundingReference>
      <funderName>National Science Foundation</funderName>
      <funderIdentifier funderIdentifierType="Crossref Funder ID">https://api.crossref.org/funders/100000001</funderIdentifier>
      <awardNumber awardURI="https://www.nsf.gov/awardsearch/showAward?AWD_ID=9980154">99-80154</awardNumber>
      <awardTitle>KDI: A Knowledge Network for Biocomplexity: Building and Evaluating a Metadata-based Framework for Integrating Heterogeneous Scientific Data</awardTitle>
    </fundingReference>
  </fundingReferences>
</resource>
```

</details>

doulikecookiedough commented 5 months ago

The above datacite.xml document is ready.

Pending Steps:

mbjones commented 4 months ago

@doulikecookiedough I found some errors in your funder list for the datacite.xml when reviewing it for MetacatUI. I corrected the ESS-DIVE awards and some others in the list we are generating for MetacatUI here: https://github.com/NCEAS/metacatui/issues/2359#issuecomment-2057879884 Please update your list with the correct agency titles and funder identifiers, and award titles. Thanks.

doulikecookiedough commented 4 months ago

Thank you for reviewing my document @mbjones!

Below is the updated datacite.xml document based on the list generated for MetacatUI.

DataCite XML for Metacat 3.0.0 (datacite.xml)

```xml
MISSING_TOFILL_GET_FROM_EZID_WHEN_READY
Jing Tao Jing Tao 0000-0002-1209-5268 Arctic Data Center
Matthew Brooke Matthew Brooke 0000-0002-1472-913X Arctic Data Center
Dou Mok DouMing Mok 0000-0002-6076-8092 Arctic Data Center
Matthew B. Jones Matthew B. Jones 0000-0003-0077-4738 Arctic Data Center
Metacat: Data Preservation and Discovery System (3.0.0)
Arctic Data Center
2024
Software
2024-04-30
3.0.0
GNU General Public License, version 2
Apache-2.0
Jason Hunter
PostgresSQL
National Science Foundation https://doi.org/10.13039/100000001 2042102 Advancing Arctic research and education through data preservation and reuse at the Arctic Data Center
U.S. Department of Energy https://doi.org/10.13039/100000015 The Environmental System Science Data Infrastructure for a Virtual Ecosystem (ESS-DIVE 2020)
U.S. Department of Energy https://doi.org/10.13039/100000015 The Environmental System Science Data Infrastructure for a Virtual Ecosystem (ESS-DIVE 2017)
National Science Foundation https://doi.org/10.13039/100000001 1546024 Scientia Arctica: A Knowledge Archive for Discovery and Reproducible Science in the Arctic
National Science Foundation https://doi.org/10.13039/100000001 1448821 Making Data Count: Developing a Data Metrics Pilot
National Science Foundation https://doi.org/10.13039/100000001 1443062 Beyond Data Discovery: Shared Services for Community Metadata Improvement
National Science Foundation https://doi.org/10.13039/100000001 1430508 DataONE (Data Observation Network for Earth
Mellon Foundation Mellon Foundation Mellon Foundation (2009)
National Science Foundation https://doi.org/10.13039/100000001 0830944 DataNet Full Proposal: DataNetONE (Observation Network for Earth)
Mellon Foundation Mellon Foundation Mellon Foundation (2006)
National Science Foundation https://doi.org/10.13039/100000001 0225676 ITR Collaborative Research: Enabling the Science Environment for Ecological Knowledge
National Science Foundation https://doi.org/10.13039/100000001 9904777 Integrating Marine Ecology Data for Scientific Analysis and Resource Management: A Community Database Prototype
National Science Foundation https://doi.org/10.13039/100000001 99-80154 KDI: A Knowledge Network for Biocomplexity: Building and Evaluating a Metadata-based Framework for Integrating Heterogeneous Scientific Data
```

Pending Steps:

taojing2002 commented 4 months ago

@mbjones I added this section in the readme file:

Citation

Cite this software as:

Jing Tao, Matthew Brooke, Dou Mok, Matthew B. Jones. 2024. Metacat: DataONE data repository software (version 3.0.0). Arctic Data Center. doi:10.18739/A2D21R***

I have two questions about this part:

  1. For now I have just listed the current contributors. Could you please let me know who should be listed, and in what order?
  2. Did we reserve (mint) a DOI for Metacat? If we haven't, which shoulder (knb or arctic) should I use to mint one?

mbjones commented 4 months ago

That looks fine for authors.

Let's use the ADC prefix, and we haven't assigned a DOI previously.

taojing2002 commented 4 months ago

Thanks!

taojing2002 commented 4 months ago

@mbjones @doulikecookiedough @artntek I minted a DOI; its status is currently reserved. Its details are here: https://ezid.cdlib.org/id/doi:10.18739/A2VX0650Z After we tag Metacat, I will change the target URL to the Software Heritage URL and make the DOI public.

One question: MetacatUI's datacite.xml has this element:

<alternateIdentifier alternateIdentifierType="https://registry.identifiers.org/registry/swh"

Do we need it as well?

taojing2002 commented 4 months ago

After tagging Metacat 3.0.0, I followed the instructions on this page to modify the DOI's location URL.