IQSS / dataverse-pm

Project management issue tracker for the Dataverse Project. Note: Related links and documents may not be public.
https://dataverse.org

Epic: GREI 2 Task 1 - Publish benchmarks for amount of Harvard Dataverse ORCIDs, ROR IDs, and related research object metadata #225

Open cmbz opened 4 months ago

cmbz commented 4 months ago

These benchmarks will help us determine if more ORCIDs, ROR IDs, and related research object metadata are being published as a result of our implementation of version 1 of the GREI Metadata Recommendations.

ORCIDs

In the 12 months before we make relevant changes to Dataverse, we'll determine what percentage of author metadata published in Harvard Dataverse includes ORCIDs. This counts both "valid" ORCID metadata (where "ORCID" is chosen for the Identifier Type and what's entered in the Identifier field follows the xxxx-xxxx-xxxx-xxxx format) and "invalid" ORCID metadata.

For the 12 months after we've made the changes, which will let depositors include ORCIDs for other types of people associated with deposits, such as other types of contributors, we'll determine this percentage again to see whether it has increased compared to the previous 12-month period.

For example, using metadata collected in August 2023 from most known Dataverse installations (https://doi.org/10.7910/DVN/8FEGUV), we can determine that of all of the author metadata that Harvard Dataverse published in 2022, 36.5 percent includes an ORCID. The remaining author metadata may include no identifiers or may include other types of identifiers for people or organizations.

The Jupyter notebook shows how we got this metric. The CSV file produced by that notebook lists the same metric for many of the other Dataverse installations that the community knew of in August 2023.
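As a rough illustration of the ORCID benchmark described above (not the notebook's actual code), the percentage could be computed along these lines. The record layout, the field names identifier_type and identifier, and the rule for what counts as "invalid" ORCID metadata are all assumptions made for this sketch. The pattern also allows a final "X", which ORCID iDs can end with.

```python
import re

# Assumed pattern for the xxxx-xxxx-xxxx-xxxx format.
# The last character of an ORCID iD may be the check character "X".
ORCID_PATTERN = re.compile(r"^\d{4}-\d{4}-\d{4}-\d{3}[\dX]$")


def classify_author(record):
    """Return 'valid', 'invalid', or 'none' for one author record.

    `record` is a hypothetical dict with 'identifier_type' and
    'identifier' keys; either value may be missing or None.
    """
    id_type = (record.get("identifier_type") or "").strip()
    identifier = (record.get("identifier") or "").strip()
    well_formed = bool(ORCID_PATTERN.match(identifier))

    # "Valid": ORCID chosen as the type AND the value follows the format.
    if id_type == "ORCID" and well_formed:
        return "valid"
    # "Invalid" (assumed rule): ORCID chosen but malformed, or an
    # ORCID-looking value entered under another or no identifier type.
    if id_type == "ORCID" or well_formed or "orcid" in identifier.lower():
        return "invalid"
    return "none"


def orcid_percentage(records):
    """Percentage of author records with valid or invalid ORCID metadata."""
    if not records:
        return 0.0
    with_orcid = sum(1 for r in records if classify_author(r) != "none")
    return 100.0 * with_orcid / len(records)
```

Running orcid_percentage on the author records published in each 12-month period would give the two figures to compare.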

ROR IDs

Harvard Dataverse and most known installations of Dataverse do not record ROR IDs.

For the 12 months before and after we've made changes to Dataverse that let it record ROR IDs, we'll determine what percentage of metadata about organizations associated with published deposits includes ROR IDs.

Related research object metadata

In the 12 months before we make relevant changes to Dataverse, we'll determine what percentage of deposits published in Harvard Dataverse include the DOIs of related research articles (that is, DOIs in a Related Publication field).

For the 12 months after we've made the changes, which will also let depositors include DOIs for other types of related research objects like related datasets, we'll determine this percentage again to see whether it has increased compared to the previous 12-month period.
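The before-and-after comparison for related publication DOIs could be sketched as follows. This is only an illustration: the deposit layout, the field names published and related_publication_dois, and the change date passed in are hypothetical, and the real change date is not yet known.

```python
from datetime import date


def shift_years(d, years):
    """Return the same calendar day `years` later (or earlier if negative)."""
    try:
        return d.replace(year=d.year + years)
    except ValueError:
        # Feb 29 has no counterpart in a non-leap year; fall back to Feb 28.
        return d.replace(year=d.year + years, day=28)


def doi_share(deposits, start, end):
    """Percentage of deposits published in [start, end) that list at
    least one related publication DOI. Returns None for an empty window."""
    window = [d for d in deposits if start <= d["published"] < end]
    if not window:
        return None
    with_doi = sum(1 for d in window if d["related_publication_dois"])
    return 100.0 * with_doi / len(window)


def before_after(deposits, change_date):
    """Return (before, after) percentages for the 12 months on either
    side of `change_date`."""
    before = doi_share(deposits, shift_years(change_date, -1), change_date)
    after = doi_share(deposits, change_date, shift_years(change_date, 1))
    return before, after
```

Returning None for an empty window keeps "no deposits published" distinct from "0 percent of deposits include a DOI".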

Year 4 task

cmbz commented 4 months ago

2024/04/10