IQSS / dataverse

Open source research data repository software
http://dataverse.org

DataCite Citations #10635

Open Esteban-Mtz opened 2 weeks ago

Esteban-Mtz commented 2 weeks ago

What steps does it take to reproduce the issue?

The issue occurs when we run the script https://guides.dataverse.org/en/latest/_downloads/7e1b7e580244f61d2ac2de279759d154/counter_weekly.sh


When we run the script to update citations for datasets, Dataverse uses the API endpoint "updateCitationsForDataset" in src/main/java/edu/harvard/iq/dataverse/api/MakeDataCountApi.java

This method calls the DataCite API to fetch JSON information about the DOI (https://api.datacite.org/events?doi="DOI"&source=crossref&page[size]=1000)

Once Dataverse has this information, it uses the method "parseCitations" in src/main/java/edu/harvard/iq/dataverse/makedatacount/DatasetExternalCitationsServiceBean.java to extract the citation information.

This method discriminates between inbound ("cites", "references", "supplements") and outbound ("is-cited-by", "is-referenced-by", "is-supplemented-by") relationships. After that, in both cases it converts the received DOI into a Dataverse DOI with:

String globalId = subjectUri.replace("https://", "").replace("doi.org/", "doi:").toUpperCase().replace("DOI:", "doi:");

In our case, our :Shoulder is "data", which this conversion turns into "DATA": Dataverse transforms https://doi.org/10.34810/data146 into doi:10.34810/DATA146 and then cannot find a dataset with that globalId.

We don't know why the received DOI is transformed with toUpperCase(). Is it necessary?

When we remove the last part, ".toUpperCase().replace("DOI:", "doi:");", the script generates citations correctly.
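To make the mismatch concrete, here is a minimal standalone sketch that applies the same chain of String calls quoted above to our example DOI. The class and method names (DoiCaseDemo, toGlobalId) are ours for illustration and are not taken from the Dataverse source; the stored identifier is assumed to be the lowercase form that appears in our dataset URL.

```java
// Illustrative only: DoiCaseDemo and toGlobalId are hypothetical names,
// not part of Dataverse. The body of toGlobalId is the expression quoted above.
class DoiCaseDemo {

    static String toGlobalId(String subjectUri) {
        return subjectUri.replace("https://", "")
                .replace("doi.org/", "doi:")
                .toUpperCase()
                .replace("DOI:", "doi:");
    }

    public static void main(String[] args) {
        // Assumed stored form: the persistentId shown in our dataset URL
        String stored = "doi:10.34810/data146";
        String converted = toGlobalId("https://doi.org/10.34810/data146");

        System.out.println(converted);                // doi:10.34810/DATA146
        System.out.println(converted.equals(stored)); // false: the shoulder "data" became "DATA"
    }
}
```

The scheme prefix ends up lowercase again, but everything after it, including the shoulder, stays uppercase, so an exact string comparison against the lowercase identifier fails.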

MakeDataCount API

Should Dataverse change the received DOI to uppercase?

The guide doesn't say anything about this, and its example command for setting a new :Shoulder contains lowercase letters:

curl -X PUT -d "MyShoulder/" http://localhost:8080/api/admin/settings/:Shoulder

Which version of Dataverse are you using?

We use version 5.11.1, but the latest version also has toUpperCase().

Any related open or closed issues to this bug report?

No

An example:

https://dataverse.csuc.cat/dataset.xhtml?persistentId=doi:10.34810/data146


But 0 Citations.


An example after removing ".toUpperCase().replace("DOI:", "doi:");": 2 Citations.

qqmyers commented 2 weeks ago

Thanks for reporting the issue! DOIs are supposed to be case insensitive according to https://support.datacite.org/docs/datacite-doi-display-guidelines#dois-urls-and-case-sensitivity, so I think the root problem is not that we use toUpperCase here, but that the comparison elsewhere is case sensitive. The code for handling PIDs has changed significantly since 5.11.1 (with multi PID Provider support), but in trying to reproduce this, I see citation counting is broken in the current release as well, even for those with an upper case shoulder. I'll work to get a fix into the current code. (For 5.11, I'm not sure what can be done without a code fix: using an uppercase shoulder would impact the display and, assuming your file/S3 store is case sensitive, where the files need to be, so it's not a very useful work-around.)
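As a rough illustration of what a case-insensitive comparison could look like, here is a minimal sketch that normalizes identifiers to one case before storing and looking them up. This is not the actual Dataverse fix or its lookup code; the class CaseInsensitiveDoiIndex and its methods are hypothetical, and a plain in-memory map stands in for whatever lookup the real code performs.

```java
import java.util.HashMap;
import java.util.Locale;
import java.util.Map;

// Hypothetical sketch: an in-memory index that treats DOIs as case insensitive.
class CaseInsensitiveDoiIndex {

    private final Map<String, String> titlesByNormalizedId = new HashMap<>();

    // Fold to lowercase with a fixed locale so the result does not depend
    // on the JVM's default locale.
    static String normalize(String globalId) {
        return globalId.toLowerCase(Locale.ROOT);
    }

    void put(String globalId, String title) {
        titlesByNormalizedId.put(normalize(globalId), title);
    }

    // Returns null when no entry matches, regardless of the case used in the query.
    String find(String globalId) {
        return titlesByNormalizedId.get(normalize(globalId));
    }

    public static void main(String[] args) {
        CaseInsensitiveDoiIndex index = new CaseInsensitiveDoiIndex();
        index.put("doi:10.34810/data146", "example dataset");

        // The uppercased form produced by the citation parser still resolves.
        System.out.println(index.find("doi:10.34810/DATA146")); // example dataset
    }
}
```

Normalizing on both the write and the read side means the stored form of the shoulder never has to change, which avoids the display and file-store side effects of switching to an uppercase shoulder.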