Open vera opened 3 months ago
It looks like the citation code is assuming a / as a separator rather than using PidProvider specific code to create the entries. The specific issue of the / being 4 characters in is from using an unmanaged permalink. Because permalinks don't require a separator, there is no reliable way to tell the authority from the shoulder, so the code picks the first four chars as the authority.
@qqmyers Should we update the code to use the PIDProvider specific properties and create a PR?
I haven't looked at the code to be certain, but I think that makes sense. The GlobalId class has methods to get whatever form or part of a PID you want, so I think at this point, there shouldn't be core code outside that class hardcoding the protocol name or trying to parse/generate a PID for display.
I see, that makes sense.
For completeness, here's what the export looks like with a managed Permalink:
BibTeX:
@data{NCT00080262_2024,
author = {$AUTHORS},
publisher = {Root},
title = {{$TITLE}},
year = {2024},
version = {V1},
doi = {https://clinicaltrials.gov/study//NCT00080262},
url = {http://localhost:8080/citation?persistentId=perma:https://clinicaltrials.gov/study/NCT00080262}
}
-> L1 seems fine, but the doi property has an extra slash in a different position (before the unique part of the Permalink)
EndNote XML:
<electronic-resource-num>perma/https://clinicaltrials.gov/study//NCT00080262</electronic-resource-num>
-> same issue (extra slash before the unique part of the Permalink)
RIS citation is still fine.
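To illustrate how a hardcoded / could produce both outputs seen in this thread, here is a minimal sketch. It is not the actual DataCitation or GlobalId code: the Pid class, the method names, and the way the managed Permalink is split into authority and identifier are all assumptions made for illustration.

```java
class SeparatorSketch {
    /** Hypothetical stand-in for a parsed PID; this is NOT Dataverse's GlobalId class. */
    static final class Pid {
        final String authority;
        final String separator;
        final String identifier;

        Pid(String authority, String separator, String identifier) {
            this.authority = authority;
            this.separator = separator;
            this.identifier = identifier;
        }
    }

    /** Unmanaged permalink: with no separator to split on, take the first four chars as the authority. */
    static Pid parseUnmanaged(String raw) {
        return new Pid(raw.substring(0, 4), "", raw.substring(4));
    }

    /** What the citation output appears to do: always join with "/". */
    static String joinWithHardcodedSlash(Pid pid) {
        return pid.authority + "/" + pid.identifier;
    }

    /** Alternative: join with whatever separator the provider defines (possibly empty). */
    static String joinWithProviderSeparator(Pid pid) {
        return pid.authority + pid.separator + pid.identifier;
    }

    public static void main(String[] args) {
        Pid unmanaged = parseUnmanaged("https://clinicaltrials.gov/study/NCT00080262");
        // Assumed split for the managed case: the authority ends in "/" and the separator is empty.
        Pid managed = new Pid("https://clinicaltrials.gov/study/", "", "NCT00080262");

        System.out.println(joinWithHardcodedSlash(unmanaged));    // http/s://clinicaltrials.gov/study/NCT00080262
        System.out.println(joinWithHardcodedSlash(managed));      // https://clinicaltrials.gov/study//NCT00080262
        System.out.println(joinWithProviderSeparator(unmanaged)); // https://clinicaltrials.gov/study/NCT00080262
        System.out.println(joinWithProviderSeparator(managed));   // https://clinicaltrials.gov/study/NCT00080262
    }
}
```

Under these assumptions, joining with the provider's own separator gives the same string for the managed and unmanaged cases, which is why delegating to the PID-specific code looks preferable to patching the / in the citation code.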
Cool. I see https://github.com/IQSS/dataverse/blob/b67d732921a3e84d4450a5ee18790aeab07afaed/src/main/java/edu/harvard/iq/dataverse/DataCitation.java#L295-L298 which is where the hardcoded doi and / come from. I'm not sure what BibTeX allows for non-DOIs - looks like url is an option according to https://www.bibtex.com/g/bibtex-format/.
Yeah, I agree, "url" sounds like a good option when "doi" isn't available.
I just checked a dataset that uses Handles (https://data.cimmyt.org/dataset.xhtml?persistentId=hdl:11529/10016) and the BibTeX output includes a false DOI like this:
doi = {11529/10016},
So yeah, it would probably be good to do something here to not assume DOIs.
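One possible shape for "don't assume DOIs" in the BibTeX export is sketched below. This is only an illustration of the idea discussed here, not a proposed patch: the class and method names are hypothetical, the protocol is passed in as a plain string rather than read from GlobalId, and the DOI value in the example is made up.

```java
class BibtexIdFieldSketch {
    /**
     * Builds the identifier lines of a BibTeX entry.
     * The doi field is emitted only for real DOIs; the url field is always present,
     * so Handles and Permalinks are still resolvable from the citation.
     * All parameter names are illustrative assumptions.
     */
    static String identifierLines(String protocol, String pidBody, String citationUrl) {
        StringBuilder sb = new StringBuilder();
        if ("doi".equalsIgnoreCase(protocol)) {
            sb.append("doi = {").append(pidBody).append("},\n");
        }
        sb.append("url = {").append(citationUrl).append("}");
        return sb.toString();
    }

    public static void main(String[] args) {
        // Handle from the thread: no doi line should appear.
        System.out.println(identifierLines("hdl", "11529/10016",
                "https://data.cimmyt.org/dataset.xhtml?persistentId=hdl:11529/10016"));
        // Made-up DOI, for contrast: doi and url are both emitted.
        System.out.println(identifierLines("doi", "10.5072/FK2/EXAMPLE",
                "https://doi.org/10.5072/FK2/EXAMPLE"));
    }
}
```

Whether a non-DOI identifier should instead go into some other BibTeX field is an open question in this thread; the sketch simply drops the doi field and relies on url.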
What steps does it take to reproduce the issue?
https://clinicaltrials.gov/study/NCT00080262
Two problems:
I'm seeing weird output in the BibTeX output in L1 and in the doi line (missing http and an extra slash after http). In the EndNote XML output, there is also an extra slash in <electronic-resource-num>. I briefly checked the code (BibTeX, EndNote XML) and I'm not sure why. The RIS citation is fine.
doi since it's not a DOI
BibTeX:
EndNote XML:
<?xml version='1.0' encoding='UTF-8'?><xml><records><record><ref-type name="Dataset">59</ref-type><contributors><authors>...</authors></contributors><titles><title>...</title></titles><section>...</section><dates><year>...</year></dates><edition>...</edition><publisher>...</publisher><urls><related-urls><url>http://localhost:8080/citation?persistentId=perma:https://clinicaltrials.gov/study/NCT00080262</url></related-urls></urls><electronic-resource-num>perma/http/s://clinicaltrials.gov/study/NCT00080262</electronic-resource-num></record></records></xml>
Which version of Dataverse are you using?
6.2
Any related open or closed issues to this bug report?
not aware
Screenshots:
-
Are you thinking about creating a pull request for this issue?
yes, would be interested