datacite / bolognese

Ruby gem and command-line utility for conversion of DOI metadata
MIT License

Funder DOI validation #108

Closed prdanelli closed 3 years ago

prdanelli commented 3 years ago

Hello.

First of all, thank you to everyone involved in this project. It's been a real time saver in getting to grips with all kinds of DOI functionality I'm building at the moment.

I believe I've found an issue with the way Funder DOIs are validated:

https://github.com/datacite/bolognese/blob/e21c858411db610d48365c653cf262b0c2a5235f/lib/bolognese/doi_utils.rb#L12

doi = ... (10\.13039\/)?(5.+)\z/.match(doi)).last seems to require that funder IDs start with a 5, because the final capture group is (5.+).

However, the following Funder DOI is currently in use: 10.13039/100000050

http://data.crossref.org/fundingdata/funder/10.13039/100000050

By replacing the regex with /\A(?:(http|https):\/(\/)?(dx\.)?(doi.org|handle.test.datacite.org)\/)?(doi:)?(10\.13039\/ I was able to validate the DOIs.

2.7.2 :001 > doi = "10.13039/100000050"
 => "10.13039/100000050"
doi = Array(/\A(?:(http|https):\/(\/)?(dx\.)?(doi.org|handle.test.datacite.org)\/)?(doi:)?(10\.13039\/)?(5.+)\z/.match(doi)).last
 => nil
doi = Array(/\A(?:(http|https):\/(\/)?(dx\.)?(doi.org|handle.test.datacite.org)\/)?(doi:)?(10\.13039\/)?(.+)\z/.match(doi)).last
 => "100000050"
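
For anyone who wants to reproduce the comparison outside irb, here is a minimal standalone sketch. The constant and method names (STRICT_FUNDER_REGEX, RELAXED_FUNDER_REGEX, funder_suffix) are made up for illustration and are not part of bolognese; only the two patterns are taken from the session above.

```ruby
# Hypothetical names for illustration only; none of these are bolognese APIs.

# Pattern from the linked doi_utils.rb line: the suffix must start with "5".
STRICT_FUNDER_REGEX =
  /\A(?:(http|https):\/(\/)?(dx\.)?(doi.org|handle.test.datacite.org)\/)?(doi:)?(10\.13039\/)?(5.+)\z/

# Proposed pattern: any non-empty suffix is accepted.
RELAXED_FUNDER_REGEX =
  /\A(?:(http|https):\/(\/)?(dx\.)?(doi.org|handle.test.datacite.org)\/)?(doi:)?(10\.13039\/)?(.+)\z/

# Returns the last capture group (the funder ID suffix), or nil when
# the pattern does not match. Array(nil) is [], so .last is nil.
def funder_suffix(doi, regex)
  Array(regex.match(doi)).last
end

[
  "10.13039/100000050",
  "10.13039/501100000780",
  "https://doi.org/10.13039/501100000780"
].each do |doi|
  strict  = funder_suffix(doi, STRICT_FUNDER_REGEX)
  relaxed = funder_suffix(doi, RELAXED_FUNDER_REGEX)
  puts "#{doi}: strict=#{strict.inspect} relaxed=#{relaxed.inspect}"
end
```

One thing to keep in mind with the relaxed pattern: since every prefix group is optional and the last group is (.+), it returns a value for any non-empty string, so it probably should not be treated as a strict validity check on its own.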

I'd be happy to put this in a PR for you. At the moment I'm having to keep an overriding method locally.

prdanelli commented 3 years ago

@mfenner @chrisgorgo @cjcolvar @kjgarza Sorry for the bump. Do you have any feedback on this issue?

prdanelli commented 3 years ago

@kjgarza I have put in a PR for this that adjusts the funder DOI regex a little to account for the extra variations that now exist. The spec that Crossref provide can be found here: https://www.wikidata.org/wiki/Property:P3153

richardhallett commented 3 years ago

Thanks again for the report and PR. This is now merged and the latest release has the changes.

prdanelli commented 3 years ago

Thank you @richardhallett, that's great.