adsabs / ADSIngestParser

Curation parser library
MIT License

Ensure parsing of DataCite Documents that have no primary DOI in metadata #34

Open aaccomazzi opened 1 year ago

Some Zenodo records that we harvest from the astronomy thesis collection have DOIs issued by a provider other than Zenodo. An example of such a record is https://zenodo.org/record/7371814, where Zenodo is simply used to re-publish content already registered by the university. For comparison, this thesis was registered with a Zenodo DOI: https://zenodo.org/record/7213579.

As a result, the DataCite record for https://zenodo.org/record/7371814 contains the original DOI in a relatedIdentifier element rather than at the top level:

     <relatedIdentifier relatedIdentifierType="DOI" relationType="IsIdenticalTo">10.25972/OPUS-29007</relatedIdentifier>

In contrast, the record https://zenodo.org/record/7213579 contains the Zenodo-minted DOI at the top level:

     <identifier identifierType="DOI">10.5281/zenodo.7213579</identifier>

We want to make sure that our parsing library can handle both cases. For the first, it should do so via an option that explicitly allows parsing republished (as opposed to primary) DataCite records.
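To make the requested behavior concrete, here is a minimal sketch of the fallback logic using only the Python standard library. It is not the ADSIngestParser API: the function name `extract_doi` and the option name `allow_republished` are hypothetical, and the sample documents are trimmed-down stand-ins built from the two elements quoted above, not the full Zenodo records.

```python
import xml.etree.ElementTree as ET


def _local(tag):
    """Strip any XML namespace prefix, leaving the bare element name."""
    return tag.rsplit("}", 1)[-1]


def extract_doi(xml_text, allow_republished=False):
    """Return the DOI for a DataCite record, or None if none is usable.

    `allow_republished` is a hypothetical option name. When True, a DOI in a
    relatedIdentifier with relationType="IsIdenticalTo" is accepted if the
    record has no top-level DOI identifier.
    """
    root = ET.fromstring(xml_text)
    related = None
    for el in root.iter():
        name = _local(el.tag)
        text = (el.text or "").strip()
        if name == "identifier" and el.get("identifierType") == "DOI" and text:
            return text  # a primary DOI always takes precedence
        if (
            related is None
            and name == "relatedIdentifier"
            and el.get("relatedIdentifierType") == "DOI"
            and el.get("relationType") == "IsIdenticalTo"
            and text
        ):
            related = text  # remember the first identical-to DOI as a fallback
    return related if allow_republished else None


# Trimmed-down sample records (assumed structure, for illustration only).
PRIMARY = """<resource xmlns="http://datacite.org/schema/kernel-4">
  <identifier identifierType="DOI">10.5281/zenodo.7213579</identifier>
</resource>"""

REPUBLISHED = """<resource xmlns="http://datacite.org/schema/kernel-4">
  <relatedIdentifiers>
    <relatedIdentifier relatedIdentifierType="DOI" relationType="IsIdenticalTo">10.25972/OPUS-29007</relatedIdentifier>
  </relatedIdentifiers>
</resource>"""

print(extract_doi(PRIMARY))
print(extract_doi(REPUBLISHED))
print(extract_doi(REPUBLISHED, allow_republished=True))
```

Keeping the fallback behind an explicit flag means a record without a primary DOI is still rejected by default, so republished records are only accepted when the caller opts in.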