Closed jordanpadams closed 1 year ago
@collinss-jpl per this comment: "or within the landing page URL"
Is there something within our DOI Service that requires the URLs to have the identifier in the URL? Or were you just noting we could not find the identifiers anywhere, so we could not extract them automatically?
The latter. There's functionality in the service that attempts to parse an identifier from the landing page URL query params as a last resort when it can't find an identifier anywhere else. It is not a strict requirement of the service that the URLs contain the identifier.
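For readers following along, here is a minimal sketch of what such a last-resort lookup could look like. It is not the DOI Service's actual code: the function name and the candidate query parameter names (`identifier`, `lidvid`, `lid`) are assumptions made up for illustration.

```python
from urllib.parse import urlparse, parse_qs

# Hypothetical parameter names; the real service may inspect different ones.
CANDIDATE_PARAMS = ("identifier", "lidvid", "lid")


def identifier_from_landing_page(url):
    """Return the first identifier-like query parameter value in url, or None."""
    query = parse_qs(urlparse(url).query)
    for name in CANDIDATE_PARAMS:
        values = query.get(name)
        if values:
            return values[0]
    # Nothing usable in the query string, so the caller has to give up.
    return None
```

A landing page URL that follows a different scheme and carries no such parameter simply yields `None`, which matches the "could not extract them automatically" case discussed above.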
@collinss-jpl 👍 excellent. just wanted to make sure I understood this correctly.
Email sent to SBN about this. will leave this ticket open for now as blocked, but since they have been notified, we may just close this out as completed since it is largely out of our hands at this point
@collinss-jpl another quick question: those JSON files attached contain all the SBN records? Or just the offending ones?
@jordanpadams All of them. They should be representative of what is returned when querying DataCite for all records with the associated SBN prefix.
copy thanks.
closing this as completed as SBN was contacted and they are aware of this. they can fix them at their leisure or not, but the syncing capability is working well, so I think we can call this good
Regarding the incompatibility of the current SBN DOI records:
Attached are dumps of the DOI records returned from DataCite for each of the SBN prefixes (the .txt extension can be safely removed): doi.10.26007.json.txt doi.10.26033.json.txt
Bad records: doi.10.26007.bad.json.txt
doi.10.26033.bad.json.txt
The 10.26007 prefix contains 363 records, of which only 24 are usable by the service as-is. For 10.26033, there are 135, of which only 1 (!) was importable.
Records which are not importable fall into one of two categories (or both):
identifiers section of the JSON or within the landing page URL (SBN seems to have its own landing page scheme?)

For 10.26007, 90 records are "findable", and for 10.26033 there are 115 "findable" records. All others are in "draft" state and probably don't need immediate correcting.
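A rough sketch of how the "findable" versus "draft" tallies could be reproduced from one of the attached dumps. It assumes each dump has the shape of a DataCite REST API response, i.e. a top-level `data` list whose entries carry an `attributes.state` field; the helper names are hypothetical and not part of the service.

```python
import json
from collections import Counter


def load_dump(path):
    """Load a JSON dump from disk (after removing the .txt extension)."""
    with open(path, encoding="utf-8") as fh:
        return json.load(fh)


def count_states(dump):
    """Tally records by attributes.state; records without a state count as 'unknown'."""
    records = dump.get("data", [])
    return Counter(
        record.get("attributes", {}).get("state", "unknown") for record in records
    )
```

For example, `count_states(load_dump("doi.10.26007.json"))` would give a per-state count for that prefix, provided the file really follows the assumed layout.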
Originally posted by @collinss-jpl in https://github.com/NASA-PDS/doi-service/issues/312#issuecomment-1023647575