dandi / dandi-archive

DANDI API server and Web app
https://dandiarchive.org

Try to (re)mint DOIs for released dandisets whose DOI minting previously failed #1953

Open yarikoptic opened 5 months ago

yarikoptic commented 5 months ago

Relates to

Discovered in the course of

ATM we have 2 (only! phew) dandisets for which DOI records failed to be produced:

❯ grep -B1 '<!DOCTYPE html>' dandi.bib | grep DANDISET
# DANDISET 000029
# DANDISET 000252

edit: updated as of 20241101 -- no new missing bib entries, but here is a summary of the other errors we have: datacite errors (against the 4.3 version of the datacite schema we use, which is not the final 4.3) and metadata errors (with the current pydantic model)

❯ grep msg dandi_datacite.meta-errors.json | sort | uniq -c
      4    "msg": "Field required",
      3    "msg": "Input should be 'Anatomy'",
      3    "msg": "Input should be 'Disorder'",
      9    "msg": "String should match pattern '^[a-zA-Z0-9-]+:[a-zA-Z0-9-/\\._]+$'",
❯ grep message dandi_datacite.datacite-errors.json | sort | uniq -c
     16    "message": "'IsPublishedIn' is not one of ['IsCitedBy', 'Cites', 'IsSupplementTo', 'IsSupplementedBy', 'IsContinuedBy', 'Continues', 'IsDescribedBy', 'Describes', 'HasMetadata', 'IsMetadataFor', 'HasVersion', 'IsVersionOf', 'IsNewVersionOf', 'IsPreviousVersionOf', 'IsPartOf', 'HasPart', 'IsReferencedBy', 'References', 'IsDocumentedBy', 'Documents', 'IsCompiledBy', 'Compiles', 'IsVariantFormOf', 'IsOriginalFormOf', 'IsIdenticalTo', 'IsReviewedBy', 'Reviews', 'IsDerivedFrom', 'IsSourceOf', 'IsRequiredBy', 'Requires', 'IsObsoletedBy', 'Obsoletes']",
     90    "message": "'ROR' is not one of ['ISNI', 'GRID', 'Crossref Funder ID', 'Other']",
❯ grep 'No valid' dandi.bib
# No valid BibTeX for 000029/0.210712.1903. Starts with <!DOCTYPE html>
# No valid BibTeX for 000029/0.230317.1553. Starts with <!DOCTYPE html>
# No valid BibTeX for 000029/0.231017.1955. Starts with <!DOCTYPE html>
# No valid BibTeX for 000029/0.231017.1959. Starts with <!DOCTYPE html>
# No valid BibTeX for 000252/0.230408.2207. Starts with <!DOCTYPE html>
# No valid BibTeX for 000252/0.230408.2207. Starts with <!DOCTYPE html>
yarikoptic commented 5 months ago

@djarecka -- IIRC you worked in the past on all that datacite DOI minting. Could you please look into what could have caused those two to miss a DOI, and write a little helper to mint a DOI if it is missing? Then I think we could just add it to the script I prototyped in https://github.com/dandi/dandi-cli/pull/1449, so that we mint missing DOIs whenever we discover such cases while updating the bibliography file.
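A minimal sketch of the "discover such cases" half, assuming the bibliography file keeps the `# No valid BibTeX for <dandiset>/<version>.` marker lines exactly as in the `grep 'No valid' dandi.bib` output above. The function names and the `mint_doi` callable are hypothetical placeholders; the actual minting call would have to come from whatever the archive already uses to talk to datacite.

```python
import re
from collections import defaultdict

# Marker format copied from the `grep 'No valid' dandi.bib` output;
# the version pattern (three dot-separated numbers) is an assumption
# based on the versions seen there, e.g. 0.210712.1903.
_MARKER = re.compile(r"^# No valid BibTeX for (\d{6})/(\d+\.\d+\.\d+)\.")


def find_versions_without_bibtex(bib_text):
    """Return {dandiset_id: sorted list of versions} flagged in the .bib text."""
    missing = defaultdict(set)  # a set drops repeated marker lines
    for line in bib_text.splitlines():
        match = _MARKER.match(line.strip())
        if match:
            missing[match.group(1)].add(match.group(2))
    return {dandiset: sorted(versions) for dandiset, versions in missing.items()}


def remint_missing(bib_text, mint_doi):
    """Call mint_doi(dandiset_id, version) for every flagged version.

    mint_doi is a hypothetical callable supplied by the caller; this
    sketch does not know how the real minting is performed.
    Returns the list of (dandiset_id, version) pairs it was called for.
    """
    attempted = []
    for dandiset, versions in sorted(find_versions_without_bibtex(bib_text).items()):
        for version in versions:
            mint_doi(dandiset, version)
            attempted.append((dandiset, version))
    return attempted
```

Run against the current file, this would report four versions of 000029 and one of 000252. Collecting versions in a set means a marker that appears twice, as for 000252/0.230408.2207, triggers only one minting attempt.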