plazi / treatmentBank

Repository devoted to house keeping of treatmentBank

articles with ZenodoDOI but no Zenodo deposition ID in srsstats #112

Open myrmoteras opened 6 months ago

myrmoteras commented 6 months ago

@gsautter why do we have articles that have an article DOI minted by Zenodo, but no Zenodo deposition ID?

https://tb.plazi.org/GgServer/srsStats/stats?outputFields=doc.articleUuid+pubLnk.articleDoi+pubLnk.articleZenodoDepId&groupingFields=doc.articleUuid+pubLnk.articleDoi&FP-pubLnk.articleDoi=%25zenodo%25&FP-pubLnk.articleZenodoDepId=0&FA-pubLnk.articleZenodoDepId=count-distinct&format=HTML
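For anyone who wants to vary the query, here is a minimal sketch that rebuilds the link above from its parts. The parameter names are copied from the URL itself; reading `FP-` as a filter and `FA-` as an aggregate is an assumption based on how they are used here, and `build_stats_url` is a hypothetical helper, not part of TB.

```python
from urllib.parse import urlencode

# Endpoint taken from the link above.
BASE_URL = "https://tb.plazi.org/GgServer/srsStats/stats"


def build_stats_url(output_fields, grouping_fields, filters, aggregates, fmt="HTML"):
    """Assemble a stats URL (hypothetical helper).

    Field lists are joined with spaces, which urlencode turns into '+',
    matching the separators seen in the original link.
    """
    params = [
        ("outputFields", " ".join(output_fields)),
        ("groupingFields", " ".join(grouping_fields)),
    ]
    # Assumption: 'FP-<field>' filters on a field, 'FA-<field>' aggregates it.
    params += [(f"FP-{field}", value) for field, value in filters]
    params += [(f"FA-{field}", value) for field, value in aggregates]
    params.append(("format", fmt))
    return f"{BASE_URL}?{urlencode(params)}"


url = build_stats_url(
    output_fields=["doc.articleUuid", "pubLnk.articleDoi", "pubLnk.articleZenodoDepId"],
    grouping_fields=["doc.articleUuid", "pubLnk.articleDoi"],
    # '%' is percent-encoded to '%25' by urlencode, as in the original link.
    filters=[("pubLnk.articleDoi", "%zenodo%"), ("pubLnk.articleZenodoDepId", "0")],
    aggregates=[("pubLnk.articleZenodoDepId", "count-distinct")],
)
print(url)
```

Passing the filters as a list of pairs rather than a dict keeps the parameter order stable, so the result can be compared directly against a link pasted in this thread.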

gsautter commented 6 months ago

From what it looks like, these are articles that were uploaded to Zenodo by other means than from TB proper, and the article DOI came in with the metadata just like any other DOI ... that sure is true for all the ZooBank stubs, and also for all the documents that were uploaded as XMLs (everything pre-2016, Mammal Species of the World 3rd ed. (from that ostensibly clean CSV), etc.) ... looking into the PDFs uploaded from 2021 onward.

Here I added some more info: https://tb.plazi.org/GgServer/srsStats/stats?outputFields=doc.uuid+doc.name+doc.articleUuid+doc.uploadUser+doc.uploadDate+doc.updateUser+doc.updateDate+pubLnk.articleDoi+pubLnk.articleZenodoDepId&groupingFields=doc.name+doc.articleUuid+doc.uploadUser+doc.uploadDate+doc.updateUser+doc.updateDate+pubLnk.articleDoi+pubLnk.articleZenodoDepId&orderingFields=doc.uploadUser&limit=400&FP-pubLnk.articleDoi=%22%25zenodo%25%22&FP-pubLnk.articleZenodoDepId=0&format=HTML

gsautter commented 6 months ago

The remaining articles are these: https://tb.plazi.org/GgServer/srsStats/stats?outputFields=doc.uuid+doc.name+doc.articleUuid+doc.uploadUser+doc.uploadDate+doc.updateUser+doc.updateDate+pubLnk.articleDoi+pubLnk.articleGbifId+pubLnk.articleClbId+pubLnk.articleZenodoDepId&groupingFields=doc.name+doc.articleUuid+doc.uploadUser+doc.uploadDate+pubLnk.articleDoi+pubLnk.articleGbifId+pubLnk.articleClbId+pubLnk.articleZenodoDepId&orderingFields=doc.name&limit=400&FP-doc.name=!%22%25.xml%22%20!%22%25.html%22%20!%22%25.htm%22&FP-doc.uploadUser=!ZooBank&FP-doc.uploadDate=%222021-01-01%22-&FP-pubLnk.articleDoi=%22%25zenodo%25%22&FP-pubLnk.articleZenodoDepId=0&format=HTML

Looks as though the "Dipteron" articles were uploaded to Zenodo by someone else and the DOI was entered with the document metadata like any other ... in that case, there is no deposition number on our end.

The "ZoolStud.55" articles are linked to some really strange depositions (all of them actually treatments from some other publications) ... the only way I can imagine that happened is that someone accidentally entered the wrong DOI to pull the article metadata ... definitely needs manual correction.

The lone "InsectaMundi" seems to be linked to a faulty DOI as well, most likely also entered to fetch the metadata, same for the lone "ActaEntMusNatPra" (which is actually linked to a figure we uploaded) ... these also need manual correction.

The "zt" (Zootaxa) all have DOIs pointing to some other depositions as well, some of which are other Zootaxa uploaded by us, and some are figures from Zootaxa that we uploaded ... these as well need manual correction.

The "RevSuisseZool" actually were uploaded by us, but long before we actually imported them (Zenodo depositions created in 2015 and 2016, document ingested in TB in 2021) ... added the attributes to the IMFs.

The "TaxMonogNeotropHymenopt" were also uploaded by us, but manually rather than through the automated export from TB (the TB export always names the file source.pdf), and then the DOI, again, must have been entered to get the metadata ... also added these attributes to the IMFs.
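To keep the findings above in one place, here is a rough sketch that maps a document name to the outcome described for its group. It assumes the quoted labels are prefixes of `doc.name` (the "Dipteron%" filter used in this thread suggests that at least for that group); the status strings, the `triage` helper and the sample file names are made up for illustration.

```python
# Status labels are illustrative, summarizing the findings in this comment.
EXTERNAL = "uploaded to Zenodo by someone else, no deposition number on our end"
MANUAL = "needs manual correction"
FIXED = "attributes added to the IMFs"

# Assumption: each label is a prefix of doc.name.
TRIAGE_BY_PREFIX = {
    "Dipteron": EXTERNAL,
    "ZoolStud.55": MANUAL,
    "InsectaMundi": MANUAL,
    "ActaEntMusNatPra": MANUAL,
    "zt": MANUAL,
    "RevSuisseZool": FIXED,
    "TaxMonogNeotropHymenopt": FIXED,
}


def triage(doc_name):
    """Return the outcome for a document name, or None if no group matches."""
    for prefix, status in TRIAGE_BY_PREFIX.items():
        if doc_name.startswith(prefix):
            return status
    return None


# Hypothetical file names, only to show the lookup.
for name in ["zt01234p001.pdf", "RevSuisseZool.example.pdf", "SomethingElse.pdf"]:
    print(name, "->", triage(name))
```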

gsautter commented 6 months ago

For reference, the articles that need manual correction are these: https://tb.plazi.org/GgServer/dioStats/stats?outputFields=doc.articleUuid+doc.name+doc.doi+doc.zenodoDepId+doc.uploadUser+doc.uploadDate+doc.updateUser+doc.updateDate+treat.id&groupingFields=doc.articleUuid+doc.name+doc.doi+doc.zenodoDepId+doc.uploadUser+doc.uploadDate+doc.updateUser+doc.updateDate&orderingFields=doc.name+-doc.zenodoDepId&limit=100&FP-doc.name=!%22%25.xml%22%20!%22%25.html%22%20!%22%25.htm%22%20!%22Dipteron%25%22&FP-doc.doi=%22%25zenodo%25%22&FP-doc.zenodoDepId=0&FP-doc.uploadDate=%222021-01-01%22-&format=HTML
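Since the filters in that link are hard to read in percent-encoded form, a small sketch that decodes them may help. It only uses the URL as posted; the reading of the decoded values ('!' as negation, '%' as wildcard, a trailing '-' as an open-ended range) is an assumption, not something confirmed in this thread.

```python
from urllib.parse import parse_qsl, urlsplit

# The link from the comment above, split across lines for readability.
URL = (
    "https://tb.plazi.org/GgServer/dioStats/stats"
    "?outputFields=doc.articleUuid+doc.name+doc.doi+doc.zenodoDepId"
    "+doc.uploadUser+doc.uploadDate+doc.updateUser+doc.updateDate+treat.id"
    "&groupingFields=doc.articleUuid+doc.name+doc.doi+doc.zenodoDepId"
    "+doc.uploadUser+doc.uploadDate+doc.updateUser+doc.updateDate"
    "&orderingFields=doc.name+-doc.zenodoDepId&limit=100"
    "&FP-doc.name=!%22%25.xml%22%20!%22%25.html%22%20!%22%25.htm%22%20!%22Dipteron%25%22"
    "&FP-doc.doi=%22%25zenodo%25%22&FP-doc.zenodoDepId=0"
    "&FP-doc.uploadDate=%222021-01-01%22-&format=HTML"
)

# parse_qsl undoes the percent-encoding and turns '+' back into spaces.
params = parse_qsl(urlsplit(URL).query)

# Keep only the 'FP-' parameters, keyed by the field they apply to.
filters = {key[len("FP-"):]: value for key, value in params if key.startswith("FP-")}

for field, value in filters.items():
    print(f"{field}: {value}")
```

Decoded, the name filter reads `!"%.xml" !"%.html" !"%.htm" !"Dipteron%"`, which lines up with the exclusions described earlier: no XML/HTML uploads and no "Dipteron" articles.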