plazi / treatmentBank

Repository devoted to house keeping of treatmentBank

wrong zenodo ID for publication #86

Open myrmoteras opened 1 year ago

myrmoteras commented 1 year ago

In this case, an odd ID is provided for the cited publication:

https://zenodo.org/record/7784879#.ZD0U2HZBz8A
https://zenodo.org/record/47096#.ZD0U6HZBz8A

gsautter commented 1 year ago

This must be one of the very early cases where deposition number and record number are not the same ... the article record is this: https://zenodo.org/record/24918 , but when you hit "Edit", it takes you to https://zenodo.org/deposit/47096

According to @slint , this phenomenon exists for a few very early depositions and records (the record in question was created on Oct 8, 2015) ... hard one to resolve without major modifications and introduction of considerable redundancies on our end ... not sure it's worthwhile doing for what is likely merely a handful of articles ...

A little binary search up and down the low deposition numbers seems to identify 154,000 as the threshold from which onward deposition numbers match the associated record numbers ... the articles we have before that are few: https://tb.plazi.org/GgServer/dioStats/stats?outputFields=doc.articleUuid+doc.name+doc.doi+doc.zenodoDepId&groupingFields=doc.articleUuid+doc.name+doc.doi+doc.zenodoDepId&orderingFields=doc.zenodoDepId&FP-doc.zenodoDepId=1-154000&format=HTML
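The threshold search described above can be sketched roughly as follows. This is only an illustration: `ids_match` is a hypothetical predicate (the real check would have to look at Zenodo itself), and the sketch assumes the predicate is false below some threshold and true from there on, which may not hold exactly for the early IDs.

```python
from typing import Callable


def first_matching_id(ids_match: Callable[[int], bool], lo: int, hi: int) -> int:
    """Return the smallest n in [lo, hi] for which ids_match(n) is True.

    Assumes ids_match is False below some threshold and True from that
    threshold onward, and that ids_match(hi) is True.
    """
    while lo < hi:
        mid = (lo + hi) // 2
        if ids_match(mid):
            hi = mid        # threshold is at mid or below
        else:
            lo = mid + 1    # threshold is above mid
    return lo


# Stand-in predicate for illustration only; it hard-codes the threshold
# reported in this thread instead of querying Zenodo.
def fake_ids_match(dep_id: int) -> bool:
    return dep_id >= 154_000


threshold = first_matching_id(fake_ids_match, 1, 1_000_000)
```

With a real predicate, a few spot checks on either side of the result would still be advisable, since isolated early IDs could match by coincidence.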

gsautter commented 1 year ago

Resolving this would take adding to those articles listed above an extra ID-Zenodo-Rec field and using it preferentially (over ID-Zenodo-Dep) where the record number is required rather than the deposition number ... figuring out the respective numbers would have to be pretty much a manual process, as (a) it inevitably requires interaction with the Zenodo website and (b) the relatively low number of affected articles doesn't really justify automation.
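A minimal sketch of the "prefer `ID-Zenodo-Rec` over `ID-Zenodo-Dep`" rule, assuming the article attributes can be read as a simple string-to-string mapping. The two field names come from this thread; the function names, the dict representation, and the second example ID are assumptions for illustration, not the actual treatmentBank code.

```python
from typing import Mapping, Optional

REC_FIELD = "ID-Zenodo-Rec"  # proposed extra field, set only where the IDs differ
DEP_FIELD = "ID-Zenodo-Dep"  # existing deposition number field


def zenodo_record_id(attrs: Mapping[str, str]) -> Optional[str]:
    """ID to use where the record number is required (e.g. record links)."""
    return attrs.get(REC_FIELD) or attrs.get(DEP_FIELD)


def zenodo_deposit_id(attrs: Mapping[str, str]) -> Optional[str]:
    """ID to use where the deposition number is required (e.g. editing)."""
    return attrs.get(DEP_FIELD)


# The early article from this issue: record 24918, deposition 47096.
early = {DEP_FIELD: "47096", REC_FIELD: "24918"}
# A later article with a made-up ID, where both numbers are assumed equal.
later = {DEP_FIELD: "1234567"}

early_link = f"https://zenodo.org/record/{zenodo_record_id(early)}"
later_link = f"https://zenodo.org/record/{zenodo_record_id(later)}"
```

Articles without the extra field fall back to the deposition number, so only the affected early articles would need to be touched.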

slint commented 1 year ago

An alternative is to check if we can fix these mismatches in case the same-numbered Deposit IDs are "free".

In this case, for https://zenodo.org/record/24918, the equivalent https://zenodo.org/deposit/24918 is free, so we could do some juggling to simplify things. This might not be the case for all of the records though, I can run the numbers tomorrow for the possibly affected records to see if we're saving much and get back here.

gsautter commented 1 year ago

@slint thanks, that would be great, and vastly helpful ... if we can boil this down to a handful of cases, the remaining few should be quite possible to sort out whichever way we identify as the most viable or least invasive.

slint commented 1 year ago

It looks like 952 out of the 4853 mismatching deposit IDs are available for the swap I mentioned. It's not a lot, but we can schedule to make the swap next week.

gsautter commented 1 year ago

@slint that would be vastly helpful already, and also iron out a few odd cases ... however, it still leaves open the question of what to do about the others. I think what we'll ultimately need to do to solve this is to store both IDs in the source IMFs on our end, and to use the alternative where required (most likely for editing, as that is far fewer places in the code to make aware of the distinction) ...

Unfortunately, it's rather tedious to get the actual deposition number of a given record number ... only way I found is to open the record page and then look at the link on the "Edit" button ... is there a way of doing a similar thing via the API? Or might it be possible for you to export a list of pairs of record numbers and associated deposition numbers for us? The latter would most likely be the fastest, as it saves thousands of API lookups ...
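If such a list of pairs were exported, consuming it could look roughly like the sketch below. The CSV layout and the column names `record_id` and `deposit_id` are purely hypothetical, since no export format has been agreed on in this thread; the fallback to the record number assumes the list covers every mismatching pair.

```python
import csv
import io
from typing import Dict, TextIO


def load_rec_to_dep(source: TextIO) -> Dict[str, str]:
    """Read 'record_id,deposit_id' rows (hypothetical format) into a lookup."""
    reader = csv.DictReader(source)
    return {row["record_id"]: row["deposit_id"] for row in reader}


def deposit_id_for(record_id: str, table: Dict[str, str]) -> str:
    """Return the deposition number for a record number.

    Records missing from the table are assumed to have matching numbers.
    """
    return table.get(record_id, record_id)


# Sample data using the one pair known from this issue.
sample = io.StringIO("record_id,deposit_id\n24918,47096\n")
rec_to_dep = load_rec_to_dep(sample)
edit_url = f"https://zenodo.org/deposit/{deposit_id_for('24918', rec_to_dep)}"
```

A single in-memory lookup like this would avoid the per-article API or website round trips mentioned above.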