plazi / treatmentBank

Repository devoted to house keeping of treatmentBank

zookeys doublette #88

Closed myrmoteras closed 1 year ago

myrmoteras commented 1 year ago

This ZooKeys article has two copies. @gsautter, can you please check? Thanks.

| | | | | | |
| -- | -- | -- | -- | -- | -- |
| 1 | [FF8D0423FFD8FFD9B85BFF8E7F74FFC3](https://tb.plazi.org/GgServer/summary/FF8D0423FFD8FFD9B85BFF8E7F74FFC3) | 0 | Kawahara, Akito Y. & Rubinoff, Daniel | Three new species of Fancy Case caterpillars from threatened forests of Hawaii (Lepidoptera, Cosmopterigidae, Hyposmocoma) | ZooKeys |
| 1 | [FFF17C03FF83141EFF85156ED629F32A](https://tb.plazi.org/GgServer/summary/FFF17C03FF83141EFF85156ED629F32A) | 0 | Kawahara, Akito Y. & Rubinoff, Daniel | Three new species of Fancy Case caterpillars from threatened forests of Hawaii (Lepidoptera, Cosmopterigidae, Hyposmocoma) | ZooKeys |

gsautter commented 1 year ago

True. We actually have duplicates of quite a few Pensoft articles, especially from before Pensoft introduced UUIDs. Back then the only viable means of de-duplication was the article URL, with the known consequences when they moved their journals to their own individual subdomains.

We cleaned up those duplicates years ago, but in that effort decided to keep the articles proper (mainly because we had put UUIDs on them) and only rename the treatments to treatmentDuplicate to get those duplicates out of SRS.

For ZooKeys volume 170 alone, we actually have three pairs of duplicates (mind the upload dates): https://tb.plazi.org/GgServer/dioStats/stats?outputFields=doc.articleUuid+doc.name+doc.uploadUser+doc.uploadDate+doc.updateUser+doc.updateDate+bib.source+bib.volume+bib.issue+bib.firstPage+bib.lastPage+cont.pageCount+cont.treatCount&groupingFields=doc.articleUuid+doc.name+doc.uploadUser+doc.uploadDate+doc.updateUser+doc.updateDate+bib.source+bib.volume+bib.issue+cont.pageCount+cont.treatCount&orderingFields=-bib.firstPage+-bib.lastPage&FP-bib.source=ZooKeys&FP-bib.volume=170&format=HTML
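The statistics link above is a plain GET URL, so the same query can be rebuilt for another journal or volume. The following is a minimal sketch, assuming only the parameter names visible in that link. The function name `build_stats_url` and the constants are hypothetical, no request is sent, and whether the server accepts other parameter combinations is not confirmed here.

```python
from urllib.parse import urlencode

# Endpoint and field names are copied from the link above.
BASE_URL = "https://tb.plazi.org/GgServer/dioStats/stats"

OUTPUT_FIELDS = [
    "doc.articleUuid", "doc.name", "doc.uploadUser", "doc.uploadDate",
    "doc.updateUser", "doc.updateDate", "bib.source", "bib.volume",
    "bib.issue", "bib.firstPage", "bib.lastPage",
    "cont.pageCount", "cont.treatCount",
]
# The link groups by the same fields except the two page fields.
GROUPING_FIELDS = [
    f for f in OUTPUT_FIELDS if f not in ("bib.firstPage", "bib.lastPage")
]
ORDERING_FIELDS = ["-bib.firstPage", "-bib.lastPage"]


def build_stats_url(source, volume, fmt="HTML"):
    """Build a stats URL filtered by journal name and volume (hypothetical helper)."""
    params = {
        # Field lists are space separated; urlencode turns spaces into '+'.
        "outputFields": " ".join(OUTPUT_FIELDS),
        "groupingFields": " ".join(GROUPING_FIELDS),
        "orderingFields": " ".join(ORDERING_FIELDS),
        "FP-bib.source": source,
        "FP-bib.volume": volume,
        "format": fmt,
    }
    return BASE_URL + "?" + urlencode(params)


url = build_stats_url("ZooKeys", "170")
print(url)
```

Called with `"ZooKeys"` and `"170"`, this is meant to reproduce the query linked above; other volumes only change the `FP-bib.volume` value.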

What you can see in those statistics is that the number of treatments is 0 for one document of each pair of duplicates, which is basically the effect of us renaming their treatments to treatmentDuplicate as described above.

Meaning to say: as long as at most one document in such a pair of duplicates has a non-zero number of treatments, this is the result of a legacy mishap that we decided to hold on to for the sake of UUID stability. If more than one document in a pair has a non-zero number of treatments, that does need cleaning up.
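The rule above can be sketched as a small check over exported statistics rows. This is only an illustration under assumptions: the record keys (`source`, `volume`, `first_page`, `last_page`, `treat_count`), the pairing by bibliographic fields, the function name, and the sample rows are all hypothetical and do not come from TreatmentBank.

```python
from collections import defaultdict


def classify_duplicates(records):
    """Classify groups of documents that share the same bibliographic key.

    A group with more than one document is a duplicate set. It counts as a
    "legacy duplicate" if at most one document has treatments, and as
    "needs cleanup" if more than one document has treatments.
    """
    groups = defaultdict(list)
    for rec in records:
        # Assumption: duplicates are paired by journal, volume and page range.
        key = (rec["source"], rec["volume"], rec["first_page"], rec["last_page"])
        groups[key].append(rec)

    result = {}
    for key, docs in groups.items():
        if len(docs) < 2:
            continue  # a single document is not a duplicate
        with_treatments = sum(1 for d in docs if d["treat_count"] > 0)
        result[key] = "needs cleanup" if with_treatments > 1 else "legacy duplicate"
    return result


# Hypothetical sample rows, not real TreatmentBank data.
sample = [
    {"source": "ZooKeys", "volume": "170", "first_page": 1, "last_page": 20, "treat_count": 3},
    {"source": "ZooKeys", "volume": "170", "first_page": 1, "last_page": 20, "treat_count": 0},
    {"source": "ZooKeys", "volume": "170", "first_page": 21, "last_page": 40, "treat_count": 2},
    {"source": "ZooKeys", "volume": "170", "first_page": 21, "last_page": 40, "treat_count": 2},
    {"source": "ZooKeys", "volume": "170", "first_page": 41, "last_page": 60, "treat_count": 5},
]
report = classify_duplicates(sample)
for key, status in report.items():
    print(key, status)
```

Grouping by page range rather than by UUID is a deliberate choice in this sketch, since the duplicates discussed here carry different document identifiers.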