plazi / treatmentBank

Repository devoted to housekeeping of treatmentBank

Specimen and collection code httpURI mixed up #58

Open myrmoteras opened 2 years ago

myrmoteras commented 2 years ago

In this search, it seems that a collection code PID has been added to the specimen code: https://tb.plazi.org/GgServer/srsStats/stats?outputFields=doc.uuid+doc.doi+doc.wikiDataId+doc.gbifTaxonId+doc.uploadYear+bib.source+tax.name+matCit.gbifOccurrenceId+matCit.gbifSpecimenId+matCit.verbatimMatCit+matCit.specimenHttpUri&groupingFields=doc.uuid+doc.doi+doc.wikiDataId+doc.gbifTaxonId+doc.uploadYear+bib.source+tax.name+matCit.gbifOccurrenceId+matCit.gbifSpecimenId+matCit.verbatimMatCit+matCit.specimenHttpUri&limit=100&FP-doc.doi=1-&FP-doc.gbifTaxonId=1-&FP-doc.uploadYear=2022&FP-bib.source=%22European%20Journal%20of%20Taxonomy%22&FP-matCit.specimenHttpUri=1-&format=HTML

For example, https://tb.plazi.org/GgServer/html/038F87A6610ADA49FD8E52CD2781FACE or https://tb.plazi.org/GgServer/html/03D08794FFDDFFF0ECE594345C8EFA70

gsautter commented 2 years ago

It looks like the collection codes were tagged first (and correctly assigned the GRSciColl httpUri), then extended to include the specimen codes, and finally re-typed as specimen codes. That is the only way I can think of that this constellation came to be.

Should I introduce a QC rule that detects such httpUris on specimen codes, e.g. going by prefixes http://biocol.org/urn and http://grbio.org/cool?
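
A rough sketch of what such a prefix check could look like, in Python purely for illustration. It is not the actual QC rule format used by the server. The record layout (dicts with `specimenCode` and `specimenHttpUri` keys) and the function names are assumptions; only the two prefixes come from the question above.

```python
# Prefixes named in the comment above. Treating any httpUri that starts
# with one of them as a collection-level identifier is an assumption.
COLLECTION_URI_PREFIXES = (
    "http://biocol.org/urn",
    "http://grbio.org/cool",
)


def is_collection_uri(http_uri):
    """Return True if http_uri starts with one of the collection prefixes."""
    if not http_uri:
        return False
    # str.startswith accepts a tuple and matches any of its members.
    return http_uri.strip().lower().startswith(COLLECTION_URI_PREFIXES)


def flag_suspect_specimen_codes(records):
    """Return the records whose specimen httpUri looks like a collection URI.

    Each record is a hypothetical dict with 'specimenCode' and
    'specimenHttpUri' keys; records without an httpUri are never flagged.
    """
    return [r for r in records if is_collection_uri(r.get("specimenHttpUri"))]
```

Matching on the prefix alone keeps the rule cheap and independent of any registry lookup, at the cost of missing collection identifiers minted under other namespaces.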