plazi / treatmentBank

Repository devoted to housekeeping of treatmentBank

Transit stats: treatments #101

Open myrmoteras opened 1 year ago

myrmoteras commented 1 year ago

@gsautter is there a way to see which treatments have been passed to SIB-TaxPub? Right now I can only find data about articles, but not how many treatments of an article have been passed.

see eg https://tb.plazi.org/GgServer/dtpStats/stats?outputFields=doc.docUuid+doc.docName+doc.updateDateTime+transits.dest+docDoc.articleUuid+docBib.title+docBib.source&groupingFields=doc.docUuid+doc.docName+doc.updateDateTime+transits.dest+docDoc.articleUuid+docBib.title+docBib.source&orderingFields=-doc.updateDateTime&FP-doc.updateDateTime=%222023-08-25%25%22-&FP-transits.dest=%22SiB-TaxPub%22&format=HTML

For example, it is not clear whether Phytochemistry FFBA981D6C02204D1374FFFFBE19FFC9 is in SIBiLS as a treatment or as an article.

Maybe add a field (or a couple of fields) to the transit stats, which for treatments could also include the verbatim taxon name.

d

gsautter commented 1 year ago

If you "count distinct values" of the Data Detail ID field (same field group as you select Transit Destination from), you get the number of treatments that were exported for that article: https://tb.plazi.org/GgServer/pdsStats/stats?outputFields=doc.docUuid+doc.name+doc.updateDateTime+bib.title+bib.source+docTransits.detailId+docTransits.dest&groupingFields=doc.docUuid+doc.name+doc.updateDateTime+bib.title+bib.source+docTransits.dest&orderingFields=-doc.updateDateTime&FP-doc.updateDateTime=%222023-08-25%25%22-&FP-docTransits.dest=%22SiB-TaxPub%22&format=HTML
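For anyone who wants to vary the date or destination without clicking through the stats form, here is a minimal sketch that reassembles the link above from its parts. It only builds the URL and does not fetch anything. The function name `build_stats_url` and its argument layout are made up for illustration; the parameter names and values are copied from the link, and the reading of `+` as a field separator is just standard form encoding, not something confirmed against the server code.

```python
from urllib.parse import urlencode

# Endpoint taken from the link above.
BASE_URL = "https://tb.plazi.org/GgServer/pdsStats/stats"


def build_stats_url(output_fields, grouping_fields, ordering_fields, filters, fmt="HTML"):
    """Assemble a stats URL (hypothetical helper, not part of GgServer).

    filters is a list of (field, predicate) pairs; each becomes an
    'FP-<field>' query parameter, as seen in the link above.
    """
    params = [
        # Field lists are space-separated; urlencode turns spaces into '+'.
        ("outputFields", " ".join(output_fields)),
        ("groupingFields", " ".join(grouping_fields)),
        ("orderingFields", " ".join(ordering_fields)),
    ]
    params += [("FP-" + field, predicate) for field, predicate in filters]
    params.append(("format", fmt))
    return BASE_URL + "?" + urlencode(params)


doc_fields = ["doc.docUuid", "doc.name", "doc.updateDateTime", "bib.title", "bib.source"]

url = build_stats_url(
    # docTransits.detailId is output but not grouped, as in the link above.
    output_fields=doc_fields + ["docTransits.detailId", "docTransits.dest"],
    grouping_fields=doc_fields + ["docTransits.dest"],
    ordering_fields=["-doc.updateDateTime"],
    filters=[
        ("doc.updateDateTime", '"2023-08-25%"-'),
        ("docTransits.dest", '"SiB-TaxPub"'),
    ],
)
print(url)
```

Changing the two filter predicates is enough to look at another date or another transit destination.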

gsautter commented 1 year ago

The field is called "Data Detail ID" because what it holds depends on the destination. For the SIBiLS export (as well as LOD, OpenBioDiv, etc.) it is the UUID of a treatment; for Zenodo it can also be a figure (caption) UUID; for RefBank it is a bibRef UUID; and for TPS it is a table (caption) UUID. Hence the field name hints at "some detail extracted from the document, smaller than the document proper". The "Data Detail Label" usually gives a pretty good hint as to what exactly got exported, as does the combination of "Transit Source" and "Transit Destination".
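As a quick reference, the mapping described above can be written down as a small lookup table. This is only a sketch of the explanation in this comment: apart from "SiB-TaxPub", which appears in the stats link, the destination keys are guesses at how the destinations are spelled in the "Transit Destination" field, and `detail_kinds` is a hypothetical helper name.

```python
# What a "Data Detail ID" may refer to, per transit destination.
# Keys other than "SiB-TaxPub" are assumed spellings, not verified values.
DETAIL_KINDS = {
    "SiB-TaxPub": ("treatment",),
    "LOD": ("treatment",),
    "OpenBioDiv": ("treatment",),
    "Zenodo": ("treatment", "figure caption"),  # "can also be" a figure caption
    "RefBank": ("bibRef",),
    "TPS": ("table caption",),
}


def detail_kinds(dest):
    """Return the kinds of detail a destination may receive, or () if unknown."""
    return DETAIL_KINDS.get(dest, ())


print(detail_kinds("Zenodo"))
```

For destinations not listed here, the "Data Detail Label" remains the better source of truth.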