Open punkish opened 3 years ago
That discrepancy is simply because the dumps do not include ZooBank stubs, i.e., placeholders that indicate that according to ZooBank, there must be a treatment, but we don't have it (yet), and all we know is the taxon name and the metadata.
@gsautter: I'm currently working to improve ZooBank services, and have been thinking a lot about how useful it would be to get back to coordinating efforts again (the lapse is 100% my fault -- just too distracted by other things!).
I can think of several things that could help:
I can't promise hundreds of hours of time, but if there are things that would help the integration process I can certainly start developing those things.
> That discrepancy is simply because the dumps do not include ZooBank stubs, i.e., placeholders that indicate that according to ZooBank, there must be a treatment, but we don't have it (yet), and all we know is the taxon name and the metadata.
ok, thanks for the explanation.
My suggestion would be to modify the stats js (pubAndTreatmentCounts, pubAndTreatmentCountsWithStubs, materialsCitationCount, materialsCitationGeoRefCount) so the numbers are the same across the board, that is, to leave the placeholders out until you actually have the treatment. Having different numbers is confusing, and a user may start wondering why TB shows a different number from Zenodeo or other websites/apps that might use the XML data.
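To make the suggestion concrete, here is a minimal sketch of reporting the count with and without stubs from the same set of records. I don't know how the stats are actually computed, so the `Treatment` record, its `is_stub` flag, and the `treatment_counts` helper are all hypothetical names used only for illustration.

```python
from dataclasses import dataclass
from typing import Dict, Iterable


@dataclass
class Treatment:
    """Hypothetical record; the real data model is not known to me."""
    taxon_name: str
    is_stub: bool  # assumed flag: True for a ZooBank placeholder without a treatment


def treatment_counts(records: Iterable[Treatment]) -> Dict[str, int]:
    """Count all records, the stubs among them, and the real treatments."""
    total = 0
    stubs = 0
    for record in records:
        total += 1
        if record.is_stub:
            stubs += 1
    return {
        "with_stubs": total,             # every record, placeholders included
        "stubs": stubs,                  # placeholders only
        "without_stubs": total - stubs,  # what the XML dumps would contain
    }


if __name__ == "__main__":
    # Made-up sample records, not real data.
    sample = [
        Treatment("Aus bus", is_stub=False),
        Treatment("Cus dus", is_stub=True),
        Treatment("Eus fus", is_stub=False),
    ]
    print(treatment_counts(sample))
```

If the headline number came from `without_stubs`, it should line up with what anyone gets by counting the treatments in the dumps, and the stub count could still be shown separately.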
cc @myrmoteras
hi @gsautter
last night I downloaded all the archives (full, monthly, weekly, and daily) and expanded them into a directory. I got the following numbers
I inserted the data into my db and got the following numbers
But when I look at the stats on plazi.org, I see Treatments: 715061. What explains that discrepancy of almost 92K treatments? Which number should I get, and which should be reported on plazi.org?
cc @myrmoteras