Closed PipBrewer closed 8 months ago
I'm looking forward to upgrading to the latest version of Specify7 so we can have a lot of these stats handy:
But it's on our radar now so we'll get on to it.
@PipBrewer Here are the statistics you asked for:
NHMD | |
---|---|
Total as of all 2023: | 1.091.257 |
Published as of 2023: | 793.267 |
Total of all time: | 1.116.723 |
Published all Time: | 802.492 |
Added/Updated 2023: | 379.386 |
NHMA | |
---|---|
Total as of all 2023: | 428.871 |
Published as of 2023: | 226.028 |
Total of all time: | 430.896 |
Published all Time: | 228.037 |
Added/Updated 2023: | 7.509 |
AUH | |
---|---|
Total as of all 2023: | 0 |
Published as of 2023: | 0 |
Total of all time: | 0 |
Published all Time: | 0 |
Added/Updated 2023: | 0 |
Also in the attached spreadsheet: DaSSCo Statistics 2023.xlsx
Please review and ask any followup questions you may have.
BTW I got a bit curious about the high number of unpublished records for NHMD, and it turns out that not only are these actively set to "false", but most of them come from vascular plants.
 | Count | Collection name |
---|---|---|
Published False: | 3 | NHMD Vertebrate Paleontology |
Published False: | 49.563 | NHMD Entomology |
Published False: | 7.503 | NHMD Invertebrate Zoology |
Published False: | 88 | NHMD Mammalogy |
Published False: | 253.542 | NHMD Vascular Plants |
Published False: | 1 | NHMD Danekrae |
Published False: | 5 | NHMD Amber |
Is this deliberate and part of a DaSSCo strategy?
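To put "most of these come from vascular plants" into numbers, here is a minimal sketch that computes each collection's share of the unpublished records. The counts are copied from the table above; the names `unpublished` and `share` are only illustrative.

```python
# "Published False" record counts per NHMD collection, copied from the table above.
unpublished = {
    "NHMD Vertebrate Paleontology": 3,
    "NHMD Entomology": 49_563,
    "NHMD Invertebrate Zoology": 7_503,
    "NHMD Mammalogy": 88,
    "NHMD Vascular Plants": 253_542,
    "NHMD Danekrae": 1,
    "NHMD Amber": 5,
}

total_unpublished = sum(unpublished.values())


def share(collection: str) -> float:
    """Fraction of all listed unpublished records that belong to one collection."""
    return unpublished[collection] / total_unpublished


# Print the collections from largest to smallest contribution.
for name, count in sorted(unpublished.items(), key=lambda kv: kv[1], reverse=True):
    print(f"{name}: {count} ({share(name):.1%})")
```

On these figures, Vascular Plants accounts for roughly four fifths of the listed unpublished records and Entomology for most of the rest.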
Regarding the high number of unpublished records for NHMD: 24.560 of them were the type database of Vascular Plants, and they've been published today.
I guess the remaining ones are the "dummy records" reserved for DaSSCo (same for Entomology).
Ooooh Good catch! I will need to redo the statistics then. I forgot all about the dummy records!
Here are the adjusted numbers:
NHMD | |
---|---|
Period | Record_Count |
Total prior to 2024: | 857.005 |
Published prior to 2024: | 817.153 |
Total of all time: | 874.603 |
Published all Time: | 827.142 |
Added/Updated 2023: | 373.590 |
Only NHMD needed to have dummy records subtracted.
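As a rough illustration of what the adjustment does to the published share for NHMD, the sketch below compares the original and adjusted figures quoted in this thread. `parse_count` and `published_share` are hypothetical helpers written for this example; they only handle the dot-as-thousands-separator notation used in the tables.

```python
def parse_count(text: str) -> int:
    """Parse a count written with '.' as thousands separator, e.g. '857.005'."""
    return int(text.replace(".", ""))


def published_share(published: str, total: str) -> float:
    """Fraction of records that are published."""
    return parse_count(published) / parse_count(total)


# Original NHMD figures ("as of 2023"), dummy records still included.
before = published_share("793.267", "1.091.257")
# Adjusted NHMD figures ("prior to 2024"), dummy records subtracted.
after = published_share("817.153", "857.005")

print(f"before adjustment: {before:.1%}")
print(f"after adjustment:  {after:.1%}")
```

The published share moves from roughly 73% to roughly 95%. The two tables may differ in more than the dummy records, so this is best read as an indication rather than an exact measure of their effect.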
Redone spreadsheet: DaSSCo Statistics 2023.xlsx
FYI I used this SQL: DaSSCo Statistics.sql.txt
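For readers without access to the attachment, here is a self-contained sketch of the general kind of query involved: counting total and published records per institution before a cut-off date while leaving out dummy records. It is not the attached SQL and does not use the Specify schema; the `record` table, its columns and the sample rows are all assumptions made up for this example.

```python
import sqlite3

# In-memory toy database; table and column names are hypothetical, not Specify's.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE record (
    id INTEGER PRIMARY KEY,
    institution TEXT NOT NULL,
    created TEXT NOT NULL,       -- ISO date, so string comparison orders correctly
    published INTEGER NOT NULL,  -- 1 = published, 0 = not published
    is_dummy INTEGER NOT NULL    -- 1 = placeholder record to be excluded
);
INSERT INTO record (institution, created, published, is_dummy) VALUES
    ('NHMD', '2022-05-01', 1, 0),
    ('NHMD', '2023-03-15', 0, 0),
    ('NHMD', '2023-06-01', 0, 1),
    ('NHMD', '2024-01-10', 1, 0),
    ('NHMA', '2023-02-02', 1, 0);
""")

# Count total and published records per institution, ignoring dummy records
# and anything created on or after the cut-off date.
QUERY = """
SELECT institution,
       COUNT(*) AS total,
       SUM(published) AS published
FROM record
WHERE is_dummy = 0 AND created < ?
GROUP BY institution
ORDER BY institution
"""

rows = conn.execute(QUERY, ("2024-01-01",)).fetchall()
for institution, total, published in rows:
    print(institution, total, published)
```

Passing the cut-off date as a parameter makes it easy to rerun the same count for a different reporting date.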
@FedorSteeman Thank you for this. I'm sorry that I didn't ask this originally. Do you have the total number of published specimens for all Danish institutions prior to 2024?
As far as I know, we only publish the occurrences of NHMA and NHMD, not the others. So the total number of published specimens for the rest of the Danish institutions prior to 2024 is 0.
The numbers for the smaller institutions (also added to the spreadsheet: DaSSCo.Statistics.2023_all.xlsx):
MSJN | |
---|---|
Total as of all 2023: | 5.339 |
Published as of 2023: | 0 |
Total of all time: | 5.367 |
Published all Time: | 0 |
Added/Updated 2023: | 312 |
MUSERUM | |
---|---|
Total as of all 2023: | 23.091 |
Published as of 2023: | 0 |
Total of all time: | 23.116 |
Published all Time: | 0 |
Added/Updated 2023: | 5.847 |
Naturama | |
---|---|
Total as of all 2023: | 10.543 |
Published as of 2023: | 0 |
Total of all time: | 10.543 |
Published all Time: | 0 |
Added/Updated 2023: | 0 |
OESM | |
---|---|
Total as of all 2023: | 12.787 |
Published as of 2023: | 0 |
Total of all time: | 12.851 |
Published all Time: | 0 |
Added/Updated 2023: | 1.826 |
FIMUS | |
---|---|
Total as of all 2023: | 2.137 |
Published as of 2023: | 0 |
Total of all time: | 2.139 |
Published all Time: | 0 |
Added/Updated 2023: | 190 |
Hi @PipBrewer ! The only institutions publishing to GBIF currently are NHMD and NHMA so you can just add those numbers up.
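Adding those up can be sketched as follows, assuming the adjusted NHMD figure ("Published prior to 2024") and the NHMA figure ("Published as of 2023") quoted earlier in this thread are the ones to use. The variable names are only illustrative.

```python
# Published record counts for the two institutions that publish to GBIF,
# taken from the tables earlier in this thread.
published_prior_to_2024 = {
    "NHMD": 817_153,  # adjusted figure, dummy records excluded
    "NHMA": 226_028,  # "Published as of 2023"
}

total_published = sum(published_prior_to_2024.values())

# Format with '.' as thousands separator, matching the tables in this thread.
print(f"{total_published:,}".replace(",", "."))  # prints 1.043.181
```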
Oh the response by @Sosannah wasn't visible when I finally started replying, but thank you, Zsuzsanna!
Presumed done. Will reopen if needed.
Quick question - can you separate out the number of records where NHMD is the publisher from those where NHMD is 'hosting' the records on behalf of other institutions?
@jlegind The records published by Specify are the ones hosted by NHMD. NHMA is the only other institution which has its records published.
@FedorSteeman @Sosannah Thank you for these numbers. After adjusting for dummy records, Fedor quotes NHMD figures as 857,005 in Specify, of which 817,153 are published to GBIF. Zsuzs quotes (in her updated spreadsheet) 1,091,257 in Specify, of which 793,267 are pushed to GBIF. I'm guessing, Zsuzs, that you didn't update the NHMD figures in your spreadsheet, only the numbers for the other institutions, and so I should use Fedor's figures for NHMD and yours for the rest?
Do we have reasons why there are unpublished records for NHMD? Is there a reason why we don't publish records for other institutions?
@PipBrewer We may need to run these numbers through another iteration to make them as accurate as possible and perhaps compare to GBIF.
The reason for unpublished records can vary on a case-by-case basis. It is best to ask the responsible curators. For many records, there may no longer be any reason for keeping them from the public. Typical reasons are waiting for a paper to be published, or the material or its locality being of a sensitive nature.
The reason that other institutions are not publishing is simply that we have not initiated it for them, and the ones that were asked have not been interested in it yet.
Oops, you're right, I added my numbers with the small institutes to the wrong sheet. And also right, I didn't update the NHMD/NHMA figures - you can use Fedor's figures for NHMD/NHMA and mine for the rest. Sorry about that!
But these numbers are growing constantly: since then the 24.560 type specimen records were also published, and Jen is importing new sheets quite regularly. So if the cut-off date is not strictly defined, a new calculation could make sense, as Fedor suggested.
Other reasons for unpublished records at NHMD:
Thanks. I'm mostly interested in figures as of 01/01/2024 for now, so not worried about growing numbers. Cheers!
I need to write the annual report for DaSSCo. Hence, would it be possible to have some statistics from Specify?
I would like to know for NHMD, AU and NHMA (separately):
Ideally, I would like these by the end of February so that I have time to generate a narrative around them (comparing them to previous years and projections) and to have others (such as the DaSSCo Steering Group) check the report before I submit it at the end of March 2024.