NHMDenmark / Mass-Digitizer

Common repo for the DaSSCo team
Apache License 2.0

NHMA taxonomy plan #348

Closed jlegind closed 1 year ago

jlegind commented 1 year ago

Issue

Create a plan for converting the Aarhus taxonomy, which is ID-based, into something that can supplement the existing Digi App taxonomy spine.

This is relevant because certain collections label their containers (shelves, drawers, boxes, trays, etc.) with an abstract code-based taxonomy.

The estimated effort is one to two weeks.

Plan/solution:

The goal is to make it easy for the NHMA digitizers to enter a number rather than a name in the taxonomic section of the app. There should be a small input field next to the 'taxonomic name' input field where the ID ("Sortnr.") can be typed. This field should be visible only to NHMA Entomology collection users. Once the ID has been entered, the corresponding taxonomic name should appear in the 'taxonomic name' field of the UI form. This serves as a quick check on the typed ID.

What is the expected acceptable result?

At a minimum, there should be an option to enter the NHMA taxon ID in the app, which resolves to a taxon name. A column for the abstract taxonomic ID should be added to the taxonname table.
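A minimal sketch of this expected result, assuming an SQLite database accessed through Python's sqlite3 module. Only the table name taxonname comes from this issue; the column names sortnr and fullname and the toy row are assumptions for illustration.

```python
import sqlite3


def add_sortnr_column(conn: sqlite3.Connection) -> None:
    """Add the NHMA ID column to taxonname if it is not there yet."""
    # "sortnr" is a hypothetical column name for the abstract taxonomic ID.
    existing = [row[1] for row in conn.execute("PRAGMA table_info(taxonname)")]
    if "sortnr" not in existing:
        conn.execute("ALTER TABLE taxonname ADD COLUMN sortnr INTEGER")


def resolve_sortnr(conn: sqlite3.Connection, sortnr: int):
    """Return the taxon name for an NHMA ID, or None if the ID is unknown."""
    row = conn.execute(
        "SELECT fullname FROM taxonname WHERE sortnr = ?", (sortnr,)
    ).fetchone()
    return row[0] if row else None


# Toy data: the real taxonname table has more columns than shown here.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE taxonname (spid INTEGER PRIMARY KEY, fullname TEXT)")
conn.execute("INSERT INTO taxonname VALUES (1, 'Nycteola degenerana')")
add_sortnr_column(conn)
conn.execute("UPDATE taxonname SET sortnr = 4404 WHERE spid = 1")
```

Returning None for an unknown ID lets the UI leave the name field empty instead of showing a wrong taxon.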

jlegind commented 1 year ago

This issue awaits a chat with Ole Karsholt about some of the weird columns.

PipBrewer commented 1 year ago

Additional info from Thomas Simonsen (NHMA) about the info on the front of pinned insect drawers

SV_ Taxon numbers.pdf, KArsholt Nielsen 2013.pdf

jlegind commented 1 year ago

My suggested solution: if the user is logged in under NHMA, the Aarhus entomology taxonomy becomes active and the taxonomy input field expects numbers.
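The suggestion could be sketched roughly as follows. The function names and the way institution and collection are passed in are hypothetical and do not reflect how the Digi App actually stores session state.

```python
def expects_sortnr(institution: str, collection: str) -> bool:
    """True when the taxonomy field should accept an NHMA ID ("Sortnr.")."""
    # Assumption: the login session exposes institution and collection as strings.
    return institution == "NHMA" and collection == "Entomology"


def parse_taxon_input(institution: str, collection: str, text: str):
    """Return ("sortnr", int) for NHMA Entomology users, else ("name", str)."""
    text = text.strip()
    if expects_sortnr(institution, collection):
        # Only plain ASCII digits are accepted as an ID.
        if not (text.isascii() and text.isdigit()):
            raise ValueError(f"Expected a numeric taxon ID, got {text!r}")
        return ("sortnr", int(text))
    return ("name", text)
```

Rejecting non-numeric input early keeps a mistyped ID from being sent to the name lookup.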

jlegind commented 1 year ago

The transformed NHMA spreadsheet produces 3728 records similar to this:

{'sortnr': 4404, 'superfamily': 'Noctuoidea', 'family': 'Nolidae', 'genus': 'Nycteola', 'species': 'degenerana'}

jlegind commented 1 year ago

The Aarhus taxonomy has been turned into proper records, but each record still needs a lookup in the DaSSCo taxonomy table to obtain its spid. Coming soon...
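One way such a lookup might work, as a sketch only: it assumes that the DaSSCo taxonomy table stores the binomial in a column called fullname and that an exact match on "Genus species" is enough. Both are assumptions, and the spid value 17 is made up for the example.

```python
import sqlite3


def find_spid(conn: sqlite3.Connection, record: dict):
    """Return the spid matching the record's binomial, or None if unmatched."""
    # Assumption: names are matched exactly on "Genus species".
    name = f"{record['genus']} {record['species']}"
    row = conn.execute(
        "SELECT spid FROM taxonname WHERE fullname = ?", (name,)
    ).fetchone()
    return row[0] if row else None


# Toy stand-in for the DaSSCo taxonomy table.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE taxonname (spid INTEGER PRIMARY KEY, fullname TEXT)")
conn.execute("INSERT INTO taxonname VALUES (17, 'Nycteola degenerana')")

record = {'sortnr': 4404, 'superfamily': 'Noctuoidea', 'family': 'Nolidae',
          'genus': 'Nycteola', 'species': 'degenerana'}
spid = find_spid(conn, record)
```

Records that come back with None would need manual review, since the name may be spelled differently or missing in the spine.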

jlegind commented 1 year ago

The processed alternative taxonomy records have been verified against the original spreadsheet. There is now also a table in the SQLite DB that contains the full taxonomic record as well as the spid, which we will need to look up the name itself in the DaSSCo taxonomy table (taxonname).
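As an illustration of how the two tables could be used together, here is a hedged sketch. The table name nhma_taxonomy, its layout, the fullname column, and the spid value are all hypothetical. Only taxonname, spid, and the record fields come from this thread.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE taxonname (spid INTEGER PRIMARY KEY, fullname TEXT);
    -- Hypothetical name and layout for the new NHMA table.
    CREATE TABLE nhma_taxonomy (
        sortnr INTEGER PRIMARY KEY,
        superfamily TEXT, family TEXT, genus TEXT, species TEXT,
        spid INTEGER REFERENCES taxonname (spid)
    );
    INSERT INTO taxonname VALUES (17, 'Nycteola degenerana');
    INSERT INTO nhma_taxonomy VALUES
        (4404, 'Noctuoidea', 'Nolidae', 'Nycteola', 'degenerana', 17);
""")


def name_for_sortnr(conn: sqlite3.Connection, sortnr: int):
    """Resolve an NHMA ID to the DaSSCo taxon name via the stored spid."""
    row = conn.execute(
        """
        SELECT t.fullname
        FROM nhma_taxonomy AS n
        JOIN taxonname AS t ON t.spid = n.spid
        WHERE n.sortnr = ?
        """,
        (sortnr,),
    ).fetchone()
    return row[0] if row else None
```

Keeping the NHMA records in their own table leaves the taxonomy spine untouched, at the cost of one join per lookup.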

jlegind commented 12 months ago

Code for extracting the NHMA taxonomy from the Excel file into a table format: https://github.com/jlegind/code/blob/master/nhma_taxonomy.py

jlegind commented 11 months ago

Please be aware that the "Aarhus_Dk_lepidoptera2013.xlsx" file and its derivatives are not reliable. See issue #407