NHMDenmark / Herbarium-Sheets-workstation

Workstation and workflows for herbarium sheets for mass digitisation (DaSSCo)

Update Mass Digitization Guide to include protocol for novel taxon determinations #85

Closed chelseagraham closed 5 months ago

chelseagraham commented 8 months ago

In response to https://github.com/NHMDenmark/Mass-Digitizer/issues/484#issuecomment-1956430702 it has been decided that when entering a new taxon, it is necessary to enter the author information in the notes field in a specific manner:

Taxon name_Author, year

For example, Ancylis paludana_Barrett, 1871
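As an illustration of the requested format, here is a minimal sketch of how a notes entry could be checked and split into its parts. The pattern and the function name `parse_new_taxon_note` are assumptions made for this example; they are not taken from the Mass Digitizer code.

```python
import re

# Hypothetical pattern for "Taxon name_Author, year" (an assumption, not the app's own rule).
NOTES_PATTERN = re.compile(r"^(?P<taxon>[^_]+)_(?P<author>.+), (?P<year>\d{4})$")


def parse_new_taxon_note(note):
    """Return (taxon, author, year) if the note follows the full format, else None."""
    match = NOTES_PATTERN.match(note.strip())
    if match is None:
        return None
    return match.group("taxon"), match.group("author"), int(match.group("year"))


print(parse_new_taxon_note("Ancylis paludana_Barrett, 1871"))
```

Under this sketch an entry with only an author or only a year returns `None`, so partial entries would need separate handling.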

Please incorporate this protocol into the Mass Digitization Guide

nms419 commented 8 months ago

What is the protocol if author and year are not written on the label/folder? Does the digitiser need to find the author name using a specific database, or can the digitiser just forgo it? If there's only partial data, can the digitiser write only "_Author" or ", year"?

nms419 commented 8 months ago

Currently the protocol has been included like this: [image]

chelseagraham commented 8 months ago

> What is the protocol if author and year are not written on the label/folder? Does the digitiser need to find the author name using a specific database, or can the digitiser just forgo it? If there's only partial data, can the digitiser write only "_Author" or ", year"?

@bhsi-snm @PipBrewer is partial data helpful, or should this be skipped unless all of it is present? I don't think looking the information up in a database is appropriate at this stage of digitization, but partial data could be helpful for filling in this information at a later stage.

PipBrewer commented 8 months ago

If there is only one taxon that matches then I think you can assume that whatever it says for author is correct. However, this does need to be raised at the monthly meetings with curators/collection managers (can you add that to the agenda @chelseagraham?)

If there are multiple options, I would guess that you add an underscore after the taxon name? What do you think @bhsi-snm? @AstridBVW needs to know to look out for these things. I guess she (or whoever is helping with data / checking spreadsheets) would need to follow up prior to import.

chelseagraham commented 7 months ago

@PipBrewer I can add this for the next meeting, but I am not sure why we need input from CMs about this. I think I don't understand it entirely.

If the folder has only the author without a year, or the year without an author, the digitizers are wondering whether they should write "taxon name_author" or "taxon name_year", whether this information will be readable by Bhupjit, since it doesn't have the composition Bhupjit requested ("word word_word, number"), and whether this partial information is even helpful. Maybe let's chat about it. I don't understand the taxon name you mention above.

PipBrewer commented 7 months ago

@chelseagraham Whatever goes after the underscore goes in the taxon author/year column in the SQLite database in its entirety. You can see this when testing and playing around with the app - I can show you. If it is clear that it is the correct taxon, for now just select the relevant one when digitising even if it differs slightly (e.g., only the author name is present instead of author and year). If it may be a different taxon, enter it as on the folder. I have chatted with @RebekkaML and we need to chat to curators and collection managers about how specifically they want the author/year to be written, and then figure out a way to manage that without it eating into digitisation time. If we enter it exactly and it differs from what is in Specify, it will create a new taxon record (a duplicate) and things start to get messy. I think this is one for me and the data team to figure out a way forward on. FYI @bhsi-snm
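The behaviour described above (everything after the underscore is stored as-is) can be sketched roughly as follows. This is an illustration only, not the app's actual code; `split_taxon_entry` is a hypothetical name.

```python
def split_taxon_entry(entry):
    """Split an entry at the first underscore.

    Returns (taxon_name, author_year). Everything after the underscore is kept
    in its entirety; author_year is None when the entry has no underscore.
    """
    taxon, sep, author_year = entry.partition("_")
    return taxon.strip(), (author_year.strip() if sep else None)


print(split_taxon_entry("Ancylis paludana_Barrett, 1871"))
print(split_taxon_entry("Ancylis paludana_Barrett"))
```

In this sketch a partial entry such as "taxon name_author" is not rejected; the text after the underscore is simply passed through unchanged.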

RebekkaML commented 6 months ago

This issue is related to #90 and will be updated once that is resolved.

RebekkaML commented 5 months ago

#90 was resolved. The solution for discrepancies between author names in the dropdown menu vs. on the folder is this:

Same author, but different spelling / abbreviation
Example: Sieb. / Siebold
Solution: we take the spelling that is already in Specify/GBIF.

No author name on the folder
Example: L. / ---
Solution: Leave it blank. If a record without author name already exists in Specify, we pick that. Otherwise we create a new record, this then gets flagged for checking.

Partially different author names
Example 1: Lej. / Lej. & Court
Example 2: L. / (L.) L.
Solution: Create a new record with what is on the folder, this then gets flagged for checking.

Completely different author names
Example: L. / Crantz
Solution: Create a new record with what is on the folder, this then gets flagged for checking.

This can now be added to the guide.
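For the guide, the four cases could be summarised as a small decision function. This is a sketch under assumptions: the digitiser (not the code) decides which case applies, and the names `Discrepancy` and `resolve_author` are hypothetical, not part of the Mass Digitizer app or Specify.

```python
from enum import Enum


class Discrepancy(Enum):
    SPELLING = "same author, different spelling or abbreviation"
    MISSING_ON_FOLDER = "no author name on the folder"
    PARTIAL = "partially different author names"
    DIFFERENT = "completely different author names"


def resolve_author(discrepancy, folder_author, specify_author, blank_record_exists=False):
    """Return (author_to_record, new_record_flagged_for_checking)."""
    if discrepancy is Discrepancy.SPELLING:
        # Take the spelling that is already in Specify/GBIF.
        return specify_author, False
    if discrepancy is Discrepancy.MISSING_ON_FOLDER:
        # Leave it blank; a new flagged record is only needed if none exists yet.
        return "", not blank_record_exists
    # Partially or completely different: record what is on the folder and flag it.
    return folder_author, True


print(resolve_author(Discrepancy.SPELLING, folder_author="Siebold", specify_author="Sieb."))
```

The examples are read here as "dropdown / folder", following the wording of the comment above, so "Sieb. / Siebold" resolves to the dropdown spelling "Sieb.".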

RebekkaML commented 5 months ago

I created an Issue for the next quarterly guide update (https://github.com/NHMDenmark/Mass-Digitizer/issues/515) and linked this issue there. This thread can be closed now.