NHMDenmark / Mass-Digitizer

Common repo for the DaSSCo team
Apache License 2.0

New taxon name that includes author name prevents rank determination. #484

Closed by FedorSteeman 4 months ago

FedorSteeman commented 4 months ago

If the full name of a novel taxon (i.e. one unknown to the taxon spine) is entered with any author names included, this confuses the code that tries to guess the taxon rank. Leaving out the author name, however, makes it harder to identify the taxon in question precisely. So there should be some way to include the author name without causing problems for rank guessing.
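To illustrate the problem, here is a minimal sketch that assumes rank is guessed from the number of whitespace-separated tokens in the name. The function `guess_rank_by_token_count` and its rank table are hypothetical and are not taken from the Mass-Digitizer code base; the actual heuristic may differ.

```python
def guess_rank_by_token_count(name: str) -> str:
    """Hypothetical heuristic: infer the rank from the token count alone."""
    # Assumed mapping, for illustration only.
    ranks = {1: "genus or higher", 2: "species", 3: "subspecies"}
    return ranks.get(len(name.split()), "unknown")


# Without authorship the token count matches the rank.
print(guess_rank_by_token_count("Ancylis paludana"))                # species
# A single author token makes a species look like a subspecies.
print(guess_rank_by_token_count("Ancylis paludana Barrett"))        # subspecies
# Author plus year pushes the count past anything the table knows.
print(guess_rank_by_token_count("Ancylis paludana Barrett, 1871"))  # unknown
```

Under this assumption, any authorship string shifts the token count by an unpredictable amount, so the guess cannot be trusted once author names are present.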

FedorSteeman commented 4 months ago

I cannot think of any clever solution that would not require special action on the part of the digitizer... This is because the author field is quite unpredictable in its number of tokens and its punctuation.

There are the following options:

  1. Add an extra input field for the author name alone.
  2. Instruct the digitizer to separate the author from the rest of the full taxon name with an underscore.
  3. Simply add a dropdown list from which the digitizer can choose the taxon rank.

The latter is probably the most fail-safe.

@bhsi-snm Please evaluate these suggestions.

bhsi-snm commented 4 months ago

I have discussed it with @chelseagraham. She thinks that underscore + taxon rank dropdown is the best solution, same as you, but I think for now we can live with the underscore alone because we can easily incorporate that (only the _ before the author name). It is the easiest to implement and would be a minor change in the digitiser's workflow.

FedorSteeman commented 4 months ago

Fixed with latest commit.

@bhsi-snm @chelseagraham Make sure to instruct digitisers that if they add the authorship to a novel/unknown taxon name, it must be separated from the name by an underscore, and an underscore only. Do not add spaces around the underscore character; it is supposed to be a replacement for a space.

Also note that once the novel name is parsed, the underscore is removed and replaced by a space character.
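The convention described above can be sketched roughly as follows. This is only an illustration under stated assumptions: the helper names `split_novel_taxon` and `stored_full_name` are hypothetical, and the actual commit may parse the entry differently.

```python
from __future__ import annotations


def split_novel_taxon(entry: str) -> tuple[str, str]:
    """Split 'Name_Authorship' at the first underscore (hypothetical helper).

    Returns (name, authorship); authorship is '' when no underscore is present.
    """
    name, _sep, authorship = entry.strip().partition("_")
    return name.strip(), authorship.strip()


def stored_full_name(entry: str) -> str:
    """Rebuild the full name with the underscore replaced by a single space."""
    name, authorship = split_novel_taxon(entry)
    return f"{name} {authorship}".strip()


name, authorship = split_novel_taxon("Ancylis paludana_Barrett, 1871")
print(name)        # Ancylis paludana
print(authorship)  # Barrett, 1871
print(stored_full_name("Ancylis paludana_Barrett, 1871"))  # Ancylis paludana Barrett, 1871
```

With the authorship split off, rank guessing can work on the name part alone, whatever tokens or punctuation the authorship contains.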

chelseagraham commented 4 months ago

Just to confirm, this information should be written as Taxon name_Author, year? For example, "Ancylis paludana_Barrett, 1871"?

bhsi-snm commented 4 months ago

@FedorSteeman can you please give a couple of examples to show explicitly how the underscore is supposed to be used?

FedorSteeman commented 4 months ago

@bhsi-snm Exactly as @chelseagraham surmised.

chelseagraham commented 4 months ago

Just as an update: I have communicated to the digitizers the protocol for when they encounter a new taxon. They informed me that the folders often do not include the year, only the author.
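Since the year may be missing, any code that reads the part after the underscore would have to treat the year as optional. A minimal sketch, assuming the authorship is either "Author" or "Author, year" with a four-digit year at the end; the pattern `AUTHORSHIP` and the helper `split_authorship` are hypothetical and not part of the Mass-Digitizer code.

```python
from __future__ import annotations

import re

# Author text, optionally followed by a comma and/or whitespace and a 4-digit year.
# Parenthesised authorships such as "(Linnaeus, 1758)" are not split by this pattern.
AUTHORSHIP = re.compile(r"^(?P<author>.*?)(?:,?\s*(?P<year>\d{4}))?$")


def split_authorship(authorship: str) -> tuple[str, str | None]:
    """Return (author, year); year is None when the folder gives no year."""
    match = AUTHORSHIP.match(authorship.strip())
    return match.group("author").strip(), match.group("year")


print(split_authorship("Barrett, 1871"))  # ('Barrett', '1871')
print(split_authorship("Barrett"))        # ('Barrett', None)
```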