NHMDenmark / Mass-Digitizer

Common repo for the DaSSCo team
Apache License 2.0

Digitizing forma names #452

Closed jlegind closed 2 months ago

jlegind commented 8 months ago

What is the issue ?

A Vascular plants export came in with forma names stated in the remarks (notes) field. An existing name was entered and the forma distinction was added in notes.

Detailed description of the issue.

As interpreted by JKL: I think the digitizer became confused by the [f. name] appearing after the genus and species, and then decided to enter the f. name in notes, which is not desirable. When a known name is entered, the taxonspid column is set with an identifier, and this trips up post-processing so that the new-forma-flag is not set. Affected file: NHMD_Herba_20231108_16_16_SS.csv

The notes field also has other information that might be unsuitable for this field, such as "ikke TBU-reg."
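For anyone checking an export before import, here is a minimal sketch of how such rows could be spotted. It is not the app's post-processing code. Only the `taxonspid` column name comes from the description above; the `notes` and `taxonfullname` column names, the function name, and the sample identifiers are assumptions made up for illustration.

```python
import csv
import io
import re

# Matches a forma abbreviation followed by an epithet,
# e.g. "f. tubiflorum" or "f.flosculosum".
FORMA_RE = re.compile(r"\bf\.\s*([a-z][a-z-]+)")


def find_forma_in_notes(rows, notes_col="notes", id_col="taxonspid"):
    """Yield (row, epithet) for rows whose notes hold a forma name
    even though a known taxon identifier is already set."""
    for row in rows:
        match = FORMA_RE.search(row.get(notes_col) or "")
        if match and (row.get(id_col) or "").strip():
            yield row, match.group(1)


# Made-up sample rows; the column names other than taxonspid and the
# identifier values are hypothetical.
sample = io.StringIO(
    "taxonfullname,taxonspid,notes\n"
    "Chrysanthemum leucanthemum,12345,f.tubiflorum\n"
    "Chrysanthemum segetum,,\n"
    "Chrysanthemum parthenium,67890,ikke TBU-reg.\n"
)

flagged = [
    (row["taxonfullname"], epithet)
    for row, epithet in find_forma_in_notes(csv.DictReader(sample))
]
print(flagged)
```

Rows flagged this way would still need a human decision, since the pattern cannot tell whether the forma in notes belongs to the name that was entered.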

Why is it needed/relevant ?

It is far too involved to fix this issue in post-processing.

Estimate level of effort required.

Easy

What is the expected acceptable result.

Taxonomic names that comply with the app design and workflow. A forma name not appearing in the auto-suggest box should be typed in as it is on the label.

PipBrewer commented 8 months ago

@chelseagraham Chelsea - can you look at this spreadsheet, resolve the errors/issues and report back? I've already had a look at it, so feel free to come and chat to me about it.

jlegind commented 8 months ago

I would urge the digitizer team to not include test data in the export. If you have the export locally, it is this one: NHMD_PinnedInsects_20231024_16_57_SS.csv

chelseagraham commented 8 months ago

Looked at exported SQL file NHMD_Herba_20231108_16_16_SS.csv

Editing undertaken and saved in the file NHMD_Herba_20231108_16_16_SS_inprogress.csv

Deleted TBU comments. (The digitizer entered this information for folders that are only identified to genus level, although the specimens have been determined to both genus and species level. Since this information is captured on the specimen labels, I deleted it from the notes field.)

Edited one instance of 'label obscured' (invalid value) to 'label obstructed' (controlled vocabulary)
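A rough sketch of how remarks could be checked against the controlled vocabulary before import. The vocabulary set here contains only the two terms mentioned in this comment and is assumed to be incomplete; the function and variable names are hypothetical.

```python
# Only terms that appear in this thread; the real controlled
# vocabulary is assumed to be longer.
CONTROLLED_REMARKS = {"label obstructed", "specimen obstructed"}

# Known invalid value and its replacement, taken from the edit above.
CORRECTIONS = {"label obscured": "label obstructed"}


def check_remark(remark):
    """Return (normalized_remark, is_valid) for a single remark."""
    cleaned = remark.strip().lower()
    cleaned = CORRECTIONS.get(cleaned, cleaned)
    return cleaned, cleaned in CONTROLLED_REMARKS


print(check_remark("label obscured"))
print(check_remark("specimen and label obstructed"))
```

Combined remarks are reported as invalid rather than split automatically, so they stay on the manual-editing list.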

The following entries need manual editing:

specimen and label obstructed = specimen obstructed; label obstructed (as two remarks in Specify) for the following entry 942729

f.tubiflorum = Chrysanthemum leucanthemum f. tubiflorum for the following entries 942700 942701 942702 (accepted synonym Chrysanthemum leucanthemum f. tubiflorum via Plants of the World Online https://powo.science.kew.org/taxon/urn:lsid:ipni.org:names:77180150-1)

Check with Collections Manager for the following entries before manual editing:

f. flosculosum = Chrysanthemum parthenium f. flosculosum for the following entries 942948 942949 942950 942951 942952 942953 942954 (accepted synonym Chrysanthemum parthenium var. flosculosum via Plants of the World Online https://powo.science.kew.org/taxon/urn:lsid:ipni.org:names:252460-1)

Images for the following entries need checking before manual editing:

f.flosculosum = Chrysanthemum segetum f. flosculosum for the following entries 942955 942956 942957 942958 942959 942960 942961 942962 942963 942964 942965 942966 942967 942968 942969 942970 942971 (not seeing this synonym on Plants of the World Online https://powo.science.kew.org/taxon/urn:lsid:ipni.org:names:209007-1)

RebekkaML commented 5 months ago

To me it looks like the Chrysanthemum segetum f. flosculosum (942955-942971) is a mistake caused by not deleting "f. flosculosum" in the notes field. This would explain why it doesn't exist in Plants of the World Online. But it is difficult to confirm this from the images alone, since the correct forma etc. is often not given on the labels but only on the folder, which we don't image.

RebekkaML commented 2 months ago

The issues were fixed manually and the sheet imported into Specify.