gbif / vocabulary

A simple registry of controlled vocabularies used for terms found in GBIF mediated data.
Apache License 2.0

NomenclaturalStatus - curation before uploading first vocabulary version #81

Open ManonGros opened 3 years ago

ManonGros commented 3 years ago

Here is a file to edit: https://drive.google.com/file/d/1i5i8apmLtIHiBztDltKoNfRXQc7SbrJz/view?usp=sharing

It contains:

Please check instructions here: https://github.com/gbif/vocabulary/issues/70

CecSve commented 2 years ago

@timrobertson100 or @ManonGros does either of you know where we got the concepts from on this one?

timrobertson100 commented 2 years ago

From the original Java enumeration

 * @see <a href="http://dev.e-taxonomy.eu/trac/wiki/NomenclaturalStatus">EDIT CDM</a>
 * @see <a href="http://wiki.tdwg.org/twiki/bin/view/UBIF/LinneanCoreNomenclaturalStatus">TDWG LinneanCoreNomenclaturalStatus</a>
 * @see <a href="http://www.biologybrowser.org/nomglos">Nomenclatural Glossary for Zoology</a>
 * @see <a href="http://www.northernontarioflora.ca/definitions.cfm">Northern Ontario plant database</a>
 * @see <a href="http://rs.gbif.org/vocabulary/gbif/nomenclatural_status.xml">rs.gbif.org vocabulary</a>
 * @see <a href="http://darwin.eeb.uconn.edu/systsem/table.html">Nomenclatural equivalences</a>

Please do bear in mind that it was created for the checklistbank (i.e. was meant for taxonomic, regional and other lists of species) rather than for occurrence data, so it may not be an appropriate set of terms, or even be considered useful to try to standardise on occurrences at all. I'd suggest forming an opinion based on the values actually seen on occurrences.

ManonGros commented 2 years ago

https://api.gbif.org/v1/enumeration/basic/NomenclaturalStatus

timrobertson100 commented 2 years ago

Just in case it's not known - that API is a representation of the Java enum. All (ha!) our enums have that.

I had understood the question more as: how did those concepts come about in the first place? The list of reference sites from the enum documentation would be the likeliest sources, with a few additions over time. This one was started circa 2008.
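For anyone who wants to compare the enum against the verbatim values, the endpoint linked above can be read with a few lines of Python. This is only a minimal sketch: it assumes the response body is a JSON array of enum constant names, and the helper names `parse_enum_values` and `fetch_enum_values` are made up for illustration.

```python
import json
from typing import List
from urllib.request import urlopen

# URL taken from the comment above.
ENUM_URL = "https://api.gbif.org/v1/enumeration/basic/NomenclaturalStatus"


def parse_enum_values(payload: str) -> List[str]:
    """Parse a response body, assumed to be a JSON array of strings."""
    values = json.loads(payload)
    if not isinstance(values, list) or not all(isinstance(v, str) for v in values):
        raise ValueError("expected a JSON array of strings")
    return values


def fetch_enum_values(url: str = ENUM_URL, timeout: float = 10.0) -> List[str]:
    """Download the enumeration and return its constant names."""
    with urlopen(url, timeout=timeout) as response:
        return parse_enum_values(response.read().decode("utf-8"))


if __name__ == "__main__":
    # Requires network access; prints one enum constant per line.
    for name in fetch_enum_values():
        print(name)
```

Keeping the parsing separate from the download makes it possible to check the parsing logic against a saved response without calling the API.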

CecSve commented 2 years ago

Yes, it was a question about how those concepts were chosen, because the verbatim values really do not translate into those concepts. That is probably because, as you said, they were made for the checklistbank and not for occurrences. The verbatim values translate more into how the name was assigned or the certainty of the name given.

ahahn-gbif commented 2 years ago

I would also have expected a nomenclatural status with checklist data (relating to the names as such, not to the way that names apply to occurrences). That being the case, the verbatim data mapped to this concept are possibly there due to a misunderstanding of the term, rather than needing to be handled in our interpretation of this concept. For occurrence data, I would expect the concept of identificationQualifier (cf., aff., possibly including s.str.(?)) to be way more relevant.

CecSve commented 2 years ago

@ahahn-gbif I have talked to some people at the museum and they agree that nomenclaturalStatus refers to names and not occurrences.

CecSve commented 2 years ago

From the terms we get from occurrences, it seems that the majority of values come from erroneous mapping. The values would fit better under identificationQualifier, since they relate to the use of the name and not the status of the name.

It is questionable whether the nomenclatural status should be part of the occurrence core at all.
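One rough way to check this against the verbatim values would be a small triage script that flags values that look like identification qualifiers. The sketch below is hypothetical: the pattern list is seeded only with the qualifiers mentioned in this thread (cf., aff., s.str., sensu stricto, sensu lato) plus an assumed "s.l." abbreviation, and the function names are invented for illustration. It is not how GBIF interprets these fields.

```python
import re
from typing import Iterable, List, Tuple

# Assumed patterns, based on the qualifiers named in this thread.
QUALIFIER_PATTERNS = [
    r"cf\.?",
    r"aff\.?",
    r"s\.?\s*str\.?",
    r"s\.?\s*l\.?",
    r"sensu\s+stricto",
    r"sensu\s+lato",
]
_QUALIFIER_RE = re.compile("|".join(QUALIFIER_PATTERNS), re.IGNORECASE)


def looks_like_identification_qualifier(verbatim: str) -> bool:
    """Return True if the whole (trimmed) value matches one of the patterns."""
    return _QUALIFIER_RE.fullmatch(verbatim.strip()) is not None


def triage(values: Iterable[str]) -> Tuple[List[str], List[str]]:
    """Split values into (likely qualifiers, everything else)."""
    flagged: List[str] = []
    other: List[str] = []
    for value in values:
        if looks_like_identification_qualifier(value):
            flagged.append(value)
        else:
            other.append(value)
    return flagged, other
```

Calling `triage` on the distinct verbatim nomenclaturalStatus values would give a first estimate of how many are likely mis-mapped; the "everything else" list would still need a manual look.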

CecSve commented 2 years ago

Typing this so we remember the feedback from the nomenclature people:

Sensu stricto and sensu lato (examples of verbatim values) etc. would generally not be used in a checklist setting either. Sensu stricto (in the strict sense) is used for subgenera, to denote the subgenus that is named after the genus (there is always one such subgenus), e.g. Carabus carabus. Sensu lato always depends on a context that needs to be defined: the name usage is the broad definition of the species in a given context. It is typically used in publications, and it is meaningless in this field if we do not also get the context of the 'broad sense' understanding of the name.