tdwg / tcs2

The TCS 2 Task Group will turn TCS into a form in which it can be maintained. The new version of TCS will be a vocabulary standard like Darwin Core and Audiovisual Core and will complement these other existing TDWG standards.

property:kindOfName #42

Open nielsklazenga opened 3 years ago

nielsklazenga commented 3 years ago

kindOfName (property)

Label: Kind of name
Definition: The kind of name, e.g. scientific, hybrid formula, informal
Usage notes:
Comments:
Required: No
Repeatable: No
Constraints: Vocabulary

nielsklazenga commented 3 years ago

This had disappeared from the Google Doc somehow, but I have dug it up from the revision history.

I am personally not a fan (which may explain ↑), but a term like this has also been requested from outside the interest group. I think it will be near-impossible to come up with a controlled vocabulary for this term that people will agree on, and I wonder how useful the term will be without one.

afuchs1 commented 3 years ago

An attribute such as this is difficult to get consensus on, but it is useful for building understanding around the name, i.e. how it is constructed, its format, etc. Rather than making it a controlled vocabulary, is there a set of terms we could propose? Over time these tend to become de facto standards, at which point they can become a controlled vocabulary. Putting it out there: we use the following name types, which are then categorised as scientific, cultivar, formula, or hybrid via boolean flags. (These are specifically for the ICN; other domains may have additional name types.)

"name type","scientific","cultivar","formula","hybrid"
"phrase name","true","false","false","false"
"sanctioned","true","false","false","false"
"scientific","true","false","false","false"
"hybrid formula parents known","true","false","true","true"
"hybrid formula unknown 2nd parent","true","false","true","true"
"intergrade","true","false","true","true"
"named hybrid","true","false","false","true"
"autonym","true","false","false","false"
"hybrid autonym","true","false","false","true"
"named hybrid autonym","true","false","false","true"
"common","false","false","false","false"
"[default]","false","false","false","false"
"informal","false","false","false","false"
"[n/a]","false","false","false","false"
"[unknown]","false","false","false","false"
"vernacular","false","false","false","false"
"cultivar","false","true","false","false"
"graft/chimera","false","true","true","false"
"cultivar hybrid","false","true","false","true"
"cultivar hybrid formula","false","true","true","true"
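
As a rough illustration of how name types with boolean flags could be used to include or exclude categories, here is a minimal Java sketch. It models only a subset of the rows listed above, with the flag values copied from that list. The class, enum, field, and method names (`NameTypeFlagsDemo`, `labelsWhere`, and so on) are assumptions made for this example and do not reflect how the NSL actually stores or queries these values.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Predicate;

final class NameTypeFlagsDemo {

    // A subset of the rows listed above; flag order is scientific, cultivar, formula, hybrid.
    // All identifiers here are illustrative assumptions, not NSL or TCS names.
    enum NameType {
        PHRASE_NAME("phrase name", true, false, false, false),
        SCIENTIFIC("scientific", true, false, false, false),
        HYBRID_FORMULA_PARENTS_KNOWN("hybrid formula parents known", true, false, true, true),
        NAMED_HYBRID("named hybrid", true, false, false, true),
        COMMON("common", false, false, false, false),
        INFORMAL("informal", false, false, false, false),
        CULTIVAR("cultivar", false, true, false, false),
        CULTIVAR_HYBRID_FORMULA("cultivar hybrid formula", false, true, true, true);

        final String label;
        final boolean scientific;
        final boolean cultivar;
        final boolean formula;
        final boolean hybrid;

        NameType(String label, boolean scientific, boolean cultivar,
                 boolean formula, boolean hybrid) {
            this.label = label;
            this.scientific = scientific;
            this.cultivar = cultivar;
            this.formula = formula;
            this.hybrid = hybrid;
        }
    }

    /** Returns the labels of all name types that satisfy the filter, in declaration order. */
    static List<String> labelsWhere(Predicate<NameType> filter) {
        List<String> labels = new ArrayList<>();
        for (NameType type : NameType.values()) {
            if (filter.test(type)) {
                labels.add(type.label);
            }
        }
        return labels;
    }

    public static void main(String[] args) {
        // Include only scientific names that are not hybrids.
        System.out.println(labelsWhere(t -> t.scientific && !t.hybrid));
        // Exclude everything flagged as a cultivar.
        System.out.println(labelsWhere(t -> !t.cultivar));
    }
}
```

Keeping the flags next to each name type means a search filter is just a predicate over the flags, so new name types can be added without touching the filtering code.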

cboelling commented 3 years ago

Are documented (possibly preliminary) definitions available for the different name types mentioned?

mdoering commented 3 years ago

GBIF & COL share a vocabulary for NameType that has some definitions: https://github.com/gbif/name-parser/blob/master/name-parser-api/src/main/java/org/gbif/nameparser/api/NameType.java#L21

  /**
   * A scientific latin name that might contain authorship but is not any of the other name types below (virus, hybrid, cultivar, etc).
   */
  SCIENTIFIC,

  /**
   * A virus name.
   */
  VIRUS,

  /**
   * A hybrid <b>formula</b> (not a hybrid name).
   */
  HYBRID_FORMULA,

  /**
   * A variation of a scientific name that either adds additional notes or has some shortcomings to be classified as
   * regular scientific names. Frequent reasons are:
   * - informal addition like "cf."
   * - indetermined like "Abies spec."
   * - abbreviated genus "A. alba Mill"
   * - manuscript names lacking latin species names, e.g. Verticordia sp.1
   */
  INFORMAL,

  /**
   * Operational Taxonomic Unit.
   * An OTU is a pragmatic definition to group individuals by similarity, equivalent to but not necessarily in line
   * with classical Linnaean taxonomy or modern Evolutionary taxonomy.
   * <p>
   * A OTU usually refers to clusters of organisms, grouped by DNA sequence similarity of a specific taxonomic marker gene.
   * In other words, OTUs are pragmatic proxies for "species" at different taxonomic levels.
   * <p>
   * Sequences can be clustered according to their similarity to one another,
   * and operational taxonomic units are defined based on the similarity threshold (usually 97% similarity) set by the researcher.
   * Typically, OTU's are based on similar 16S rRNA sequences.
   */
  OTU,

  /**
   * A placeholder name like "incertae sedis" or "unknown genus".
   */
  PLACEHOLDER,

  /**
   * Surely not a scientific name of any kind.
   */
  NO_NAME;

deepreef commented 3 years ago

I like the broad categories posted by @mdoering, but I would suggest some modifications. I think SCIENTIFIC, VIRUS, and HYBRID_FORMULA are pretty clean categories as such, but I think there would be some value in splitting up and/or merging INFORMAL and OTU.

In many cases, qualifiers like "cf." are properties of an Identification instance, and are mistakenly included within the scientificName. I think those require a separate category, like QUALIFIED_NAME.

I would argue that examples like "Abies spec." should similarly be treated as a QUALIFIED_NAME, in the sense that the "spec." or "sp." part is not actually part of the name, but is a qualification of the name that doesn't really add any value. In these cases, the actual identified taxon is "Abies" [genus], and "spec." is just a meaningless throw-away qualifier.

I think that abbreviated genus names are either a category of their own, or are really just a form of SCIENTIFIC name. If you want to flag them as separate, then something like AMBIGUOUS_SCIENTIFIC might be a good label for the NameType.

The term "manuscript names" means something other than examples like "Verticordia sp. 1". The former is a term that applies to names that look in all ways like SCIENTIFIC names (e.g., full genus and species), but were never published in accordance with the Code. Zoologists would call these "Unavailable" names, and botanists would call them "invalidly published".

By contrast, I think names like "Verticordia sp. 1" deserve their own NameType. I call them SEMISCIENTIFIC names, because they are multi-part names (like binominals), but only part of the name follows the SCIENTIFIC pattern, while the other part is a non-scientific component that serves as a placeholder for the missing species (or subspecies or variety or whatever) part. I think these should be treated differently from the other INFORMAL names because they are effectively placeholders for names that would otherwise be treated as SCIENTIFIC. They share some similarities with "manuscript names" in that regard, except I think most uses of that term ("manuscript names") apply to well-formed Latin binominals that are not Code-compliant, whereas these are not well-formed Latin binominals.

I'm not sure exactly what sets OTU names apart. Are these things like BOLD BIN identifiers?

In any case, I think the list provided by @afuchs1 serves a somewhat different function, and represents a mixture of nomenclatural and taxonomic categories. I think a more general categorization, similar to that shared by @mdoering (modified as suggested above), is a good start. Perhaps some of the other items on the list from @afuchs1 could then be considered subcategories of the broader categories?
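
To make the proposed distinctions concrete, the following Java sketch classifies name strings into the categories suggested above using a few regular expressions. It is only a heuristic illustration: the patterns, the `OTHER` catch-all, the inclusion of "aff." alongside "cf.", and all class and method names are assumptions made for this example, and real-world names would need far more careful handling.

```java
import java.util.regex.Pattern;

final class NameKindSketch {

    // Labels follow the categories suggested in the comment above;
    // OTHER is added here only as a catch-all for this sketch.
    enum Kind { SCIENTIFIC, QUALIFIED_NAME, AMBIGUOUS_SCIENTIFIC, SEMISCIENTIFIC, OTHER }

    // Rough heuristics chosen for illustration, not a complete grammar.
    // An identification qualifier such as "cf." ("aff." is an added assumption).
    private static final Pattern QUALIFIER =
            Pattern.compile("(^|\\s)(cf|aff)\\.(\\s|$)");
    // Genus followed by a bare "sp.", "spec." or "spp." and nothing else.
    private static final Pattern BARE_SP =
            Pattern.compile("^[A-Z][a-z]+ (sp|spec|spp)\\.?$");
    // Genus, then "sp." or "spec.", then a placeholder tag such as "1".
    private static final Pattern TAGGED_SP =
            Pattern.compile("^[A-Z][a-z]+ (sp|spec)(\\.\\s*|\\s+)\\S.*$");
    // Genus abbreviated to a single capital letter and a dot.
    private static final Pattern ABBREVIATED_GENUS =
            Pattern.compile("^[A-Z]\\. [a-z]+.*$");
    // Capitalised genus, optional lower-case epithets, optional authorship.
    private static final Pattern SCIENTIFIC =
            Pattern.compile("^[A-Z][a-z]+( [a-z]+)*( [A-Z(].*)?$");

    /** Classifies a name string; the order of the checks matters. */
    static Kind classify(String name) {
        String n = name.trim();
        if (QUALIFIER.matcher(n).find() || BARE_SP.matcher(n).matches()) {
            return Kind.QUALIFIED_NAME;
        }
        if (TAGGED_SP.matcher(n).matches()) {
            return Kind.SEMISCIENTIFIC;
        }
        if (ABBREVIATED_GENUS.matcher(n).matches()) {
            return Kind.AMBIGUOUS_SCIENTIFIC;
        }
        if (SCIENTIFIC.matcher(n).matches()) {
            return Kind.SCIENTIFIC;
        }
        return Kind.OTHER;
    }

    public static void main(String[] args) {
        String[] examples = {
            "Abies cf. alba", "Abies spec.", "A. alba Mill",
            "Verticordia sp. 1", "Abies alba Mill.", "incertae sedis"
        };
        for (String example : examples) {
            System.out.println(example + " -> " + classify(example));
        }
    }
}
```

Note that a purely syntactic check like this cannot separate Code-compliant names from "manuscript names" in the sense described above, since both look like well-formed SCIENTIFIC names; that distinction needs nomenclatural information, not string patterns.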

ghwhitbread commented 3 years ago

@afuchs1's list is only used within the NSL to build names; it has no nomenclatural or taxonomic purpose. “sanctioned” will be removed because it is now an ICN nomenclatural status term (bundled into “scientific”), and there are a bunch of types (flags = false,false,false,false) that are only separated for application purposes and could be bundled. “Phrase names” are equivalent to your “semi-scientific” category; the rest should be self-explanatory.

afuchs1 commented 3 years ago

@ghwhitbread True, they are application-based, but they are also used to include or exclude categories of data when searching. Which comes back to the question: what is the purpose of having "kind of name"?

For example, a process I am hoping TNC will support is something along the lines of

nielsklazenga commented 3 years ago

@cboelling, most of the kinds of names mentioned here will be defined in the nomenclatural codes and/or Hawksworth 2017. For the extra kinds of names that @deepreef suggests, if we want to have them in the vocabulary, we will have to come up with definitions.

mdoering commented 3 years ago

The main purpose for me in creating the GBIF name type vocabulary was to be able to parse names and to deal with non-parsable names. The different types mostly define very different syntactic structures.
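
A minimal Java sketch of that idea: the name type decides whether a string is sent to a structured parser or kept verbatim. The constants mirror the GBIF NameType values quoted earlier in the thread, but everything else is an assumption made for illustration, in particular the choice that only SCIENTIFIC names are parsed, the naive whitespace split, and the `NameRouting` and `handle` names. It is not the GBIF name parser's actual logic.

```java
import java.util.EnumSet;
import java.util.Set;

final class NameRouting {

    // Constants mirror the GBIF NameType values quoted earlier in the thread.
    enum Kind { SCIENTIFIC, VIRUS, HYBRID_FORMULA, INFORMAL, OTU, PLACEHOLDER, NO_NAME }

    // Assumption for this sketch: only these kinds go to the structured parser.
    private static final Set<Kind> STRUCTURED = EnumSet.of(Kind.SCIENTIFIC);

    /**
     * Routes a name by its kind: structured kinds are split naively into genus
     * and epithet, all other kinds are passed through unchanged.
     */
    static String handle(Kind kind, String name) {
        if (!STRUCTURED.contains(kind)) {
            return "verbatim:" + name;
        }
        // Naive split on whitespace; anything after the second token is ignored.
        String[] parts = name.trim().split("\\s+", 3);
        String genus = parts[0];
        String epithet = parts.length > 1 ? parts[1] : "";
        return "parsed:genus=" + genus + ",epithet=" + epithet;
    }

    public static void main(String[] args) {
        System.out.println(handle(Kind.SCIENTIFIC, "Abies alba Mill."));
        System.out.println(handle(Kind.HYBRID_FORMULA, "Abies alba x Abies cephalonica"));
        System.out.println(handle(Kind.PLACEHOLDER, "incertae sedis"));
    }
}
```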

> The term "manuscript names" means something other than examples like "Verticordia sp. 1". The former is a term that applies to names that look in all ways like SCIENTIFIC names (e.g., full genus and species), but were never published in accordance with the Code. Zoologists would call these "Unavailable" names, and botanists would call them "invalidly published".

True, and usually you see some nomenclatural note like ined. or ms. at the end of the scientific name. I guess these placeholders (used a lot in molecular-based works) are often the predecessors of proper Latin manuscript names. Australian Herbaria propose to still classify them as informal names: https://www.anbg.gov.au/chah/phrase-names/index.html

The Western Australian Herbarium has a similar broad list of name types, based on Chapman (2000): https://florabase.dpaw.wa.gov.au/help/names

I think it is very useful to agree on a broad list of name types to exchange that information safely.