gbif / pipelines

Pipelines for data processing (GBIF and LivingAtlases)
Apache License 2.0

Add typifiedName #1065

Open timrobertson100 opened 1 month ago

timrobertson100 commented 1 month ago

I've had an approach from a GBIF publisher:

> The CETAF ISTC are interested in using this field for work on typification data. What is the chance of GBIF implementing typifiedName?

This is a likely candidate to pass the next round of DwC review, so I suggest we add it to the next batch of edits we make to the ingestion (in a gbif or dwc namespace, depending on the state of the term at that point).

qgroom commented 1 month ago

This will be really useful to clean up type specimens on GBIF and link them to their nomenclatural details and literature. We have publishable data ready for this field, and I'm sure many other collections have too.

qgroom commented 1 month ago

It will probably come up in the Darwin Core public review, but it might be worth thinking about implementing a data quality warning when typifiedName is populated and typeStatus is empty. A warning might also be possible in the other direction. However, it is quite common that collections know a specimen is a type, but are not certain what name it is a type of.
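
As a rough illustration of the two-way check described above, here is a minimal Python sketch. It assumes a record is a plain mapping keyed by Darwin Core term names; the flag names `TYPIFIED_NAME_WITHOUT_TYPE_STATUS` and `TYPE_STATUS_WITHOUT_TYPIFIED_NAME` are hypothetical and are not existing GBIF issue flags. The reverse direction would probably need to stay a soft warning, for the reason given above.

```python
from typing import List, Mapping, Optional

# Hypothetical flag names for illustration only; not existing GBIF issue flags.
TYPIFIED_NAME_WITHOUT_TYPE_STATUS = "TYPIFIED_NAME_WITHOUT_TYPE_STATUS"
TYPE_STATUS_WITHOUT_TYPIFIED_NAME = "TYPE_STATUS_WITHOUT_TYPIFIED_NAME"


def _is_blank(value: Optional[str]) -> bool:
    """Treat missing, empty, and whitespace-only values as unpopulated."""
    return value is None or not str(value).strip()


def typification_warnings(record: Mapping[str, Optional[str]]) -> List[str]:
    """Return warnings for inconsistent typeStatus / typifiedName pairs."""
    type_status_blank = _is_blank(record.get("typeStatus"))
    typified_name_blank = _is_blank(record.get("typifiedName"))

    warnings: List[str] = []
    if not typified_name_blank and type_status_blank:
        # A typified name is given, but the record does not say what kind of type it is.
        warnings.append(TYPIFIED_NAME_WITHOUT_TYPE_STATUS)
    if not type_status_blank and typified_name_blank:
        # Known to be a type, but the name it typifies is not recorded.
        warnings.append(TYPE_STATUS_WITHOUT_TYPIFIED_NAME)
    return warnings
```

A record with both fields populated, or with both empty, produces no warning under this sketch.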

aguentsch commented 1 month ago

It would be great if typifiedName became a standard for GBIF publication. Our type specimens curated on the JACQ network already have the field (called "type of" here), so we could provide the data for typifiedName immediately. See https://www.jacq.org/detail.php?ID=1053795

Rindiser commented 1 month ago

This would be a nice addition.

CecSve commented 1 month ago

> It will probably come up in the Darwin Core public review, but it might be worth thinking about implementing a data quality warning when typifiedName is populated and typeStatus is empty. A warning might also be possible in the other direction. However, it is quite common that collections know a specimen is a type, but are not certain what name it is a type of.

There will soon be an update to the typeStatus controlled vocabulary https://github.com/gbif/vocabulary/issues/87 - I am mentioning it here if any synergy is expected for interpretation.

qgroom commented 1 month ago

> There will soon be an update to the typeStatus controlled vocabulary https://github.com/gbif/vocabulary/issues/87 - I am mentioning it here if any synergy is expected for interpretation.

Good to know! I don't foresee too much complication except where a specimen is a type for multiple names.

In the long term, it would be nice to see a guide written on best practices for publishing typification data. While it is not that difficult, a guide would encourage standardization.

CecSve commented 1 month ago

It is probably too difficult to do anything with this in the pipelines, but when we worked on the typeStatus vocabulary (and inspired by this comment), I noticed that most of the verbatim values (71%, n = 107140) included the word "of", in most cases referring to the typifiedName, but not always (e.g. "type of type" is also counted).
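
To make the "of" pattern concrete, here is a minimal Python sketch of how a verbatim typeStatus value might be split into a status part and a candidate typified name. The function name `split_verbatim_type_status` and the uppercase-initial heuristic are assumptions for illustration; this is not how the pipelines interpret typeStatus, and real values would need far more careful handling.

```python
import re
from typing import Optional, Tuple

# "<status> of <name>", splitting on the first standalone "of".
_OF_PATTERN = re.compile(
    r"^\s*(?P<status>.+?)\s+of\s+(?P<name>.+?)\s*$", re.IGNORECASE
)


def split_verbatim_type_status(verbatim: str) -> Tuple[str, Optional[str]]:
    """Split a verbatim typeStatus into (status, candidate typified name).

    Returns (stripped verbatim value, None) when no plausible name is found.
    """
    match = _OF_PATTERN.match(verbatim)
    if match is None:
        return verbatim.strip(), None

    name = match.group("name")
    # Assumed heuristic: a scientific name starts with an uppercase letter,
    # which rejects values such as "type of type".
    if not name[0].isupper():
        return verbatim.strip(), None
    return match.group("status"), name
```

For example, `split_verbatim_type_status("Holotype of Quercus robur L.")` yields `("Holotype", "Quercus robur L.")`, while `"type of type"` and a bare `"Paratype"` come back unchanged with no candidate name.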