iobis / obis-issues

Repository for all OBIS related issues and feature requests

Use restricted vocabulary when possible #190

Open BobSimons opened 3 years ago

BobSimons commented 3 years ago

Based on the 2021-05-18 release of the occurrence.csv file, I think that several of the fields would benefit from using a restricted vocabulary or at least a list of preferred terms that would be used whenever possible. For example,

Generating a list of the current values in these fields (with SELECT DISTINCT), and then converting to a restricted vocabulary when possible, seems not too difficult given that the data is in a relational database.
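As a rough illustration of that workflow, here is a minimal sketch using Python's built-in `sqlite3` module. It is not the OBIS database schema: the `occurrence` table layout, the `ALLOWED_COLUMNS` whitelist, the sample rows, and the `PREFERRED` mapping are all hypothetical and only serve to show the SELECT DISTINCT step followed by a conversion to preferred terms.

```python
import sqlite3

# Hypothetical mapping from normalized raw values to preferred terms.
# A real list would have to be agreed on by the community.
PREFERRED = {
    "present": "present",
    "presente": "present",
    "absent": "absent",
    "ausente": "absent",
}

# Hypothetical whitelist: the column name is interpolated into the SQL text,
# so only known column names are accepted.
ALLOWED_COLUMNS = {"occurrenceStatus"}


def distinct_values(conn, column):
    """Return the distinct non-null values currently stored in one column."""
    if column not in ALLOWED_COLUMNS:
        raise ValueError(f"unexpected column: {column}")
    sql = (
        f'SELECT DISTINCT "{column}" FROM occurrence '
        f'WHERE "{column}" IS NOT NULL ORDER BY 1'
    )
    return [row[0] for row in conn.execute(sql)]


def to_preferred(value):
    """Map a raw value to a preferred term, or None if no term is known."""
    return PREFERRED.get(value.strip().lower())


# Small in-memory stand-in for the real table (assumed layout, invented rows).
conn = sqlite3.connect(":memory:")
conn.execute(
    'CREATE TABLE occurrence (id INTEGER PRIMARY KEY, "occurrenceStatus" TEXT)'
)
conn.executemany(
    'INSERT INTO occurrence ("occurrenceStatus") VALUES (?)',
    [("present",), ("Present",), ("PRESENT ",), ("absent",),
     ("Presente",), ("unknown?",), (None,), ("present",)],
)

values = distinct_values(conn, "occurrenceStatus")
mapping = {v: to_preferred(v) for v in values}

# Values without a preferred term are the ones needing manual review.
unmapped = [v for v, term in mapping.items() if term is None]
print(mapping)
print("needs review:", unmapped)
```

Values that cannot be mapped are kept aside rather than guessed, which matches the "when possible" part of the suggestion.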

Best wishes.

ymgan commented 3 years ago

Hi Bob, what you suggested definitely sounds useful.

I just wanted to document some information that I found useful here for future reference.

GitHub issue related to this: https://github.com/gbif/pipelines/issues/268

The list of occurrenceStatus values that GBIF currently interprets into its vocabulary: https://github.com/gbif/parsers/blob/master/src/main/resources/dictionaries/parse/occurrence_status.tsv
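For future reference, a dictionary file like that could be applied roughly as follows. This is only a sketch under assumptions: it assumes a two-column, tab-separated layout (raw value, then vocabulary term), and the rows in `SAMPLE_TSV` are invented for illustration rather than copied from the GBIF file. The helper names `load_dictionary` and `interpret` are hypothetical as well.

```python
import csv
import io

# Invented sample rows in an ASSUMED two-column layout: raw value <TAB> term.
SAMPLE_TSV = (
    "present\tPRESENT\n"
    "presente\tPRESENT\n"
    "absent\tABSENT\n"
    "not found\tABSENT\n"
)


def load_dictionary(text):
    """Build a lookup from normalized raw value to vocabulary term."""
    reader = csv.reader(io.StringIO(text), delimiter="\t")
    return {
        row[0].strip().lower(): row[1].strip()
        for row in reader
        # Skip short rows and comment lines, if any.
        if len(row) >= 2 and not row[0].startswith("#")
    }


def interpret(raw, dictionary):
    """Return the vocabulary term for a raw value, or None if unknown."""
    return dictionary.get(raw.strip().lower())


lookup = load_dictionary(SAMPLE_TSV)
print(interpret("Presente ", lookup))
print(interpret("maybe", lookup))
```

Returning `None` for unknown values keeps uninterpretable entries visible instead of silently forcing them into the vocabulary.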