Open BobSimons opened 3 years ago
Hi Bob, what you suggested definitely sounds useful.
I just wanted to document some information that I found useful here for future reference.
GitHub issue related to this: https://github.com/gbif/pipelines/issues/268
The list of occurrenceStatus values that GBIF currently interprets into a vocabulary: https://github.com/gbif/parsers/blob/master/src/main/resources/dictionaries/parse/occurrence_status.tsv
Based on the 2021-05-18 release of the occurrence.csv file, I think that several of the fields would benefit from using a restricted vocabulary or at least a list of preferred terms that would be used whenever possible. For example,
Generating a list of the current values in these fields (with SELECT DISTINCT), and then converting to a restricted vocabulary when possible, seems not too difficult given that the data is in a relational database.
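As a rough illustration of the SELECT DISTINCT idea, here is a minimal sketch using Python's built-in sqlite3 module. The table name `occurrence`, the sample rows, and the `PREFERRED` mapping are all hypothetical and only stand in for the real database; the preferred terms "present" and "absent" follow the Darwin Core recommendation for occurrenceStatus.

```python
import sqlite3

# Hypothetical mapping from raw values to preferred terms. A real mapping
# would be built by reviewing the actual SELECT DISTINCT output.
PREFERRED = {
    "present": "present",
    "p": "present",
    "absent": "absent",
    "not found": "absent",
}


def distinct_values(conn, table, column):
    """Return the distinct values stored in one column, sorted."""
    # Identifiers cannot be bound as SQL parameters, so they are quoted
    # here. Only pass trusted table and column names.
    sql = f'SELECT DISTINCT "{column}" FROM "{table}" ORDER BY "{column}"'
    return [row[0] for row in conn.execute(sql)]


def normalize(value):
    """Map a raw value to a preferred term, or None if it is unmapped."""
    if value is None:
        return None
    return PREFERRED.get(value.strip().lower())


def build_demo():
    """Create an in-memory table with made-up sample rows."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE occurrence "
        "(id INTEGER PRIMARY KEY, occurrenceStatus TEXT)"
    )
    rows = [("present",), ("Present",), ("P",),
            ("absent",), ("Not found",), ("unknown",)]
    conn.executemany(
        "INSERT INTO occurrence (occurrenceStatus) VALUES (?)", rows
    )
    return conn


if __name__ == "__main__":
    conn = build_demo()
    for raw in distinct_values(conn, "occurrence", "occurrenceStatus"):
        # Unmapped values print as None and would need manual review.
        print(repr(raw), "->", normalize(raw))
```

Values that come back as None are the ones that would need a human decision before they could be folded into a restricted vocabulary.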
Best wishes.