gbif / parsers

Various GBIF parsers for dates, countries, language, taxon ranks, etc
Apache License 2.0

occurrenceStatus parser: REPORTED -> PRESENT #15

Closed: peterdesmet closed this issue 5 years ago

peterdesmet commented 5 years ago

I noticed that the occurrenceStatus parser interprets reported as EXCLUDED. I think that should be PRESENT, because reported doesn't mean erroneous. The definition for excluded is:

Subclass of absent: The organism is reported in some (gray) literature for a certain region, but is erroneous. Reason for exclusion could be a misidentification, an old report, a simple publishing mistake or any other or unknown reason.

Proposed action:

  1. Change parsing for reported to PRESENT
  2. Add the value "reported in error" and map it to EXCLUDED. That way, all 3 alternatives for EXCLUDED listed at http://rs.gbif.org/vocabulary/gbif/occurrence_status.xml are indeed mapped.
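
To make the proposed mapping concrete, here is a minimal sketch of a dictionary-style lookup. It is not the actual parser code from this repository: the class name OccurrenceStatusSketch, the nested Status enum and the parse method are hypothetical, and the dictionary only contains the values discussed in this issue (the other alternatives from the vocabulary are left out).

```java
import java.util.Locale;
import java.util.Map;
import java.util.Optional;

// Hypothetical sketch of the proposed mapping; not the GBIF parsers API.
final class OccurrenceStatusSketch {

    // Stand-in for the vocabulary's status concepts (assumption, reduced set).
    enum Status { PRESENT, ABSENT, EXCLUDED }

    // Illustrative dictionary: only the verbatim values mentioned in this issue.
    private static final Map<String, Status> DICTIONARY = Map.of(
        "present", Status.PRESENT,
        "reported", Status.PRESENT,            // proposed change: was EXCLUDED
        "absent", Status.ABSENT,
        "excluded", Status.EXCLUDED,
        "reported in error", Status.EXCLUDED   // proposed new value
    );

    // Normalizes case and whitespace, then looks the value up.
    // Returns an empty Optional for null or unknown values.
    static Optional<Status> parse(String verbatim) {
        if (verbatim == null) {
            return Optional.empty();
        }
        String key = verbatim.trim()
            .toLowerCase(Locale.ROOT)
            .replaceAll("\\s+", " ");
        return Optional.ofNullable(DICTIONARY.get(key));
    }

    public static void main(String[] args) {
        System.out.println(parse("reported"));
        System.out.println(parse("Reported  in error"));
        System.out.println(parse("something else"));
    }
}
```

The point of the sketch is that "reported" and "reported in error" are separate keys, so a plain report is no longer treated as an erroneous one.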

This updated mapping is important for the unified checklist of alien species in Belgium: https://github.com/trias-project/unified-checklist/issues/37#issuecomment-493006247

peterdesmet commented 5 years ago

Awesome! Thanks Matt. Could you trigger a reprocessing for https://www.gbif.org/dataset/0a2eaf0c-5504-4f48-a47f-c94229029dc8?

MattBlissett commented 5 years ago

Hi Peter, it's not yet deployed (and one hour before home-time before a public holiday in Denmark is not when we do deployments). Check back next week :)

peterdesmet commented 5 years ago

No problem. Will the checklist be reprocessed automatically or does it need to be triggered?

MattBlissett commented 5 years ago

Since the verbatim data hasn't changed, I'll need to manually reprocess anything with one of the affected values.

peterdesmet commented 5 years ago

Hi Matt, checking back. 😄 Can you reprocess this dataset?

timrobertson100 commented 5 years ago

I just initiated a crawl (note that this is a checklist though, and I think the deployment only covered the occurrence pipeline).

mdoering commented 5 years ago

I have released a new version of checklistbank with the changed parsers, but I haven't deployed it yet.

peterdesmet commented 5 years ago

Thanks! Seems resolved for the dataset in question.

  1. https://www.gbif.org/species/157128772/verbatim
  2. Cmd + F for http://marineregions.org/mrgid/26567
  3. See second distribution (with eventDate: 2000 and occurrenceStatus: reported)
  4. See https://api.gbif.org/v1/species/157128772/distributions
  5. Cmd + F for http://marineregions.org/mrgid/26567
  6. That distribution is listed with occurrenceStatus: PRESENT

=> Mapping works

Referencing these distributions would be easier if they had an id. 😄