gbif / checklistbank

GBIF Checklist Bank

degreeOfEstablishment & pathway not exposed in distributions API #294

Open peterdesmet opened 11 months ago

peterdesmet commented 11 months ago

We have just updated a dataset to make use of the terms degreeOfEstablishment and pathway (added to Darwin Core in 2020) as part of the species distribution extension:

✅ IPT allows mapping those fields (thanks to https://rs.gbif.org/extension/gbif/1.0/distribution_2022-02-02.xml): they are included in the Darwin Core Archive
✅ Data are harvested by GBIF
✅ Terms are included in the verbatim API: https://api.gbif.org/v1/species/141117231/verbatim ("http://rs.tdwg.org/dwc/terms/pathway": "aquacultureMariculture" and "http://rs.tdwg.org/dwc/terms/degreeOfEstablishment": "casual")
❌ Terms are included in the distribution API: https://api.gbif.org/v1/species/141117231/distributions (the two terms are missing)

The missing terms are currently a blocker for updating our Global Register of Introduced and Invasive Species - Belgium.

What could be the reason the terms are not included in the API response? Looking through the issues in this repo, I see that the harvesting/processing of the terms seems to have been tackled in gbif/pipelines#581 and gbif/pipelines#648.

When resolved, related portal-feedback issue https://github.com/gbif/portal-feedback/issues/3257 can likely be closed too.

timrobertson100 commented 11 months ago

Moving this to the checklistbank project, where the species datasets are processed.

timrobertson100 commented 11 months ago

> What could be the reason the terms are not included in the API response?

Simply that we haven't yet updated the GBIF checklistbank that drives the species API to use them, as our work is focused more on checklistbank.org, which is its evolution. The distribution extension fields are currently here.

Knowing this is a blocker for you and GRIIS is helpful for priority setting - thanks.

Note that the name is in checklistbank.org as well here but only shows in the verbatim view

peterdesmet commented 11 months ago

I just noticed that occurrenceStatus (present/absent) is not exposed for some records either:

https://api.gbif.org/v1/species/159498294/distributions:

{
  "taxonKey": 159498294,
  "locationId": "ISO_3166-2:BE-VLG",
  "locality": "Flemish Region",
  "country": "BE",
  "temporal": "2018/2018",
  "establishmentMeans": "INTRODUCED"
}

vs https://api.gbif.org/v1/species/159498294/verbatim

{
  "http://rs.tdwg.org/dwc/terms/countryCode": "BE",
  "http://rs.tdwg.org/dwc/terms/locationID": "ISO_3166-2:BE-VLG",
  "http://rs.tdwg.org/dwc/terms/occurrenceStatus": "present",
  "http://rs.tdwg.org/dwc/terms/establishmentMeans": "introduced",
  "http://rs.tdwg.org/dwc/terms/eventDate": "2018/2018",
  "http://rs.tdwg.org/dwc/terms/locality": "Flemish Region"
}
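
The difference between the two records above can be checked mechanically. The following is a minimal sketch, not part of any GBIF tooling: it compares the two records by value (case-insensitively) and lists the verbatim terms whose value does not show up anywhere in the interpreted record. The function name `terms_lost_in_interpretation` is hypothetical, and the value-based matching is a heuristic that assumes interpretation only changes letter case.

```python
# Records copied from the two API responses quoted above.
interpreted = {
    "taxonKey": 159498294,
    "locationId": "ISO_3166-2:BE-VLG",
    "locality": "Flemish Region",
    "country": "BE",
    "temporal": "2018/2018",
    "establishmentMeans": "INTRODUCED",
}

verbatim = {
    "http://rs.tdwg.org/dwc/terms/countryCode": "BE",
    "http://rs.tdwg.org/dwc/terms/locationID": "ISO_3166-2:BE-VLG",
    "http://rs.tdwg.org/dwc/terms/occurrenceStatus": "present",
    "http://rs.tdwg.org/dwc/terms/establishmentMeans": "introduced",
    "http://rs.tdwg.org/dwc/terms/eventDate": "2018/2018",
    "http://rs.tdwg.org/dwc/terms/locality": "Flemish Region",
}


def terms_lost_in_interpretation(verbatim_row, interpreted_row):
    """Hypothetical helper: list verbatim terms whose value is absent
    from the interpreted record.

    Heuristic assumption: interpretation changes at most the letter case
    of a value, so values are compared case-insensitively.
    """
    seen = {str(value).casefold() for value in interpreted_row.values()}
    return [
        term.rsplit("/", 1)[-1]  # keep only the local name of the term URI
        for term, value in verbatim_row.items()
        if str(value).casefold() not in seen
    ]


print(terms_lost_in_interpretation(verbatim, interpreted))
```

For the two records above this reports only `occurrenceStatus`, which matches what is visible by eye.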

peterdesmet commented 11 months ago

We are investigating whether we can get the complete (but uninterpreted) data from the verbatim API for now: https://api.gbif.org/v1/species/159498294/verbatim. Will keep you posted.
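
A rough sketch of what that workaround could look like, using only the Python standard library. The helper names (`fetch_verbatim`, `distribution_rows`, `local_name`) are hypothetical. The layout of the response is an assumption: the sketch expects extension rows under a top-level `extensions` object keyed by row type URI, with the distribution row type ending in `Distribution`. This should be verified against an actual response before relying on it.

```python
import json
from urllib.request import urlopen

# Endpoint pattern taken from the URL quoted above.
API = "https://api.gbif.org/v1/species/{key}/verbatim"


def fetch_verbatim(key, timeout=30):
    """Download the verbatim record for one taxon key (network call)."""
    with urlopen(API.format(key=key), timeout=timeout) as response:
        return json.load(response)


def local_name(term_uri):
    """Reduce a term URI such as .../dwc/terms/pathway to 'pathway'."""
    return term_uri.rstrip("/").rsplit("/", 1)[-1]


def distribution_rows(record):
    """Return the uninterpreted distribution rows of a verbatim record.

    Assumption: extension rows sit under record["extensions"], keyed by
    row type URI, and the distribution row type ends in 'Distribution'.
    """
    rows = []
    for row_type, ext_rows in record.get("extensions", {}).items():
        if local_name(row_type).lower() == "distribution":
            for row in ext_rows:
                rows.append({local_name(term): value for term, value in row.items()})
    return rows
```

Usage would be along the lines of `distribution_rows(fetch_verbatim(159498294))`. The values returned this way are uninterpreted strings (for example `introduced` rather than `INTRODUCED`), so any vocabulary checks are left to the caller.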

damianooldoni commented 11 months ago

About occurrenceStatus: it may be worth noting that this problem was already mentioned in an issue: https://github.com/gbif/gbif-api/issues/94. It appeared to be solved, but it actually isn't for taxa with only one distribution.