gbif / portal-feedback

User feedback for the GBIF API, website and published data. You can ask questions here. 🗨❓

iNaturalist includes a filter to show GBIF data points on its species maps, but this filter is missing for some species or shows GBIF data for the species' entire genus, which can cause misidentifications. #5218

Open bdagley opened 4 months ago

bdagley commented 4 months ago

According to one iNaturalist curator, a retired biochemist (I haven't verified this account, but it seems plausible):

The issue with GBIF resulted from GBIF renumbering many taxa in their database. iNaturalist doesn't have a way to automatically update the taxa that have renumbered, and hence the geolocation data points on iNat maps were misreferenced. Our admins can't address the mismatch because it originates outside our system. There's a way to correct it on the iNat taxon page by manually revising the taxon's GBIF number in the Taxonomy Scheme. It's real busy-work, requiring no real knowledge, so I plugged away at it for a little while. Then I lost interest. I guess nobody said thank you. My solution to the forum atmosphere was simply to stop going there.

This issue is known to the iNaturalist staff from discussions on their website and forum, where it received little to no response.

Note: I originally considered posting this to the iNaturalist repository, but since the quote above suggests it may originate on GBIF, I added it here instead. Still, I'd consider this (like some other issues) a joint GBIF-iNaturalist issue: one that could have been prevented, or could now be fixed, by the two websites communicating about such changes before making them (or anticipating that this would occur).

As for solutions: ideally the entire issue would be fixed, but it's unfeasible for iNaturalist forum curators (on their own, without staff or developer aid) to manually fix each affected taxon page. Regardless, the highest priority would be to fix the maps that show GBIF data points for an entire genus where only a single species' range is supposed to appear. Those in particular would be best to fix if nothing else; or, if nothing can be fixed, the GBIF filters could be removed from those taxon pages (but that would require discussion). Lastly, whether or not this is a true bug, it is at least a bug-like problem, which I think justifies an issue.

MattBlissett commented 4 months ago

Hi,

Thanks for the detailed issue.

I see a few threads on the iNaturalist forum around this, e.g. here. This one lists Schinia species, which are/were often linking to the Schinia genus in GBIF. The identifier for Schinia (genus) has been 4405135 since July 2012, so I think the suggestion in the previous comment in that discussion — that the genus is used as a template when creating a new species — is probably correct.

Sometimes taxon identifiers in the GBIF backbone are changed, but old numbers should not be reused — they are shown as deleted like this and a map requested using this taxon would be blank.
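
As a rough illustration of that check, here is a minimal sketch. It assumes the public species endpoint at https://api.gbif.org/v1/species/{key} returns a JSON record with `rank`, `canonicalName` and, for removed usages, a `deleted` timestamp; those field names should be verified against the API documentation. The function names and the `expected_name` argument are hypothetical and not part of any GBIF or iNaturalist client.

```python
import json
from urllib.request import urlopen

# Assumed endpoint layout for a single name usage in the GBIF API.
API = "https://api.gbif.org/v1/species/{key}"


def fetch_usage(key):
    """Fetch one name usage record (a network call; nothing runs on import)."""
    with urlopen(API.format(key=key), timeout=30) as resp:
        return json.load(resp)


def classify_link(record, expected_name, expected_rank="SPECIES"):
    """Classify a stored GBIF key given the record it resolves to.

    Assumes `deleted`, `rank` and `canonicalName` are the relevant fields.
    """
    if record.get("deleted"):
        return "deleted"        # identifier retired; a map would be blank
    if record.get("rank") != expected_rank:
        return "rank-mismatch"  # e.g. a species page pointing at a genus
    if record.get("canonicalName") != expected_name:
        return "name-mismatch"  # key resolves to a different taxon
    return "ok"
```

Under these assumptions, a species page that stores 4405135 would be flagged as `rank-mismatch`, since that key resolves to the genus Schinia; for example `classify_link(fetch_usage(4405135), "<species name>")`.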

Since I can't reply to the closed discussion:

Any bot or automated process to fix will have to constantly monitor the gbif database (and likely the whole database as it is unlikely any api has access to a change log). That’s a hugely intensive process and one gbif likely would not approve of.

I can’t imagine any way GBIF is going to tolerate a bot or process constantly hammering their site to check for changes or mismatches on the identifiers.

We have no problem with a bot doing this. However, it's only useful when the backbone has been updated, so it can also be run from the export of the backbone: https://hosted-datasets.gbif.org/datasets/backbone/README.html
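
A check run against the export could look roughly like the sketch below. It assumes the export contains a tab-separated taxon file with `taxonID`, `canonicalName` and `taxonRank` columns (check the README above for the actual file names and columns). The `links` list stands in for whatever iNaturalist stores, and the "Examplea" names and 9-digit keys are made up for illustration.

```python
import csv
import io


def load_backbone(tsv_file):
    """Map taxonID -> (canonicalName, RANK) from a tab-separated taxon file."""
    reader = csv.DictReader(tsv_file, delimiter="\t", quoting=csv.QUOTE_NONE)
    return {
        row["taxonID"]: (row["canonicalName"], (row.get("taxonRank") or "").upper())
        for row in reader
    }


def find_stale_links(links, backbone):
    """Yield (local_name, gbif_key, reason) for links that no longer match.

    `links` is an iterable of (local_name, gbif_key) pairs.
    """
    for name, key in links:
        entry = backbone.get(str(key))
        if entry is None:
            yield name, key, "missing from backbone"
        elif entry[0] != name:
            yield name, key, f"points at {entry[0]} ({entry[1]})"


# Tiny in-memory stand-in for the export; only the Schinia row is from this thread.
HEADER = ["taxonID", "canonicalName", "taxonRank"]
ROWS = [["4405135", "Schinia", "genus"], ["900000001", "Examplea prima", "species"]]
SAMPLE = "\n".join("\t".join(r) for r in [HEADER] + ROWS)
```

Because this reads only the downloaded file, it needs no API calls at all and only has to be re-run when a new backbone is published.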

That isn't necessarily the case for iNaturalist's API — as it is data on their side that needs to be updated, I will ask them how that should be done.

bdagley commented 4 months ago

Hi. This plan sounds good, please keep me updated.