Closed jhnwllr closed 1 year ago
Thanks, @jhnwllr - I'm inclined to suggest we do implement something that allows for option 2. @MortenHofft already pondered whether we should just have a flexible scope where any combination of filters was possible. If it is too flexible it becomes difficult to consume (i.e. what rules apply for <this data download>?), but having e.g. a datasetScope and taxonScope (possibly more, like temporalScope, in the future) would be intuitive.
An alternative could be to create rules under the existing functionality using the scope of a dataset. However, those rules would be of the form dataset A + GEOM = "suspicious", which would flag records of any taxon, including genuine in-situ sightings. The motivation there was mainly to handle truly bogus coordinates (e.g. middle-of-the-sea points against terrestrial-only data) where attempts to fix things at source failed. I suspect that is still useful in some cases.
I have changed the API by removing the current single-context approach (i.e. contextKey and contextType are gone) and replacing it with the simpler taxonKey and datasetKey, allowing for either or both to be provided.
Finding rules can then still be done by querying using taxonKey or datasetKey, and taxon-based rules can be additionally scoped by the dataset when creating them.
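To illustrate the change, here is a minimal Python sketch of rules that carry an optional taxonKey and an optional datasetKey. It is not the actual service code: the `Rule` class, the `find_rules` helper, the WKT string and the key values are all hypothetical, and the matching semantics (a supplied key must equal the rule's key) are an assumption.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass(frozen=True)
class Rule:
    """Hypothetical rule shape: either key, or both, may be set."""
    annotation: str
    geometry: str                      # WKT string, not parsed in this sketch
    taxon_key: Optional[int] = None    # integer, not a string
    dataset_key: Optional[str] = None


def find_rules(rules: List[Rule],
               taxon_key: Optional[int] = None,
               dataset_key: Optional[str] = None) -> List[Rule]:
    """Return the rules whose keys equal every key supplied by the caller.

    Assumption: a key left as None places no constraint on the search.
    """
    return [
        r for r in rules
        if (taxon_key is None or r.taxon_key == taxon_key)
        and (dataset_key is None or r.dataset_key == dataset_key)
    ]


# Made-up keys, for illustration only.
SQUARE = "POLYGON((0 0, 10 0, 10 10, 0 10, 0 0))"
rules = [
    Rule("suspicious", SQUARE, taxon_key=1234),
    Rule("suspicious", SQUARE, dataset_key="dataset-a"),
    Rule("suspicious", SQUARE, taxon_key=1234, dataset_key="dataset-a"),
]

by_taxon = find_rules(rules, taxon_key=1234)
by_both = find_rules(rules, taxon_key=1234, dataset_key="dataset-a")
```

In this sketch, querying by taxon_key alone also returns the taxon rule that is additionally scoped to a dataset; whether the real API behaves that way would need checking against the README.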
@jhnwllr - please can you update your scripts, noting that taxonKey is an integer, not a string (see the README for an example)?
I have been testing how well the current vocabulary and rule structure works on example taxonKeys.
One weakness is that we have no way to handle occurrences that are in a plausible location but still suspicious.
I see two solutions to this scenario (without involving gbifIds).
datasetKey + taxonKey + Polygon -> "suspicious"
I am undecided whether this would be worth the added complexity.
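A rough Python sketch of how such a combined rule could be evaluated against a single occurrence is shown here, to make the difference from a dataset-only rule concrete. Everything in it is an assumption for illustration: the `ScopedRule` class, the `annotations_for` helper, the planar ray-casting test (which ignores geodesic effects and polygon holes) and the made-up keys and coordinates.

```python
from dataclasses import dataclass
from typing import List, Optional, Sequence, Tuple

Point = Tuple[float, float]  # (longitude, latitude)


def point_in_polygon(point: Point, ring: Sequence[Point]) -> bool:
    """Planar ray-casting test against a single ring without holes."""
    x, y = point
    inside = False
    n = len(ring)
    for i in range(n):
        x1, y1 = ring[i]
        x2, y2 = ring[(i + 1) % n]
        # Only edges that straddle the horizontal line through the point count.
        if (y1 > y) != (y2 > y):
            x_cross = x1 + (y - y1) * (x2 - x1) / (y2 - y1)
            if x < x_cross:
                inside = not inside
    return inside


@dataclass(frozen=True)
class ScopedRule:
    """Hypothetical rule: polygon plus optional taxon and dataset scope."""
    polygon: Sequence[Point]
    annotation: str
    taxon_key: Optional[int] = None
    dataset_key: Optional[str] = None


def annotations_for(rules: Sequence[ScopedRule], taxon_key: int,
                    dataset_key: str, lon: float, lat: float) -> List[str]:
    """Collect annotations from rules whose scope and polygon both match."""
    return [
        r.annotation for r in rules
        if (r.taxon_key is None or r.taxon_key == taxon_key)
        and (r.dataset_key is None or r.dataset_key == dataset_key)
        and point_in_polygon((lon, lat), r.polygon)
    ]


box = [(0.0, 0.0), (10.0, 0.0), (10.0, 10.0), (0.0, 10.0)]
combined = [ScopedRule(box, "suspicious", taxon_key=1234, dataset_key="dataset-a")]
dataset_only = [ScopedRule(box, "suspicious", dataset_key="dataset-a")]

# Same dataset, same location, but a different taxon (5678).
other_taxon_combined = annotations_for(combined, 5678, "dataset-a", 5.0, 5.0)
other_taxon_dataset_only = annotations_for(dataset_only, 5678, "dataset-a", 5.0, 5.0)
```

With these made-up values, the combined rule leaves the record of the other taxon alone, while the dataset-only rule flags it.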
Here is an example I made with lions. The occurrences in the yellow box are all from the same dataset and claim to be in the big national park above.
https://jhnwllr.github.io/panthera-leo-example-range-annotations/