gbif / occurrence-annotation

Experimental: Rule based annotation store
Apache License 2.0

controlled vocabulary discussion #16

Closed jhnwllr closed 1 year ago

jhnwllr commented 1 year ago

I want to tentatively propose the following vocabulary.

definition

annotatedRange - a controlled vocabulary describing areas on the earth in relation to an organism or higher taxon.

annotatedRange controlled vocabulary

examples

concept    | example
native     | extant
native     | endemic
native     | indigenous
native     | breeding
native     | non-breeding
introduced | assisted colonization
introduced | invasive
introduced | non native range
managed    | location is captive range
managed    | location is botanical garden
managed    | location is zoo
managed    | cultivated in glasshouse
suspicious | location is in the ocean
suspicious | zero-zero coordinate
suspicious | centroid
suspicious | area too far north for taxon
suspicious | area too high elevation for taxon
suspicious | area is natural history museum
former     | fossil range
former     | extinct
former     | historic
vagrant    | migrant
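
As a rough illustration of how the top-level concepts in the table could be held in code, here is a minimal Python sketch. The class name `AnnotatedRange`, the helper `parse_concept`, and the treatment of "other" as a member of the vocabulary are assumptions made for this example; none of them are taken from the occurrence-annotation code base.

```python
from enum import Enum


class AnnotatedRange(str, Enum):
    """Top-level concepts from the proposed annotatedRange vocabulary."""

    NATIVE = "native"
    INTRODUCED = "introduced"
    MANAGED = "managed"
    SUSPICIOUS = "suspicious"
    FORMER = "former"
    VAGRANT = "vagrant"
    # Assumption: the proposed free-text "other" option is modelled as a concept.
    OTHER = "other"


def parse_concept(value: str) -> AnnotatedRange:
    """Map user input to a concept, falling back to OTHER for unknown terms."""
    try:
        return AnnotatedRange(value.strip().lower())
    except ValueError:
        return AnnotatedRange.OTHER
```

Falling back to `OTHER` rather than raising an error is one possible design: it keeps unknown terms visible so they can later be reviewed as candidates for missing vocabulary.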

other (free text)

I think users might find a free text other box useful. Minimally, it might capture missing vocabulary.

wrong but plausible location issue

Since we are currently annotating ranges (polygons) rather than gbifIds, occurrences that are incorrect for some reason but fall in a plausible location will be problematic for this method.

suspicious

I like that the concept of suspicious captures the idea "the occurrences here are unlikely" without requiring the user to diagnose why the occurrences are in error. Occurrences in a certain area might be suspicious for multiple (possibly unknown) reasons.

discussion

This vocabulary describes an area on the earth, not necessarily the occurrences that happen to be within the polygon. The occurrences within this polygon are likely to be "managed", "native", "suspicious" ... but there is always some uncertainty, just like a zero-zero coordinate is likely to be "suspicious" but could also be genuine.
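
To make the distinction concrete, the sketch below attaches a concept to a polygon for a taxon and then reports which concepts are likely for an occurrence that falls inside it. It is only an illustration: the `RangeAnnotation` class, its field names, and the `likely_labels` helper are hypothetical, and the point-in-polygon test is a plain planar ray-casting check that ignores the antimeridian, holes, and projection issues.

```python
from dataclasses import dataclass

Point = tuple[float, float]  # (longitude, latitude)


@dataclass
class RangeAnnotation:
    """Hypothetical record: a concept attached to an area for one taxon."""

    taxon_key: int
    concept: str
    polygon: list[Point]


def contains(polygon: list[Point], point: Point) -> bool:
    """Planar ray-casting test: count edge crossings of a ray going east."""
    x, y = point
    inside = False
    n = len(polygon)
    for i in range(n):
        x1, y1 = polygon[i]
        x2, y2 = polygon[(i + 1) % n]
        # Only edges that straddle the horizontal line through the point count.
        if (y1 > y) != (y2 > y):
            x_cross = x1 + (y - y1) * (x2 - x1) / (y2 - y1)
            if x < x_cross:
                inside = not inside
    return inside


def likely_labels(annotations: list[RangeAnnotation], taxon_key: int, point: Point) -> list[str]:
    """Concepts that are likely (not certain) for an occurrence at `point`."""
    return [
        a.concept
        for a in annotations
        if a.taxon_key == taxon_key and contains(a.polygon, point)
    ]
```

For example, a small square around the zero-zero coordinate annotated as "suspicious" would flag an occurrence at (0, 0) as likely suspicious, while the occurrence itself could still be genuine.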

timrobertson100 commented 1 year ago

Thanks @jhnwllr

Merging the error/enriched into a single annotation has an additional benefit in that a known species range can be loaded in easily, rather than inverted and turned into an error statement.

"I think users might find a free text other box useful"

Would the existing commentary field satisfy this? The UI can present a text box next to "other", but it'd just become the first comment in the discussion. Would that be a reasonable approach?

jhnwllr commented 1 year ago

If users can query comments, then I think it will satisfy the "other box" requirement. For example:

give me all annotations with comment="my study area"
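
A query like the one above could be sketched as a simple client-side filter. This is a hedged illustration only: it assumes each annotation is a dictionary with a `comments` list of strings, which is a made-up shape for this example and not the actual response format of the annotation service.

```python
def annotations_with_comment(annotations: list[dict], text: str) -> list[dict]:
    """Return annotations where any comment contains `text` (case-insensitive)."""
    needle = text.casefold()
    return [
        a
        for a in annotations
        # Assumption: comments are stored as a list of plain strings.
        if any(needle in comment.casefold() for comment in a.get("comments", []))
    ]


# Hypothetical sample data for illustration.
sample = [
    {"id": 1, "annotation": "native", "comments": ["My study area", "checked"]},
    {"id": 2, "annotation": "suspicious", "comments": ["zero-zero coordinate"]},
    {"id": 3, "annotation": "other"},
]
matches = annotations_with_comment(sample, "my study area")
```

With the sample data, `matches` contains only the annotation with `id` 1. A server-side search would be preferable for large numbers of annotations; the filter above only shows the intent.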

jhnwllr commented 1 year ago

@timrobertson100 @MortenHofft

Here is an example using annotatedRange with the controlled vocab. https://jhnwllr.github.io/acacia-dealbata-example-range-annotations/

Acacia dealbata Link https://www.gbif.org/species/2979474

Comments are shown in a popup.

some highlights: [three images in the original thread]

acaciamulga commented 1 year ago

Thanks, this is very interesting. With known invasive species there is usually a well-definable native range, like the one you drew for Acacia dealbata. Would it be possible to draw the native range polygon, then take the inverse and select the rest of the world as a single polygon marked as introduced?

It could be that some of the putatively introduced occurrences are not introduced but are suspicious due to a glasshouse, botanic garden or misidentification, but it would be a quick way to define native ranges.

John, as we discussed, people with a particular interest could then come in and fine-tune these ranges. For example, we could do rough native/introduced GRIIS ranges and then experts could come in and delineate them better. For example, the "true" native range of Acacia dealbata is a subset of the occurrences in SE Australia, as some of those are invasive within Australia.

These are tricky but interesting and important questions. Thanks

jhnwllr commented 1 year ago

@acaciamulga I have thought about the inverse distribution, but it doesn't add much information: if that is what the user wants, they can easily recode all occurrences that fall outside the native range themselves.

Additionally, as we see with the example of the glasshouse in Florida, not everything outside of the native range is going to be "introduced". In general, the bigger you make your polygon annotations, the more likely you are to run into edge cases where the annotation doesn't apply.
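
The recoding a user could do on their own side might look roughly like the sketch below. Everything in it is an assumption for illustration: bounding boxes stand in for real polygons, the coordinates are invented and do not correspond to the actual Acacia dealbata annotations, and letting the smallest matching area win is just one possible way to handle overlaps such as a glasshouse.

```python
from typing import NamedTuple


class BoxAnnotation(NamedTuple):
    """Hypothetical annotation using a bounding box instead of a polygon."""

    concept: str
    min_lon: float
    min_lat: float
    max_lon: float
    max_lat: float

    def contains(self, lon: float, lat: float) -> bool:
        return self.min_lon <= lon <= self.max_lon and self.min_lat <= lat <= self.max_lat

    def area(self) -> float:
        return (self.max_lon - self.min_lon) * (self.max_lat - self.min_lat)


def recode(lon: float, lat: float, annotations: list[BoxAnnotation],
           default: str = "introduced") -> str:
    """Use the smallest matching annotation; otherwise fall back to `default`."""
    hits = [a for a in annotations if a.contains(lon, lat)]
    if not hits:
        # Outside every annotated area: treated as introduced by the user.
        return default
    return min(hits, key=BoxAnnotation.area).concept


# Illustrative, invented coordinates only.
annotations = [
    BoxAnnotation("native", 140.0, -44.0, 154.0, -28.0),
    BoxAnnotation("managed", -82.0, 27.0, -81.9, 27.1),
]
```

Here a point inside the small "managed" box is recoded as managed rather than introduced, which mirrors the glasshouse case: a specific, small annotation is less likely to hit edge cases than a world-sized inverse polygon.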

timrobertson100 commented 1 year ago

This has been implemented and deployed to labs. errorType and enrichmentType are replaced by a single annotation which currently holds the vocabulary outlined above.

jhnwllr commented 1 year ago

@MortenHofft I added the annotations from the examples above to the backend. http://labs.gbif.org:7013/v1/occurrence/annotation/rule/