gbif / occurrence-annotation

Experimental: Rule based annotation store
Apache License 2.0

add support for custom JSON predicate annotations #40

Open jhnwllr opened 3 months ago

jhnwllr commented 3 months ago

GBIF has a powerful domain-specific language (DSL) for describing selections of occurrences: the predicate functions. https://techdocs.gbif.org/en/openapi/v1/occurrence#/Occurrence%20downloads

During the initial phase of experimentation, I think it would be helpful to allow users to send us a raw JSON predicate string describing the occurrences they are selecting. This would allow users to capture complex rules without us having to re-invent the wheel.

The main columns taxon_key, geometry, and annotation would still be required.

Example

Complex Rule: Lions in North America that are not fossils or living specimens are Suspicious.

taxon_key : 5219404
geometry : WKT of North America
annotation : SUSPICIOUS

raw_json_pred :

{
 "predicate": {
  "type": "not",
  "predicate": {
   "type": "in",
   "key": "BASIS_OF_RECORD",
   "values": [
    "FOSSIL_SPECIMEN",
    "LIVING_SPECIMEN"
   ]
  }
 }
}
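To make the example concrete, here is a minimal Python sketch of how a backend might assemble and validate such a rule before storing it. The names make_rule and REQUIRED_COLUMNS are hypothetical and not part of the existing code base, and the WKT string is only a rough placeholder box standing in for "WKT of North America".

```python
import json

# Hypothetical constant: the columns the issue says must stay required.
REQUIRED_COLUMNS = ("taxon_key", "geometry", "annotation")


def make_rule(taxon_key, geometry, annotation, raw_json_pred=None):
    """Build a rule dict; raw_json_pred is an optional raw JSON string."""
    rule = {"taxon_key": taxon_key, "geometry": geometry, "annotation": annotation}
    missing = [c for c in REQUIRED_COLUMNS if rule[c] in (None, "")]
    if missing:
        raise ValueError(f"missing required columns: {missing}")
    if raw_json_pred is not None:
        # json.loads raises a ValueError subclass on malformed input.
        parsed = json.loads(raw_json_pred)
        if not isinstance(parsed, dict) or "predicate" not in parsed:
            raise ValueError("raw_json_pred must be an object with a 'predicate' key")
        # Store a normalized string so equal predicates compare equal.
        rule["raw_json_pred"] = json.dumps(parsed, sort_keys=True)
    return rule


# Placeholder geometry (assumption), not an authoritative outline.
NORTH_AMERICA_WKT = "POLYGON((-170 5, -50 5, -50 85, -170 85, -170 5))"

LION_PREDICATE = json.dumps({
    "predicate": {
        "type": "not",
        "predicate": {
            "type": "in",
            "key": "BASIS_OF_RECORD",
            "values": ["FOSSIL_SPECIMEN", "LIVING_SPECIMEN"],
        },
    }
})

rule = make_rule(5219404, NORTH_AMERICA_WKT, "SUSPICIOUS", LION_PREDICATE)
```

The predicate is kept as an opaque string here, which matches the idea of accepting it as-is during the experimental phase rather than interpreting it.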

backend

R package

rgbif has support for writing JSON predicates easily via occ_download_prep(). This code could be adapted for gbifan.

UI

For now, the UI could show the annotation (suspicious, native, etc.), but ignore the raw_json_pred until we have a way to represent it in an acceptable way.

Issues

Since some of these columns have analogues in the predicate DSL, it is possible to create rules whose raw_json_pred conflicts with the values of taxon_key and geometry. However, this also allows users to express rules with, for example, multiple taxa selections that are not yet possible in the current framework.
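One possible way to at least flag such overlaps is to walk the predicate tree and collect the keys it references. The sketch below is in Python and rests on assumptions: the helper names are hypothetical, and it relies on my understanding that compound predicates nest children under "predicate" or "predicates" and that geometry filters use the "within" type.

```python
def predicate_keys(node):
    """Collect the search keys referenced anywhere in a predicate tree."""
    keys = set()
    if not isinstance(node, dict):
        return keys
    if "key" in node:
        keys.add(node["key"])
    # Assumption: geometry filters appear as {"type": "within", "geometry": ...}.
    if node.get("type") == "within":
        keys.add("GEOMETRY")
    # "not" (and the top-level wrapper) hold a single child under "predicate".
    if isinstance(node.get("predicate"), dict):
        keys |= predicate_keys(node["predicate"])
    # "and" / "or" hold a list of children under "predicates".
    for child in node.get("predicates", []):
        keys |= predicate_keys(child)
    return keys


# Hypothetical mapping from rule columns to their predicate analogues.
COLUMN_ANALOGUES = {"taxon_key": "TAXON_KEY", "geometry": "GEOMETRY"}


def overlapping_columns(raw_predicate):
    """Return the rule columns that the predicate also constrains."""
    keys = predicate_keys(raw_predicate)
    return sorted(col for col, key in COLUMN_ANALOGUES.items() if key in keys)


basis_only = {
    "predicate": {
        "type": "not",
        "predicate": {
            "type": "in",
            "key": "BASIS_OF_RECORD",
            "values": ["FOSSIL_SPECIMEN", "LIVING_SPECIMEN"],
        },
    }
}

with_taxa = {
    "predicate": {
        "type": "and",
        "predicates": [
            {"type": "in", "key": "TAXON_KEY", "values": ["5219404", "5219416"]},
            {"type": "within", "geometry": "POLYGON((0 0, 1 0, 1 1, 0 1, 0 0))"},
        ],
    }
}

print(overlapping_columns(basis_only))
print(overlapping_columns(with_taxa))
```

This only reports that an overlap exists. Whether an overlap is an actual conflict or an intended multi-taxa rule would still need a policy decision.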