gbif / gbif-api

GBIF API
Apache License 2.0

question: difference between `in` vs `or` predicates in download API queries #93

Closed abubelinha closed 2 years ago

abubelinha commented 2 years ago

As far as I understand, these two queries would produce the same output?

"predicate": {
    "type": "in",
    "key": "GADM_GID",
    "values": ["ESP", "AND", "PRT"]
}

"predicate": {
    "type": "or",
    "predicates": [
        {
          "type": "equals",
          "key": "GADM_GID",
          "value": "ESP",
          "matchCase": false
        },
        {
          "type": "equals",
          "key": "GADM_GID",
          "value": "AND",
          "matchCase": false
        },
        {
          "type": "equals",
          "key": "GADM_GID",
          "value": "PRT",
          "matchCase": false
        }
    ]
}

The first construction, with in, is much simpler, and the examples in the API documentation for Occurrence Download Predicates use it [link to new version]. So I wonder why, when I use the data portal web interface to construct a download and then click the API option to see what the JSON request looks like, it always uses the or version instead of in.

  1. Is the download file construction slower in one case than in the other?

  2. Also, is "matchCase": false valid with "type": "in" predicates too? (I didn't see it in the doc examples. Is it documented elsewhere?)

Thanks

MattBlissett commented 2 years ago

The two queries are equivalent. The "in" version is so much faster that an "or" query is replaced with the "in" query internally, if possible.
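
To illustrate what such a replacement could look like, here is a minimal Python sketch. It is not GBIF's actual implementation: the function name `collapse_or_to_in` and the rule for when a rewrite is "possible" (every child is an "equals" on the same key with the same matchCase setting) are assumptions made for this example.

```python
def collapse_or_to_in(predicate):
    """Rewrite an "or" of "equals" predicates on one key into a single "in".

    Hypothetical helper for illustration only. Returns the predicate
    unchanged whenever the rewrite is not clearly safe.
    """
    if predicate.get("type") != "or":
        return predicate

    children = predicate.get("predicates", [])
    if not children or any(c.get("type") != "equals" for c in children):
        return predicate

    # All children must target the same key and agree on matchCase
    # (None means the child did not specify matchCase at all).
    keys = {c.get("key") for c in children}
    match_cases = {c.get("matchCase") for c in children}
    if len(keys) != 1 or len(match_cases) != 1:
        return predicate

    collapsed = {
        "type": "in",
        "key": keys.pop(),
        "values": [c["value"] for c in children],
    }
    match_case = match_cases.pop()
    if match_case is not None:
        collapsed["matchCase"] = match_case
    return collapsed


or_query = {
    "type": "or",
    "predicates": [
        {"type": "equals", "key": "GADM_GID", "value": v, "matchCase": False}
        for v in ["ESP", "AND", "PRT"]
    ],
}
print(collapse_or_to_in(or_query))
```

Applied to the "or" query from the question, this yields an "in" predicate on GADM_GID with the same three values. An "or" that mixes keys or predicate types is returned unchanged.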

There's no particular reason the portal always uses "or" predicates.

I've documented the matchCase argument; it will appear with the next significant change to the portal. It can be used with "equals", "in" and "like" predicates.
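
As a small sketch of combining matchCase with an "in" predicate, the snippet below builds the predicate fragment from the question and prints it as JSON. The helper name `in_predicate` is hypothetical, and the snippet only builds the "predicate" part, not a complete download request.

```python
import json


def in_predicate(key, values, match_case=None):
    """Build an "in" predicate dict (hypothetical helper, not a GBIF client API).

    matchCase is only included when the caller sets it explicitly.
    """
    predicate = {"type": "in", "key": key, "values": list(values)}
    if match_case is not None:
        predicate["matchCase"] = match_case
    return predicate


fragment = {"predicate": in_predicate("GADM_GID", ["ESP", "AND", "PRT"], match_case=False)}
print(json.dumps(fragment, indent=4))
```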