gbif / gbif-api

GBIF API
Apache License 2.0

Explore wildcard, nullability and negation checks in occurrence search #89

Open timrobertson100 opened 2 years ago

timrobertson100 commented 2 years ago

The GraphQL API powering the hosted portals is able to support null, negation, and wildcard searches for occurrence search (currently enabled for a few fields only, but more will follow).

It would be a good addition to the REST API to support these, allowing for URL hacking and URL consistency with the future of GBIF.org (see https://github.com/gbif/hosted-portals/issues/209).

While the motivation applies to occurrence search, it may be of interest to other content types surfaced in REST APIs. These could be added progressively (e.g. returning some 4xx status code in the meantime when such a query is detected but not yet supported).

MortenHofft commented 2 years ago

So far gbif-web project has been using

But a more explicit syntax might be preferable and more flexible, e.g. issue_$exists, issue_$not=ZERO_COORDINATE, year_$gte=1900, recordedBy_$like=*humboldt
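
To make the proposal concrete, here is a minimal Python sketch of how such parameters could be parsed. Everything in it is an assumption for illustration: the function names, the mapping from the `_$op` suffixes to predicate types, and the shape of the resulting dicts are not taken from any GBIF implementation. An unknown operator raises an error, which a REST layer could turn into a 4xx response.

```python
from urllib.parse import parse_qsl

# Assumed mapping from the proposed suffixes to predicate types (illustrative only).
OPERATORS = {
    "exists": "isNotNull",
    "not": "not",
    "gte": "greaterThanOrEquals",
    "like": "like",
}


def parse_param(name, value):
    """Turn one `field_$op=value` pair into a predicate-style dict."""
    field, sep, op = name.partition("_$")
    if not sep:
        # No operator suffix: plain equality filter.
        return {"type": "equals", "key": field, "value": value}
    if op not in OPERATORS:
        raise ValueError(f"unsupported operator: ${op}")
    if op == "exists":
        return {"type": "isNotNull", "key": field}
    if op == "not":
        inner = {"type": "equals", "key": field, "value": value}
        return {"type": "not", "predicate": inner}
    return {"type": OPERATORS[op], "key": field, "value": value}


def parse_query(query):
    """Parse a raw query string; keep value-less names such as `issue_$exists`."""
    pairs = parse_qsl(query, keep_blank_values=True)
    return [parse_param(name, value) for name, value in pairs]


if __name__ == "__main__":
    for predicate in parse_query("issue_$exists&year_$gte=1900&recordedBy_$like=*humboldt"):
        print(predicate)
```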

When negations are present I would interpret those as AND, unlike what we normally do, where repeated values are combined with OR. E.g. issue_$not=A&issue_$not=B&issue=C&issue=D would become:

{
  "type": "and",
  "predicates": [
    {
      "type": "not",
      "predicate": {
        "type": "in", // the occurrence must have neither A nor B: (NOT A) and (NOT B)
        "key": "issue",
        "values": [
          "A",
          "B"
        ]
      }
    },
    {
      "type": "in", // the occurrence has either issue C or D: C or D
      "key": "issue",
      "values": [
        "C",
        "D"
      ]
    }
  ]
}
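
The grouping rule can be sketched in a few lines of Python. This is only an illustration of the interpretation described in this comment: the function name `combine` and the `_$not` suffix handling are hypothetical. Negated values for a field are collected into one `not`/`in` predicate (NOT (A or B) is the same as (NOT A) and (NOT B)), plain values into one `in` predicate, and the groups are joined with `and`.

```python
from urllib.parse import parse_qsl

NOT_SUFFIX = "_$not"  # proposed negation suffix from this thread


def combine(query):
    """Build one predicate: negations are ANDed, plain values are ORed."""
    negated, allowed = {}, {}
    for name, value in parse_qsl(query):
        if name.endswith(NOT_SUFFIX):
            field = name[: -len(NOT_SUFFIX)]
            negated.setdefault(field, []).append(value)
        else:
            allowed.setdefault(name, []).append(value)

    predicates = []
    for key, values in negated.items():
        # not(in [A, B]) means the record has neither A nor B.
        inner = {"type": "in", "key": key, "values": values}
        predicates.append({"type": "not", "predicate": inner})
    for key, values in allowed.items():
        # in [C, D] means the record has C or D.
        predicates.append({"type": "in", "key": key, "values": values})
    return {"type": "and", "predicates": predicates}


if __name__ == "__main__":
    print(combine("issue_$not=A&issue_$not=B&issue=C&issue=D"))
```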

abubelinha commented 1 year ago

> The GraphQL API powering the hosted portals is able to support null, negation, and wildcard searches for occurrence search (currently enabled for a few fields only, but more will follow).

@timrobertson100 is there a list of which fields support those searches as of today?

In https://www.gbif.org/developer/occurrence I can see this example of logical negation, but nothing else is said:

{
  "creator":"userName",
  "notificationAddresses": ["userName@example.org"],
  "predicate":
  {
    "type":"not",
    "predicate":
    {
      "type":"equals",
      "key":"DATASET_KEY",
      "value":"4fa7b334-ce0d-4e88-aaae-2e0c138d049e"
    }
  }
}

Thanks @abubelinha

timrobertson100 commented 1 year ago

@MortenHofft - can you answer that please so I don't give incorrect info?

abubelinha commented 1 year ago

BTW, regarding wildcards, I guess you mean the like predicate (I don't see the word wildcard in the occurrence API reference):

like
search for a pattern, ? matches one character, * matches zero or more characters

{
  "creator":"userName",
  "notificationAddresses": ["userName@example.org"],
  "predicate":
  {
    "type":"like",
    "key":"CATALOG_NUMBER",
    "value":"PAPS5-560*"
  }
}
matchCase can be added if required.
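
To illustrate the pattern semantics quoted above, here is a small Python sketch that translates a like pattern into a regular expression. It is not how the GBIF backend evaluates the predicate; the helper names are hypothetical, and treating the default (no matchCase) as case-insensitive is an assumption.

```python
import re


def like_to_regex(pattern, match_case=False):
    """Translate a like pattern: `?` is one character, `*` is zero or more."""
    parts = []
    for ch in pattern:
        if ch == "*":
            parts.append(".*")
        elif ch == "?":
            parts.append(".")
        else:
            # Everything else is matched literally.
            parts.append(re.escape(ch))
    # Assumption: without matchCase the comparison ignores case.
    flags = re.DOTALL if match_case else re.DOTALL | re.IGNORECASE
    return re.compile("".join(parts), flags)


def like(pattern, value, match_case=False):
    """Return True when the whole value matches the pattern."""
    return like_to_regex(pattern, match_case).fullmatch(value) is not None


if __name__ == "__main__":
    print(like("PAPS5-560*", "PAPS5-560123"))
    print(like("PAPS5-56?", "PAPS5-56"))
```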

MortenHofft commented 1 year ago

Hi @abubelinha, I'm not sure I understand the question exactly, but...

Predicates

The API v1 predicates allow wrapping any other predicate in not, e.g.

{
  "creator":"userName",
  "notificationAddresses": ["userName@example.org"],
  "predicate":
  {
    "type":"not",
    "predicate":
    {
      "type":"or",
      "predicates": [
        {
          "type":"equals",
          "key":"TAXON_KEY",
          "value": 5
        },
        {
          "type":"like",
          "key":"CATALOG_NUMBER",
          "value":"PAPS5-560*"
        }
      ]
    }
  }
}
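
Nested predicates like this are easy to get wrong by hand, so a few small builder functions can help. The Python sketch below reproduces the request body above; the helper names (`equals`, `like`, `or_`, `not_`, `download_request`) are hypothetical conveniences, not part of any GBIF client library.

```python
import json


def equals(key, value):
    return {"type": "equals", "key": key, "value": value}


def like(key, value):
    return {"type": "like", "key": key, "value": value}


def or_(*predicates):
    return {"type": "or", "predicates": list(predicates)}


def not_(predicate):
    # `not` can wrap any other predicate, including compound ones.
    return {"type": "not", "predicate": predicate}


def download_request(creator, addresses, predicate):
    """Assemble a request body with the same fields as the example above."""
    return {
        "creator": creator,
        "notificationAddresses": list(addresses),
        "predicate": predicate,
    }


request = download_request(
    "userName",
    ["userName@example.org"],
    not_(or_(equals("TAXON_KEY", 5), like("CATALOG_NUMBER", "PAPS5-560*"))),
)

if __name__ == "__main__":
    print(json.dumps(request, indent=2))
```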

Search using predicates

The GraphQL API was created for multiple purposes, one of them being to allow search using predicates (the same type the GBIF API has been using to specify downloads).

Since then, that has been extended to the standard API, albeit not yet documented (see https://github.com/gbif/portal16/issues/1778). So it is now possible to search with a predicate as the filter.
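
As a rough sketch of what such a search could look like from Python, the snippet below only builds the HTTP request and does not send it. Since the feature is described here as undocumented, both the endpoint path in `SEARCH_URL` and the body shape `{"predicate": ...}` are assumptions that should be checked against the linked issue before use; `build_search_request` is a hypothetical helper.

```python
import json
import urllib.request

# Assumption: path of the predicate search endpoint; verify before relying on it.
SEARCH_URL = "https://api.gbif.org/v1/occurrence/search/predicate"


def build_search_request(predicate, url=SEARCH_URL):
    """Build (but do not send) a POST request carrying a predicate filter."""
    # Assumption: the predicate is wrapped in a top-level "predicate" field.
    body = json.dumps({"predicate": predicate}).encode("utf-8")
    return urllib.request.Request(
        url,
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


req = build_search_request(
    {
        "type": "not",
        "predicate": {"type": "equals", "key": "TAXON_KEY", "value": 5},
    }
)

if __name__ == "__main__":
    # urllib.request.urlopen(req) would send it; here we only inspect the request.
    print(req.get_method(), req.full_url)
    print(req.data.decode("utf-8"))
```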

MortenHofft commented 1 year ago

And yes, by wildcard I mean the predicate of type like, with the characters * and ?. They work on text fields, so not on e.g. taxonKey.

MortenHofft commented 1 year ago

> a list of which fields support those searches as of today?

The fields that one can use for search are these: https://www.gbif.org/developer/occurrence#parameters

I do not believe we have any documentation about which fields allow e.g. like queries (for example, country (ISO code) does not). But any text field that isn't a controlled vocabulary should support it.