biothings / biothings.api

BioThings API framework - Making high-performance API for biological annotation data
https://biothings.io
Apache License 2.0

Support post_filter query parameter #208

Status: Closed (newgene closed this 1 year ago)

newgene commented 2 years ago

post_filter is typically used together with aggregations/facets. It lets you filter the hits list without changing the total and the aggregation output:

https://www.elastic.co/guide/en/elasticsearch/reference/8.1/filter-search-results.html#post-filter

For example:

{
  "query": {
    "query_string": {
      "query": "_exists_:@type",
      "default_operator": "AND",
      "lenient": true
    }
  },
  "aggs": {
    "@type": {
      "terms": {
        "field": "@type",
        "size": 100
      }
    }
  },
  "post_filter": {
    "term": {
      "@type": "Dataset"
    }
  },
  "size": 10
}

The new post_filter parameter can be used to pass a query_string query that filters the matching hits:

query?q=_exists_:@type&post_filter=@type:Dataset
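
As a rough illustration of the mapping proposed above, here is a minimal sketch of how the `q` and `post_filter` URL parameters might be turned into an Elasticsearch request body. The function name `build_search_body` and its `facet_field` argument are hypothetical and are not part of biothings.web; the sketch also assumes `post_filter` is wrapped in a `query_string` clause, as described in this issue, rather than the `term` clause shown in the JSON example.

```python
from typing import Any, Dict, Optional


def build_search_body(
    q: str,
    post_filter: Optional[str] = None,
    facet_field: Optional[str] = None,
    size: int = 10,
) -> Dict[str, Any]:
    """Hypothetical helper: build an Elasticsearch search body from URL parameters."""
    body: Dict[str, Any] = {
        "query": {
            "query_string": {
                "query": q,
                "default_operator": "AND",
                "lenient": True,
            }
        },
        "size": size,
    }
    if facet_field:
        # Terms aggregation on the facet field, as in the example request.
        body["aggs"] = {
            facet_field: {"terms": {"field": facet_field, "size": 100}}
        }
    if post_filter:
        # Assumption: the post_filter parameter is itself a query_string query.
        body["post_filter"] = {
            "query_string": {
                "query": post_filter,
                "default_operator": "AND",
                "lenient": True,
            }
        }
    return body


body = build_search_body(
    "_exists_:@type", post_filter="@type:Dataset", facet_field="@type"
)
```

When `post_filter` is omitted, the body contains no `post_filter` key, so existing queries would be unaffected.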

colleenXu commented 2 years ago

This looks interesting for BTE use (querying biothings apis), particularly since it looks like it would allow checking that a field exists AND that its value is X.
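
For a client, the "field exists AND its value is X" check could be expressed by combining `q` and the proposed `post_filter` parameter in one request URL. The sketch below only builds the URL with the standard library; the helper name `build_query_url` and the relative `query` path are illustrative assumptions, and whether a given API instance accepts `post_filter` depends on this feature being implemented there.

```python
from typing import Optional
from urllib.parse import parse_qs, urlencode, urlsplit


def build_query_url(base: str, q: str, post_filter: Optional[str] = None) -> str:
    """Hypothetical helper: build a GET query URL with an optional post_filter."""
    params = {"q": q}
    if post_filter:
        params["post_filter"] = post_filter
    # urlencode percent-encodes reserved characters such as ':' and '@'.
    return f"{base}?{urlencode(params)}"


url = build_query_url("query", "_exists_:@type", "@type:Dataset")

# Decoding the query string recovers the original parameter values.
decoded = parse_qs(urlsplit(url).query)
```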

newgene commented 1 year ago

We already have an implementation in one particular API instance:

https://github.com/data2health/resource-discovery-api/blob/a7bdc531dde35511ca9ddcb252b58623693adfe3/web/pipeline.py#L73-L77

Now we should move this feature into the biothings.web module to make it a generic query feature for all APIs.

colleenXu commented 1 year ago

Will this work for the POST /query endpoint? As mentioned above, I'm interested in being able to check that a field exists while also querying for certain values in that field and other fields.