elastic / elasticsearch

Free and Open Source, Distributed, RESTful Search Engine
https://www.elastic.co/products/elasticsearch
Other
69.69k stars 24.66k forks source link

SQL: Cannot use HAVING clause with aggregates and fields #71350

Open costin opened 3 years ago

costin commented 3 years ago

Currently the HAVING clause translation takes into account only aggregations - however when dealing with subqueries, fields can be used as well:

SELECT * FROM (
  SELECT languages, COUNT(*) AS c FROM test_emp 
)
WHERE c > 0 OR languages IS NOT NULL

Due to the disjunction languages > 0 cannot be pushed down and the execution will fail due to the generated script being incorrect:

"source" : "InternalQlScriptUtils.nullSafeFilter(InternalQlScriptUtils.or(InternalQlScriptUtils.nullSafeFilter(InternalSqlScriptUtils.gt(params.a0,params.v0)),InternalQlScriptUtils.nullSafeFilter(InternalQlScriptUtils.isNotNull(InternalQlScriptUtils.docValue(doc,params.v2)))",
elasticmachine commented 3 years ago

Pinging @elastic/es-ql (Team:QL)

Luegg commented 3 years ago

I don't see an easy way to support this atm. As mentioned in https://github.com/elastic/elasticsearch/issues/36853 and https://github.com/elastic/elasticsearch/issues/32692 the script cannot access the bucket key from the composite aggregation.

(currently generated query for the SQL SELECT * FROM (SELECT COUNT(*) c, languages FROM test_emp GROUP BY languages) WHERE c > 0 OR languages IS NOT NULL:

{
  "size": 0,
  "_source": false,
  "aggregations": {
    "groupby": {
      "composite": {
        "size": 1000,
        "sources": [
          {
            "8831d512": {
              "terms": {
                "field": "languages",
                "missing_bucket": true,
                "order": "asc"
              }
            }
          }
        ]
      },
      "aggregations": {
        "having.having.f143ef6a_|_having.3cf5abc8": {
          "bucket_selector": {
            "buckets_path": {
              "a0": "_count"
            },
            "script": {
              "source": "InternalQlScriptUtils.nullSafeFilter(InternalQlScriptUtils.or(InternalQlScriptUtils.nullSafeFilter(InternalQlScriptUtils.gt(params.a0,params.v0)),InternalQlScriptUtils.nullSafeFilter(InternalQlScriptUtils.isNotNull(InternalQlScriptUtils.docValue(doc,params.v1)))))",
              "lang": "painless",
              "params": {
                "v0": 0,
                "v1": "languages"
              }
            },
            "gap_policy": "skip"
          }
        }
      }
    }
  }
}

)

elasticsearchmachine commented 8 months ago

Pinging @elastic/es-analytical-engine (Team:Analytics)