elastic / elasticsearch

Free and Open Source, Distributed, RESTful Search Engine
https://www.elastic.co/products/elasticsearch
Other
69.6k stars 24.63k forks source link

SQL: HAVING with functions that are not covered by regular metric aggregations in ES doesn't work #33519

Open astefan opened 6 years ago

astefan commented 6 years ago
sql> select avg("salary"), day("hire_date") from "test_emp" group by day("hire_date") having day("hire_date") = 1;
Bad request [Found 1 problem(s)
line 1:89: Cannot filter HAVING on non-aggregate [hire_date]; consider using WHERE instead]

A query like this one is normally translated into ES DSL more or less like this:

{
    "size": 0,
    "_source": false,
    "stored_fields": "_none_",
    "aggregations": {
        "groupby": {
            "composite": {
                "size": 1000,
                "sources": [
                    {
                        "493": {
                            "terms": {
                                "script": {
                                    "source": "InternalSqlScriptUtils.dateTimeChrono(doc[params.v0].value.millis, params.v1, params.v2)",
                                    "lang": "painless",
                                    "params": {
                                        "v0": "hire_date",
                                        "v1": "UTC",
                                        "v2": "DAY_OF_MONTH"
                                    }
                                },
                                "missing_bucket": true,
                                "value_type": "number",
                                "order": "asc"
                            }
                        }
                    }
                ]
            },
            "aggregations": {
                "494": {
                    "avg": {                              <------------
                        "field": "salary"              <------------ the buckets path used in the condition
                    }                                        <------------
                },
                "497": {
                    "bucket_selector": {
                        "buckets_path": {
                            "a0": "494"
                        },
                        "script": {
                            "source": "params.a0 = params.v0",
                            "lang": "painless",
                            "params": {
                                "v0": 1
                            }
                        },
                        "gap_policy": "skip"
                    }
                }
            }
        }
    }
}

While ES does offer avg, min, max etc metric aggregations, we would need the key of parent aggregation (the composite one) for the bucket_selector script, which I am not sure it's possible.

elasticmachine commented 6 years ago

Pinging @elastic/es-search-aggs

costin commented 6 years ago

I think the error is correct; HAVING is used for aggregates only not fields (whether grouped or not). Converting HAVING to WHERE should fix the error and the query should work.

astefan commented 6 years ago

That's fair @costin. How about the following use case?

sql> select dayname("hire_date") from "test_emp" group by dayname("hire_date") having max(emp_no) > ASCII(dayname("hire_date"));
Bad request [Found 1 problem(s)
line 1:82: Cannot filter HAVING on non-aggregate [hire_date]; consider using WHERE instead]
matriv commented 4 years ago

@elastic/es-ql

elasticsearchmachine commented 8 months ago

Pinging @elastic/es-analytical-engine (Team:Analytics)