elastic / elasticsearch

Free and Open, Distributed, RESTful Search Engine
https://www.elastic.co/products/elasticsearch

SQL: Matrix aggregations cannot be used for filtering #76344

Open costin opened 2 years ago

costin commented 2 years ago

Matrix aggs (such as Kurtosis or Skewness) can be used inside projections but not for filtering:

SELECT KURTOSIS("field") FROM table  // works
SELECT KURTOSIS("field") FROM table HAVING KURTOSIS("field") > 0 // fails

The second query generates the following aggregation request:

      "aggregations" : {
        "bce0f6b8" : {
          "matrix_stats" : {
            "fields" : [
              "field"
            ],
            "mode" : "AVG"
          }
        },
        "having.f2bcecde" : {
          "bucket_selector" : {
            "buckets_path" : {
              "a0" : "bce0f6b8.kurtosis"
            },
            "script" : {
              "source" : "InternalQlScriptUtils.nullSafeFilter(InternalQlScriptUtils.gt(params.a0,params.v0))",
              "lang" : "painless",
              "params" : {
                "v0" : 0
              }
            },
            "gap_policy" : "skip"
          }
        }
      }

which fails (in main) with:

Caused by: org.elasticsearch.search.aggregations.AggregationExecutionException: buckets_path must reference either a number value or a single value numeric metric aggregation, got: [UnmodifiableMap] at aggregation [bce0f6b8]
        at org.elasticsearch.search.aggregations.pipeline.BucketHelpers.formatResolutionError(BucketHelpers.java:232) ~[elasticsearch-8.0.0-SNAPSHOT.jar:8.0.0-SNAPSHOT]
        at org.elasticsearch.search.aggregations.pipeline.BucketHelpers.resolveBucketValue(BucketHelpers.java:196) ~[elasticsearch-8.0.0-SNAPSHOT.jar:8.0.0-SNAPSHOT]
        at org.elasticsearch.search.aggregations.pipeline.BucketHelpers.resolveBucketValue(BucketHelpers.java:178) ~[elasticsearch-8.0.0-SNAPSHOT.jar:8.0.0-SNAPSHOT]
        at org.elasticsearch.search.aggregations.pipeline.BucketSelectorPipelineAggregator.reduce(BucketSelectorPipelineAggregator.java:56) ~[elasticsearch-8.0.0-SNAPSHOT.jar:8.0.0-SNAPSHOT]
        at org.elasticsearch.search.aggregations.InternalAggregation.reducePipelines(InternalAggregation.java:205) ~[elasticsearch-8.0.0-SNAPSHOT.jar:8.0.0-SNAPSHOT]
        at org.elasticsearch.search.aggregations.InternalMultiBucketAggregation.reducePipelines(InternalMultiBucketAggregation.java:150) ~[elasticsearch-8.0.0-SNAPSHOT.jar:8.0.0-SNAPSHOT]
        at org.elasticsearch.search.aggregations.InternalAggregations.lambda$topLevelReduce$2(InternalAggregations.java:111) ~[elasticsearch-8.0.0-SNAPSHOT.jar:8.0.0-SNAPSHOT]
        at java.util.stream.ReferencePipeline$3$1.accept(ReferencePipeline.java:197) ~[?:?]
        at java.util.ArrayList$ArrayListSpliterator.forEachRemaining(ArrayList.java:1625) ~[?:?]
        at java.util.stream.AbstractPipeline.copyInto(AbstractPipeline.java:484) ~[?:?]
        at java.util.stream.AbstractPipeline.wrapAndCopyInto(AbstractPipeline.java:474) ~[?:?]
        at java.util.stream.ReduceOps$ReduceOp.evaluateSequential(ReduceOps.java:913) ~[?:?]
        at java.util.stream.AbstractPipeline.evaluate(AbstractPipeline.java:234) ~[?:?]
        at java.util.stream.ReferencePipeline.collect(ReferencePipeline.java:682) ~[?:?]
        at org.elasticsearch.search.aggregations.InternalAggregations.topLevelReduce(InternalAggregations.java:112) ~[elasticsearch-8.0.0-SNAPSHOT.jar:8.0.0-SNAPSHOT]
        at org.elasticsearch.action.search.SearchPhaseController.reduceAggs(SearchPhaseController.java:480) ~[elasticsearch-8.0.0-SNAPSHOT.jar:8.0.0-SNAPSHOT]
        at org.elasticsearch.action.search.SearchPhaseController.reducedQueryPhase(SearchPhaseController.java:468) ~[elasticsearch-8.0.0-SNAPSHOT.jar:8.0.0-SNAPSHOT]
        at org.elasticsearch.action.search.QueryPhaseResultConsumer.reduce(QueryPhaseResultConsumer.java:131) ~[elasticsearch-8.0.0-SNAPSHOT.jar:8.0.0-SNAPSHOT]
        at org.elasticsearch.action.search.FetchSearchPhase.innerRun(FetchSearchPhase.java:98) ~[elasticsearch-8.0.0-SNAPSHOT.jar:8.0.0-SNAPSHOT]
        at org.elasticsearch.action.search.FetchSearchPhase$1.doRun(FetchSearchPhase.java:84) ~[elasticsearch-8.0.0-SNAPSHOT.jar:8.0.0-SNAPSHOT]
        at org.elasticsearch.common.util.concurrent.ThreadContext$ContextPreservingAbstractRunnable.doRun(ThreadContext.java:737) ~[elasticsearch-8.0.0-SNAPSHOT.jar:8.0.0-SNAPSHOT]
        at org.elasticsearch.common.util.concurrent.AbstractRunnable.run(AbstractRunnable.java:26) ~[elasticsearch-8.0.0-SNAPSHOT.jar:8.0.0-SNAPSHOT]

I checked the specification tests and we do not cover this case (likely on the assumption that the behavior is the same for projection and filtering). For matrix aggs, the bucket selector also needs to extract the field information (as the projection does), not just call the method.
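
To make the mismatch concrete, here is a minimal, self-contained sketch. It is not Elasticsearch code: the class `MatrixStatsPathSketch`, its methods, and the sample values are all hypothetical. It only assumes what the error above suggests, namely that `bce0f6b8.kurtosis` resolves to a map (one entry per field) rather than a single number, so a lookup by statistic name alone cannot feed a numeric comparison, while a lookup by statistic plus field name can.

```java
import java.util.Map;

public class MatrixStatsPathSketch {

    // Hypothetical stand-in for a matrix_stats result: statistic -> (field -> value).
    // The numbers are made up for illustration only.
    static Map<String, Object> sampleMatrixStats() {
        return Map.of(
            "kurtosis", Map.of("field", 2.5),
            "skewness", Map.of("field", -0.3)
        );
    }

    // Mirrors the failing case: only the statistic is named, so the value is a map.
    static double resolve(Map<String, Object> agg, String stat) {
        Object value = agg.get(stat);
        if (value instanceof Number) {
            return ((Number) value).doubleValue();
        }
        throw new IllegalArgumentException(
            "path [" + stat + "] does not resolve to a single numeric value");
    }

    // Mirrors what the projection needs to do: name the statistic and the field.
    static double resolve(Map<String, Object> agg, String stat, String field) {
        Object perField = agg.get(stat);
        if (perField instanceof Map) {
            Object value = ((Map<?, ?>) perField).get(field);
            if (value instanceof Number) {
                return ((Number) value).doubleValue();
            }
        }
        throw new IllegalArgumentException(
            "path [" + stat + "." + field + "] does not resolve to a single numeric value");
    }

    public static void main(String[] args) {
        Map<String, Object> stats = sampleMatrixStats();
        try {
            resolve(stats, "kurtosis");
        } catch (IllegalArgumentException e) {
            System.out.println("statistic only: " + e.getMessage());
        }
        double kurtosis = resolve(stats, "kurtosis", "field");
        System.out.println("statistic + field passes HAVING > 0: " + (kurtosis > 0));
    }
}
```

Whether the fix belongs in how the SQL layer builds `buckets_path` or in how the pipeline aggregation resolves it is not settled here; the sketch only shows why the field name has to be part of the lookup.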

elasticmachine commented 2 years ago

Pinging @elastic/es-ql (Team:QL)

astefan commented 2 years ago

Blocked by https://github.com/elastic/elasticsearch/issues/87454

elasticsearchmachine commented 5 months ago

Pinging @elastic/es-analytical-engine (Team:Analytics)