apache / druid

Apache Druid: a high performance real-time analytics database.
https://druid.apache.org/
Apache License 2.0
13.49k stars 3.7k forks source link

quantiles sketch post aggregator documentation confusing #5935

Closed pdeva closed 4 years ago

pdeva commented 6 years ago

Druid 0.12.1 Following documentation in: http://druid.io/docs/latest/development/extensions-core/datasketches-quantiles.html

Query:

{
  "queryType": "timeseries",
  "dataSource": "pctile",
  "intervals": "2018-07-04T17:23:00.000Z/2018-07-04T17:53:00.000Z",
  "granularity": {
    "type": "period",
    "period": "PT1M",
    "origin": "2018-07-04T17:23:00.000Z"
  },
  "aggregations": [
    {
      "type": "quantilesDoublesSketch",
      "fieldName": "value",
      "name": "value",
      "k": 512
    }
  ],
  "postAggregations": [
    {
      "type": "quantilesDoublesSketchToQuantiles",
      "name": "qtl",
      "field": "value",
      "fractions": [
        0.5,
        0.9,
        0.95,
        0.99
      ]
    }
  ],
  "filter": {
    "type": "and",
    "fields": [
      {
        "type": "selector",
        "dimension": "accountid",
        "value": "xxx"
      }
    ]
  }
}

Returns 500 with body:

{
    "error": "Unknown exception",
    "errorMessage": "Unexpected token (VALUE_STRING), expected FIELD_NAME: missing property 'type' that is to contain type id  (for class io.druid.query.aggregation.PostAggregator)\n at [Source: HttpInputOverHTTP@2af18c94[c=900,q=1,[0]=EOF,s=STREAM]; line: 21, column: 20] (through reference chain: java.util.ArrayList[0])",
    "errorClass": "com.fasterxml.jackson.databind.JsonMappingException",
    "host": null
}

Full exception on broker:

2018-07-04 18:19:27,813 ERROR i.d.s.QueryResource [qtp993208674-124] Exception handling request: {class=io.druid.server.QueryResource, exceptionType=class com.fasterxml.jackson.databind.JsonMappingException, exceptionMessage=Unexpected token (VALUE_STRING), expected FIELD_NAME: missing property 'type' that is to contain type id  (for class io.druid.query.aggregation.PostAggregator)
 at [Source: HttpInputOverHTTP@2af18c94[c=824,q=1,[0]=EOF,s=STREAM]; line: 21, column: 20] (through reference chain: java.util.ArrayList[0]), exception=com.fasterxml.jackson.databind.JsonMappingException: Unexpected token (VALUE_STRING), expected FIELD_NAME: missing property 'type' that is to contain type id  (for class io.druid.query.aggregation.PostAggregator)
 at [Source: HttpInputOverHTTP@2af18c94[c=824,q=1,[0]=EOF,s=STREAM]; line: 21, column: 20] (through reference chain: java.util.ArrayList[0]), query=unparseable query, peer=98.210.189.103}
com.fasterxml.jackson.databind.JsonMappingException: Unexpected token (VALUE_STRING), expected FIELD_NAME: missing property 'type' that is to contain type id  (for class io.druid.query.aggregation.PostAggregator)
 at [Source: HttpInputOverHTTP@2af18c94[c=824,q=1,[0]=EOF,s=STREAM]; line: 21, column: 20] (through reference chain: java.util.ArrayList[0])
    at com.fasterxml.jackson.databind.JsonMappingException.from(JsonMappingException.java:148) ~[jackson-databind-2.4.6.jar:2.4.6]
    at com.fasterxml.jackson.databind.DeserializationContext.wrongTokenException(DeserializationContext.java:854) ~[jackson-databind-2.4.6.jar:2.4.6]
    at com.fasterxml.jackson.databind.jsontype.impl.AsPropertyTypeDeserializer._deserializeTypedUsingDefaultImpl(AsPropertyTypeDeserializer.java:140) ~[jackson-databind-2.4.6.jar:2.4.6]
    at com.fasterxml.jackson.databind.jsontype.impl.AsPropertyTypeDeserializer.deserializeTypedFromObject(AsPropertyTypeDeserializer.java:75) ~[jackson-databind-2.4.6.jar:2.4.6]
    at com.fasterxml.jackson.databind.deser.AbstractDeserializer.deserializeWithType(AbstractDeserializer.java:132) ~[jackson-databind-2.4.6.jar:2.4.6]
    at com.fasterxml.jackson.databind.deser.SettableBeanProperty.deserialize(SettableBeanProperty.java:536) ~[jackson-databind-2.4.6.jar:2.4.6]
    at com.fasterxml.jackson.databind.deser.BeanDeserializer._deserializeUsingPropertyBased(BeanDeserializer.java:344) ~[jackson-databind-2.4.6.jar:2.4.6]
    at com.fasterxml.jackson.databind.deser.BeanDeserializerBase.deserializeFromObjectUsingNonDefault(BeanDeserializerBase.java:1064) ~[jackson-databind-2.4.6.jar:2.4.6]
    at com.fasterxml.jackson.databind.deser.BeanDeserializer.deserializeFromObject(BeanDeserializer.java:264) ~[jackson-databind-2.4.6.jar:2.4.6]
    at com.fasterxml.jackson.databind.deser.BeanDeserializer._deserializeOther(BeanDeserializer.java:156) ~[jackson-databind-2.4.6.jar:2.4.6]
    at com.fasterxml.jackson.databind.deser.BeanDeserializer.deserialize(BeanDeserializer.java:126) ~[jackson-databind-2.4.6.jar:2.4.6]
    at com.fasterxml.jackson.databind.jsontype.impl.AsPropertyTypeDeserializer._deserializeTypedForId(AsPropertyTypeDeserializer.java:113) ~[jackson-databind-2.4.6.jar:2.4.6]
    at com.fasterxml.jackson.databind.jsontype.impl.AsPropertyTypeDeserializer.deserializeTypedFromObject(AsPropertyTypeDeserializer.java:84) ~[jackson-databind-2.4.6.jar:2.4.6]
    at com.fasterxml.jackson.databind.deser.AbstractDeserializer.deserializeWithType(AbstractDeserializer.java:132) ~[jackson-databind-2.4.6.jar:2.4.6]
    at com.fasterxml.jackson.databind.deser.std.CollectionDeserializer.deserialize(CollectionDeserializer.java:234) ~[jackson-databind-2.4.6.jar:2.4.6]
    at com.fasterxml.jackson.databind.deser.std.CollectionDeserializer.deserialize(CollectionDeserializer.java:206) ~[jackson-databind-2.4.6.jar:2.4.6]
    at com.fasterxml.jackson.databind.deser.std.CollectionDeserializer.deserialize(CollectionDeserializer.java:25) ~[jackson-databind-2.4.6.jar:2.4.6]
    at com.fasterxml.jackson.databind.deser.SettableBeanProperty.deserialize(SettableBeanProperty.java:538) ~[jackson-databind-2.4.6.jar:2.4.6]
    at com.fasterxml.jackson.databind.deser.BeanDeserializer._deserializeUsingPropertyBased(BeanDeserializer.java:344) ~[jackson-databind-2.4.6.jar:2.4.6]
    at com.fasterxml.jackson.databind.deser.BeanDeserializerBase.deserializeFromObjectUsingNonDefault(BeanDeserializerBase.java:1064) ~[jackson-databind-2.4.6.jar:2.4.6]
    at com.fasterxml.jackson.databind.deser.BeanDeserializer.deserializeFromObject(BeanDeserializer.java:264) ~[jackson-databind-2.4.6.jar:2.4.6]
    at com.fasterxml.jackson.databind.deser.BeanDeserializer._deserializeOther(BeanDeserializer.java:156) ~[jackson-databind-2.4.6.jar:2.4.6]
    at com.fasterxml.jackson.databind.deser.BeanDeserializer.deserialize(BeanDeserializer.java:126) ~[jackson-databind-2.4.6.jar:2.4.6]
    at com.fasterxml.jackson.databind.jsontype.impl.AsPropertyTypeDeserializer._deserializeTypedForId(AsPropertyTypeDeserializer.java:113) ~[jackson-databind-2.4.6.jar:2.4.6]
    at com.fasterxml.jackson.databind.jsontype.impl.AsPropertyTypeDeserializer.deserializeTypedFromObject(AsPropertyTypeDeserializer.java:84) ~[jackson-databind-2.4.6.jar:2.4.6]
    at com.fasterxml.jackson.databind.deser.AbstractDeserializer.deserializeWithType(AbstractDeserializer.java:132) ~[jackson-databind-2.4.6.jar:2.4.6]
    at com.fasterxml.jackson.databind.deser.impl.TypeWrappedDeserializer.deserialize(TypeWrappedDeserializer.java:41) ~[jackson-databind-2.4.6.jar:2.4.6]
    at com.fasterxml.jackson.databind.ObjectMapper._readMapAndClose(ObjectMapper.java:3066) ~[jackson-databind-2.4.6.jar:2.4.6]
    at com.fasterxml.jackson.databind.ObjectMapper.readValue(ObjectMapper.java:2207) ~[jackson-databind-2.4.6.jar:2.4.6]
    at io.druid.server.QueryResource.readQuery(QueryResource.java:303) ~[druid-server-0.12.1.jar:0.12.1]
    at io.druid.server.QueryResource.doPost(QueryResource.java:169) [druid-server-0.12.1.jar:0.12.1]
    at sun.reflect.GeneratedMethodAccessor43.invoke(Unknown Source) ~[?:?]
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) ~[?:1.8.0_101]
    at java.lang.reflect.Method.invoke(Method.java:498) ~[?:1.8.0_101]
    at com.sun.jersey.spi.container.JavaMethodInvokerFactory$1.invoke(JavaMethodInvokerFactory.java:60) [jersey-server-1.19.3.jar:1.19.3]
    at com.sun.jersey.server.impl.model.method.dispatch.AbstractResourceMethodDispatchProvider$ResponseOutInvoker._dispatch(AbstractResourceMethodDispatchProvider.java:205) [jersey-server-1.19.3.jar:1.19.3]
    at com.sun.jersey.server.impl.model.method.dispatch.ResourceJavaMethodDispatcher.dispatch(ResourceJavaMethodDispatcher.java:75) [jersey-server-1.19.3.jar:1.19.3]
    at com.sun.jersey.server.impl.uri.rules.HttpMethodRule.accept(HttpMethodRule.java:302) [jersey-server-1.19.3.jar:1.19.3]
    at com.sun.jersey.server.impl.uri.rules.ResourceClassRule.accept(ResourceClassRule.java:108) [jersey-server-1.19.3.jar:1.19.3]
    at com.sun.jersey.server.impl.uri.rules.RightHandPathRule.accept(RightHandPathRule.java:147) [jersey-server-1.19.3.jar:1.19.3]
    at com.sun.jersey.server.impl.uri.rules.RootResourceClassesRule.accept(RootResourceClassesRule.java:84) [jersey-server-1.19.3.jar:1.19.3]
    at com.sun.jersey.server.impl.application.WebApplicationImpl._handleRequest(WebApplicationImpl.java:1542) [jersey-server-1.19.3.jar:1.19.3]
    at com.sun.jersey.server.impl.application.WebApplicationImpl._handleRequest(WebApplicationImpl.java:1473) [jersey-server-1.19.3.jar:1.19.3]
    at com.sun.jersey.server.impl.application.WebApplicationImpl.handleRequest(WebApplicationImpl.java:1419) [jersey-server-1.19.3.jar:1.19.3]
    at com.sun.jersey.server.impl.application.WebApplicationImpl.handleRequest(WebApplicationImpl.java:1409) [jersey-server-1.19.3.jar:1.19.3]
    at com.sun.jersey.spi.container.servlet.WebComponent.service(WebComponent.java:409) [jersey-servlet-1.19.3.jar:1.19.3]
    at com.sun.jersey.spi.container.servlet.ServletContainer.service(ServletContainer.java:558) [jersey-servlet-1.19.3.jar:1.19.3]
    at com.sun.jersey.spi.container.servlet.ServletContainer.service(ServletContainer.java:733) [jersey-servlet-1.19.3.jar:1.19.3]
    at javax.servlet.http.HttpServlet.service(HttpServlet.java:790) [javax.servlet-api-3.1.0.jar:3.1.0]
    at com.google.inject.servlet.ServletDefinition.doServiceImpl(ServletDefinition.java:286) [guice-servlet-4.1.0.jar:?]
    at com.google.inject.servlet.ServletDefinition.doService(ServletDefinition.java:276) [guice-servlet-4.1.0.jar:?]
    at com.google.inject.servlet.ServletDefinition.service(ServletDefinition.java:181) [guice-servlet-4.1.0.jar:?]
    at com.google.inject.servlet.ManagedServletPipeline.service(ManagedServletPipeline.java:91) [guice-servlet-4.1.0.jar:?]
    at com.google.inject.servlet.FilterChainInvocation.doFilter(FilterChainInvocation.java:85) [guice-servlet-4.1.0.jar:?]
    at com.google.inject.servlet.ManagedFilterPipeline.dispatch(ManagedFilterPipeline.java:120) [guice-servlet-4.1.0.jar:?]
    at com.google.inject.servlet.GuiceFilter.doFilter(GuiceFilter.java:135) [guice-servlet-4.1.0.jar:?]
    at org.eclipse.jetty.servlet.ServletHandler$CachedChain.doFilter(ServletHandler.java:1759) [jetty-servlet-9.3.19.v20170502.jar:9.3.19.v20170502]
    at io.druid.server.security.PreResponseAuthorizationCheckFilter.doFilter(PreResponseAuthorizationCheckFilter.java:84) [druid-server-0.12.1.jar:0.12.1]
    at org.eclipse.jetty.servlet.ServletHandler$CachedChain.doFilter(ServletHandler.java:1759) [jetty-servlet-9.3.19.v20170502.jar:9.3.19.v20170502]
    at io.druid.server.security.AllowOptionsResourceFilter.doFilter(AllowOptionsResourceFilter.java:76) [druid-server-0.12.1.jar:0.12.1]
    at org.eclipse.jetty.servlet.ServletHandler$CachedChain.doFilter(ServletHandler.java:1759) [jetty-servlet-9.3.19.v20170502.jar:9.3.19.v20170502]
    at io.druid.server.security.AllowAllAuthenticator$1.doFilter(AllowAllAuthenticator.java:85) [druid-server-0.12.1.jar:0.12.1]
    at io.druid.server.security.AuthenticationWrappingFilter.doFilter(AuthenticationWrappingFilter.java:60) [druid-server-0.12.1.jar:0.12.1]
    at org.eclipse.jetty.servlet.ServletHandler$CachedChain.doFilter(ServletHandler.java:1759) [jetty-servlet-9.3.19.v20170502.jar:9.3.19.v20170502]
    at io.druid.server.security.SecuritySanityCheckFilter.doFilter(SecuritySanityCheckFilter.java:86) [druid-server-0.12.1.jar:0.12.1]
    at org.eclipse.jetty.servlet.ServletHandler$CachedChain.doFilter(ServletHandler.java:1759) [jetty-servlet-9.3.19.v20170502.jar:9.3.19.v20170502]
    at org.eclipse.jetty.servlet.ServletHandler.doHandle(ServletHandler.java:582) [jetty-servlet-9.3.19.v20170502.jar:9.3.19.v20170502]
    at org.eclipse.jetty.server.session.SessionHandler.doHandle(SessionHandler.java:224) [jetty-server-9.3.19.v20170502.jar:9.3.19.v20170502]
    at org.eclipse.jetty.server.handler.ContextHandler.doHandle(ContextHandler.java:1180) [jetty-server-9.3.19.v20170502.jar:9.3.19.v20170502]
    at org.eclipse.jetty.servlet.ServletHandler.doScope(ServletHandler.java:512) [jetty-servlet-9.3.19.v20170502.jar:9.3.19.v20170502]
    at org.eclipse.jetty.server.session.SessionHandler.doScope(SessionHandler.java:185) [jetty-server-9.3.19.v20170502.jar:9.3.19.v20170502]
    at org.eclipse.jetty.server.handler.ContextHandler.doScope(ContextHandler.java:1112) [jetty-server-9.3.19.v20170502.jar:9.3.19.v20170502]
    at org.eclipse.jetty.server.handler.ScopedHandler.handle(ScopedHandler.java:141) [jetty-server-9.3.19.v20170502.jar:9.3.19.v20170502]
    at org.eclipse.jetty.server.handler.gzip.GzipHandler.handle(GzipHandler.java:493) [jetty-server-9.3.19.v20170502.jar:9.3.19.v20170502]
    at org.eclipse.jetty.server.handler.HandlerList.handle(HandlerList.java:52) [jetty-server-9.3.19.v20170502.jar:9.3.19.v20170502]
    at org.eclipse.jetty.server.handler.HandlerWrapper.handle(HandlerWrapper.java:134) [jetty-server-9.3.19.v20170502.jar:9.3.19.v20170502]
    at org.eclipse.jetty.server.Server.handle(Server.java:534) [jetty-server-9.3.19.v20170502.jar:9.3.19.v20170502]
    at org.eclipse.jetty.server.HttpChannel.handle(HttpChannel.java:320) [jetty-server-9.3.19.v20170502.jar:9.3.19.v20170502]
    at org.eclipse.jetty.server.HttpConnection.onFillable(HttpConnection.java:251) [jetty-server-9.3.19.v20170502.jar:9.3.19.v20170502]
    at org.eclipse.jetty.io.AbstractConnection$ReadCallback.succeeded(AbstractConnection.java:283) [jetty-io-9.3.19.v20170502.jar:9.3.19.v20170502]
    at org.eclipse.jetty.io.FillInterest.fillable(FillInterest.java:108) [jetty-io-9.3.19.v20170502.jar:9.3.19.v20170502]
    at org.eclipse.jetty.io.SelectChannelEndPoint$2.run(SelectChannelEndPoint.java:93) [jetty-io-9.3.19.v20170502.jar:9.3.19.v20170502]
    at org.eclipse.jetty.util.thread.strategy.ExecuteProduceConsume.executeProduceConsume(ExecuteProduceConsume.java:303) [jetty-util-9.3.19.v20170502.jar:9.3.19.v20170502]
    at org.eclipse.jetty.util.thread.strategy.ExecuteProduceConsume.produceConsume(ExecuteProduceConsume.java:148) [jetty-util-9.3.19.v20170502.jar:9.3.19.v20170502]
    at org.eclipse.jetty.util.thread.strategy.ExecuteProduceConsume.run(ExecuteProduceConsume.java:136) [jetty-util-9.3.19.v20170502.jar:9.3.19.v20170502]
    at org.eclipse.jetty.util.thread.QueuedThreadPool.runJob(QueuedThreadPool.java:671) [jetty-util-9.3.19.v20170502.jar:9.3.19.v20170502]
    at org.eclipse.jetty.util.thread.QueuedThreadPool$2.run(QueuedThreadPool.java:589) [jetty-util-9.3.19.v20170502.jar:9.3.19.v20170502]
    at java.lang.Thread.run(Thread.java:745) [?:1.8.0_101]

The extension is indeed loaded. If i remove the postAgg entry, the query executes (with the quantilesDoublesSketch aggregation).

Is something broken with the post aggregator of quantiles sketch?

What is even more confusing is if i remove the line "field": "value", from the postAgg declaration, I get this error msg in response:

{
    "error": "Unknown exception",
    "errorMessage": "Instantiation of [simple type, class io.druid.query.aggregation.datasketches.quantiles.DoublesSketchToQuantilesPostAggregator] value failed: field is null (through reference chain: java.util.ArrayList[0])",
    "errorClass": "com.fasterxml.jackson.databind.JsonMappingException",
    "host": null
}

So it seems druid does know about the post aggregator.

This is very confusing....

pdeva commented 6 years ago

ahh i notice, that the post agg requires a field type of fieldaccess, not a string. using this works:

 "field": {
        "type": "fieldAccess",
         "fieldName": "value"
      }

That said, I do think the error message can be improved a lot, so i am keeping this open for now

danc commented 5 years ago

Hi, Btw the mentioned documentation is also very confusing, because it mentions the following

  "field"  : <post aggregator that refers to a DoublesSketch (fieldAccess or another post aggregator)>

while it seems, by your example than you refer an aggregator (and not a post aggregator), i.e. "value" in your example. Am I interpreting right ?

It is very confusing to use the same name for the original metric and the aggregator using it, making impossible to read most examples (including the few ones in the docs) without ambiguity.

At this time, I have not been able to use quantilesDoublesSketchToQuantiles to compute medians for example :

"aggregations":[  
 {
         "type" : "quantilesDoublesSketch",
        "name" : "datasketch_duration",
        "fieldName" : "duration",
        "k": 256
        }
]
....
 "postAggregations":[
{
          "type"  : "quantilesDoublesSketchToQuantile",
          "name": "medianDurationDataSketches",
          "field"  :  {     
                     "fieldName":"datasketch_duration",
                     "type":"fieldAccess"
                  },
          "fraction" : 0.5
        }
]
stale[bot] commented 4 years ago

This issue has been marked as stale due to 280 days of inactivity. It will be closed in 4 weeks if no further activity occurs. If this issue is still relevant, please simply write any comment. Even if closed, you can still revive the issue at any time or discuss it on the dev@druid.apache.org list. Thank you for your contributions.

pdeva commented 4 years ago

bump

stale[bot] commented 4 years ago

This issue is no longer marked as stale.

stale[bot] commented 4 years ago

This issue has been marked as stale due to 280 days of inactivity. It will be closed in 4 weeks if no further activity occurs. If this issue is still relevant, please simply write any comment. Even if closed, you can still revive the issue at any time or discuss it on the dev@druid.apache.org list. Thank you for your contributions.

stale[bot] commented 4 years ago

This issue has been closed due to lack of activity. If you think that is incorrect, or the issue requires additional review, you can revive the issue at any time.