apache / druid

Apache Druid: a high performance real-time analytics database.
https://druid.apache.org/
Apache License 2.0
13.43k stars 3.69k forks source link

Intermittent NullPointerExceptions causing Task Failure #9842

Closed marcusgreer closed 1 year ago

marcusgreer commented 4 years ago

Druid throws a Nullpointer exception when processing a GroupBy query

Affected Version

0.17.0

Description

2020-04-15T16:13:02,293 ERROR [processing-0] org.apache.druid.query.groupby.epinephelinae.GroupByMergingQueryRunnerV2 - Exception with one of the sequences!
java.lang.NullPointerException: null
    at org.apache.druid.segment.realtime.FireHydrant.getIncrementedSegment(FireHydrant.java:78) ~[druid-server-0.17.0-confluent-dba5ee2.jar:0.17.0-confluent-dba5ee2]
    at org.apache.druid.segment.realtime.FireHydrant.getAndIncrementSegment(FireHydrant.java:138) ~[druid-server-0.17.0-confluent-dba5ee2.jar:0.17.0-confluent-dba5ee2]
    at org.apache.druid.segment.realtime.appenderator.SinkQuerySegmentWalker.lambda$null$0(SinkQuerySegmentWalker.java:212) ~[druid-server-0.17.0-confluent-dba5ee2.jar:0.17.0-confluent-dba5ee2]
    at com.google.common.collect.Iterators$8.transform(Iterators.java:794) ~[guava-16.0.1.jar:?]
    at com.google.common.collect.TransformedIterator.next(TransformedIterator.java:48) ~[guava-16.0.1.jar:?]
    at org.apache.druid.query.SinkQueryRunners$1.next(SinkQueryRunners.java:56) ~[druid-processing-0.17.0-confluent-dba5ee2.jar:0.17.0-confluent-dba5ee2]
    at org.apache.druid.query.SinkQueryRunners$1.next(SinkQueryRunners.java:46) ~[druid-processing-0.17.0-confluent-dba5ee2.jar:0.17.0-confluent-dba5ee2]
    at com.google.common.collect.Iterators$7.computeNext(Iterators.java:646) ~[guava-16.0.1.jar:?]
    at com.google.common.collect.AbstractIterator.tryToComputeNext(AbstractIterator.java:143) ~[guava-16.0.1.jar:?]
    at com.google.common.collect.AbstractIterator.hasNext(AbstractIterator.java:138) ~[guava-16.0.1.jar:?]
    at com.google.common.collect.TransformedIterator.hasNext(TransformedIterator.java:43) ~[guava-16.0.1.jar:?]
    at com.google.common.collect.Iterators.addAll(Iterators.java:356) ~[guava-16.0.1.jar:?]
    at com.google.common.collect.Lists.newArrayList(Lists.java:147) ~[guava-16.0.1.jar:?]
    at com.google.common.collect.Lists.newArrayList(Lists.java:129) ~[guava-16.0.1.jar:?]
    at org.apache.druid.query.ChainedExecutionQueryRunner$1.make(ChainedExecutionQueryRunner.java:104) ~[druid-processing-0.17.0-confluent-dba5ee2.jar:0.17.0-confluent-dba5ee2]
    at org.apache.druid.java.util.common.guava.BaseSequence.accumulate(BaseSequence.java:39) ~[druid-core-0.17.0-confluent-dba5ee2.jar:0.17.0-confluent-dba5ee2]
    at org.apache.druid.java.util.common.guava.LazySequence.accumulate(LazySequence.java:40) ~[druid-core-0.17.0-confluent-dba5ee2.jar:0.17.0-confluent-dba5ee2]
    at org.apache.druid.java.util.common.guava.WrappingSequence$1.get(WrappingSequence.java:50) ~[druid-core-0.17.0-confluent-dba5ee2.jar:0.17.0-confluent-dba5ee2]
    at org.apache.druid.java.util.common.guava.SequenceWrapper.wrap(SequenceWrapper.java:55) ~[druid-core-0.17.0-confluent-dba5ee2.jar:0.17.0-confluent-dba5ee2]
    at org.apache.druid.java.util.common.guava.WrappingSequence.accumulate(WrappingSequence.java:45) ~[druid-core-0.17.0-confluent-dba5ee2.jar:0.17.0-confluent-dba5ee2]
    at org.apache.druid.java.util.common.guava.LazySequence.accumulate(LazySequence.java:40) ~[druid-core-0.17.0-confluent-dba5ee2.jar:0.17.0-confluent-dba5ee2]
    at org.apache.druid.java.util.common.guava.WrappingSequence$1.get(WrappingSequence.java:50) ~[druid-core-0.17.0-confluent-dba5ee2.jar:0.17.0-confluent-dba5ee2]
    at org.apache.druid.java.util.common.guava.SequenceWrapper.wrap(SequenceWrapper.java:55) ~[druid-core-0.17.0-confluent-dba5ee2.jar:0.17.0-confluent-dba5ee2]
    at org.apache.druid.java.util.common.guava.WrappingSequence.accumulate(WrappingSequence.java:45) ~[druid-core-0.17.0-confluent-dba5ee2.jar:0.17.0-confluent-dba5ee2]
    at org.apache.druid.java.util.common.guava.WrappingSequence$1.get(WrappingSequence.java:50) ~[druid-core-0.17.0-confluent-dba5ee2.jar:0.17.0-confluent-dba5ee2]
    at org.apache.druid.query.CPUTimeMetricQueryRunner$1.wrap(CPUTimeMetricQueryRunner.java:74) ~[druid-processing-0.17.0-confluent-dba5ee2.jar:0.17.0-confluent-dba5ee2]
    at org.apache.druid.java.util.common.guava.WrappingSequence.accumulate(WrappingSequence.java:45) ~[druid-core-0.17.0-confluent-dba5ee2.jar:0.17.0-confluent-dba5ee2]
    at org.apache.druid.query.spec.SpecificSegmentQueryRunner$1.accumulate(SpecificSegmentQueryRunner.java:79) ~[druid-processing-0.17.0-confluent-dba5ee2.jar:0.17.0-confluent-dba5ee2]
    at org.apache.druid.java.util.common.guava.WrappingSequence$1.get(WrappingSequence.java:50) ~[druid-core-0.17.0-confluent-dba5ee2.jar:0.17.0-confluent-dba5ee2]
    at org.apache.druid.query.spec.SpecificSegmentQueryRunner.doNamed(SpecificSegmentQueryRunner.java:163) ~[druid-processing-0.17.0-confluent-dba5ee2.jar:0.17.0-confluent-dba5ee2]
    at org.apache.druid.query.spec.SpecificSegmentQueryRunner.access$100(SpecificSegmentQueryRunner.java:42) ~[druid-processing-0.17.0-confluent-dba5ee2.jar:0.17.0-confluent-dba5ee2]
    at org.apache.druid.query.spec.SpecificSegmentQueryRunner$2.wrap(SpecificSegmentQueryRunner.java:145) ~[druid-processing-0.17.0-confluent-dba5ee2.jar:0.17.0-confluent-dba5ee2]
    at org.apache.druid.java.util.common.guava.WrappingSequence.accumulate(WrappingSequence.java:45) ~[druid-core-0.17.0-confluent-dba5ee2.jar:0.17.0-confluent-dba5ee2]
    at org.apache.druid.query.groupby.epinephelinae.GroupByMergingQueryRunnerV2$1$1$1.call(GroupByMergingQueryRunnerV2.java:246) [druid-processing-0.17.0-confluent-dba5ee2.jar:0.17.0-confluent-dba5ee2]
    at org.apache.druid.query.groupby.epinephelinae.GroupByMergingQueryRunnerV2$1$1$1.call(GroupByMergingQueryRunnerV2.java:233) [druid-processing-0.17.0-confluent-dba5ee2.jar:0.17.0-confluent-dba5ee2]
    at java.util.concurrent.FutureTask.run(FutureTask.java:266) [?:1.8.0_242]
    at org.apache.druid.query.PrioritizedListenableFutureTask.run(PrioritizedExecutorService.java:247) [druid-processing-0.17.0-confluent-dba5ee2.jar:0.17.0-confluent-dba5ee2]
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149) [?:1.8.0_242]
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624) [?:1.8.0_242]
    at java.lang.Thread.run(Thread.java:748) [?:1.8.0_242]
jihoonson commented 4 years ago

Interesting. AFAIT, this shouldn't happen if the task successfully handed off all created segments. Also, this NPE doesn't do anything with the task status. It should be just a query failure. I suspect the task failed to hand off segments for some reason which made the task failed. Do you see anything interesting in the task logs?

marcusgreer commented 4 years ago

@jihoonson I don't see anything else out of the ordinary, but the problem seems to be ongoing. Is there any additional logging that I should turn on? or metrics that we should set up monitors for to drill down into this issue?

jihoonson commented 4 years ago

Hmm, I'm not sure. Can you share the task logs if possible?

marcusgreer commented 4 years ago

Sorry for the delay. Yes, here are the logs surrounding the NPE, I can also pass along more information from the full logs if needed

2020-04-15T16:12:59,438 INFO [qtp930569137-80] org.apache.druid.server.log.LoggingRequestLogger - 2020-04-15T16:12:59.412Z  100.96.188.16   {"queryType":"groupBy","dataSource":{"type":"table","name":"telemetry_prod"},"intervals":{"type":"segments","segments":[{"itvl":"2020-04-15T16:10:58.000Z/2020-04-15T16:11:58.000Z","ver":"2020-04-15T16:00:00.074Z","part":37},{"itvl":"2020-04-15T16:10:58.000Z/2020-04-15T16:11:58.000Z","ver":"2020-04-15T16:00:00.074Z","part":76}]},"virtualColumns":[],"filter":{"type":"and","fields":[{"type":"selector","dimension":"name","value":"io.confluent.kafka.server/tenant/partition_bytes_out_total/delta","extractionFn":null},{"type":"selector","dimension":"tenant","value":"lkc-****","extractionFn":null}]},"granularity":"MINUTE","dimensions":[{"type":"default","dimension":"foobar_topic","outputName":"metric.label.topic","outputType":"STRING"},{"type":"default","dimension":"tenant","outputName":"metric.label.cluster_id","outputType":"STRING"}],"aggregations":[{"type":"doubleSum","name":"value","fieldName":"value_sum","expression":null}],"postAggregations":[],"having":null,"limitSpec":{"type":"default","columns":[{"dimension":"metric.label.topic","direction":"ascending","dimensionOrder":{"type":"lexicographic"}},{"dimension":"metric.label.cluster_id","direction":"ascending","dimensionOrder":{"type":"lexicographic"}}],"limit":1000},"context":{"applyLimitPushDown":true,"defaultTimeout":300000,"finalize":false,"groupByOutermost":false,"groupByStrategy":"v2","maxQueuedBytes":158730,"maxScatterGatherBytes":9223372036854775807,"queryFailTime":1586967239334,"queryId":"982dc814-53ce-485f-831c-04646d649017","resultAsArray":true,"sortByDimsFirst":true,"timeout":59996},"descending":false} {"query/time":25,"query/bytes":1265,"success":true,"identity":"druid_system"}
2020-04-15T16:12:59,496 INFO [qtp930569137-72] org.apache.druid.server.log.LoggingRequestLogger - 2020-04-15T16:12:59.489Z  100.97.92.6 {"queryType":"groupBy","dataSource":{"type":"table","name":"telemetry_prod"},"intervals":{"type":"segments","segments":[{"itvl":"2020-04-15T16:10:57.000Z/2020-04-15T16:11:57.000Z","ver":"2020-04-15T16:00:00.074Z","part":37},{"itvl":"2020-04-15T16:10:57.000Z/2020-04-15T16:11:57.000Z","ver":"2020-04-15T16:00:00.074Z","part":76}]},"virtualColumns":[],"filter":{"type":"and","fields":[{"type":"selector","dimension":"name","value":"io.confluent.kafka.server/tenant/partition_bytes_out_total/delta","extractionFn":null},{"type":"selector","dimension":"tenant","value":"lkc-****","extractionFn":null}]},"granularity":"MINUTE","dimensions":[{"type":"default","dimension":"foobar_topic","outputName":"metric.label.topic","outputType":"STRING"}],"aggregations":[{"type":"doubleSum","name":"value","fieldName":"value_sum","expression":null}],"postAggregations":[],"having":null,"limitSpec":{"type":"default","columns":[{"dimension":"metric.label.topic","direction":"ascending","dimensionOrder":{"type":"lexicographic"}}],"limit":1000},"context":{"applyLimitPushDown":true,"defaultTimeout":300000,"finalize":false,"groupByOutermost":false,"groupByStrategy":"v2","maxQueuedBytes":149253,"maxScatterGatherBytes":9223372036854775807,"queryFailTime":1586967239439,"queryId":"9d5703d3-1b3c-4d26-88e0-f4d4fc2261b8","resultAsArray":true,"sortByDimsFirst":true,"timeout":59996},"descending":false}    {"query/time":6,"query/bytes":6,"success":true,"identity":"druid_system"}
2020-04-15T16:12:59,848 INFO [qtp930569137-80] org.apache.druid.server.log.LoggingRequestLogger - 2020-04-15T16:12:59.841Z  100.96.188.16   {"queryType":"groupBy","dataSource":{"type":"table","name":"telemetry_prod"},"intervals":{"type":"segments","segments":[{"itvl":"2020-04-15T16:10:59.000Z/2020-04-15T16:11:59.000Z","ver":"2020-04-15T16:00:00.074Z","part":37},{"itvl":"2020-04-15T16:10:59.000Z/2020-04-15T16:11:59.000Z","ver":"2020-04-15T16:00:00.074Z","part":76}]},"virtualColumns":[],"filter":{"type":"and","fields":[{"type":"selector","dimension":"name","value":"io.confluent.kafka.server/tenant/partition_bytes_in_total/delta","extractionFn":null},{"type":"selector","dimension":"tenant","value":"lkc-****","extractionFn":null}]},"granularity":"MINUTE","dimensions":[{"type":"default","dimension":"foobar_topic","outputName":"metric.label.topic","outputType":"STRING"},{"type":"default","dimension":"tenant","outputName":"metric.label.cluster_id","outputType":"STRING"}],"aggregations":[{"type":"doubleSum","name":"value","fieldName":"value_sum","expression":null}],"postAggregations":[],"having":null,"limitSpec":{"type":"default","columns":[{"dimension":"metric.label.topic","direction":"ascending","dimensionOrder":{"type":"lexicographic"}},{"dimension":"metric.label.cluster_id","direction":"ascending","dimensionOrder":{"type":"lexicographic"}}],"limit":1000},"context":{"applyLimitPushDown":true,"defaultTimeout":300000,"finalize":false,"groupByOutermost":false,"groupByStrategy":"v2","maxQueuedBytes":147058,"maxScatterGatherBytes":9223372036854775807,"queryFailTime":1586967239787,"queryId":"be8ed9fc-7d26-404c-adee-ff3610c68b53","resultAsArray":true,"sortByDimsFirst":true,"timeout":59996},"descending":false}  {"query/time":6,"query/bytes":48,"success":true,"identity":"druid_system"}
2020-04-15T16:13:00,044 INFO [qtp930569137-72] org.apache.druid.server.log.LoggingRequestLogger - 2020-04-15T16:13:00.033Z  100.97.92.6 {"queryType":"groupBy","dataSource":{"type":"table","name":"telemetry_prod"},"intervals":{"type":"segments","segments":[{"itvl":"2020-04-15T16:10:59.000Z/2020-04-15T16:11:59.000Z","ver":"2020-04-15T16:00:00.074Z","part":37},{"itvl":"2020-04-15T16:10:59.000Z/2020-04-15T16:11:59.000Z","ver":"2020-04-15T16:00:00.074Z","part":76}]},"virtualColumns":[],"filter":{"type":"and","fields":[{"type":"selector","dimension":"name","value":"io.confluent.kafka.server/tenant/partition_bytes_out_total/delta","extractionFn":null},{"type":"selector","dimension":"tenant","value":"lkc-****","extractionFn":null}]},"granularity":"MINUTE","dimensions":[{"type":"default","dimension":"foobar_topic","outputName":"metric.label.topic","outputType":"STRING"},{"type":"default","dimension":"tenant","outputName":"metric.label.cluster_id","outputType":"STRING"}],"aggregations":[{"type":"doubleSum","name":"value","fieldName":"value_sum","expression":null}],"postAggregations":[],"having":null,"limitSpec":{"type":"default","columns":[{"dimension":"metric.label.topic","direction":"ascending","dimensionOrder":{"type":"lexicographic"}},{"dimension":"metric.label.cluster_id","direction":"ascending","dimensionOrder":{"type":"lexicographic"}}],"limit":1000},"context":{"applyLimitPushDown":true,"defaultTimeout":300000,"finalize":false,"groupByOutermost":false,"groupByStrategy":"v2","maxQueuedBytes":158730,"maxScatterGatherBytes":9223372036854775807,"queryFailTime":1586967239967,"queryId":"bf716a2c-b2f6-4a6f-b866-dbc09d67cecb","resultAsArray":true,"sortByDimsFirst":true,"timeout":59997},"descending":false} {"query/time":10,"query/bytes":6,"success":true,"identity":"druid_system"}
2020-04-15T16:13:00,071 INFO [qtp930569137-80] org.apache.druid.server.log.LoggingRequestLogger - 2020-04-15T16:13:00.070Z  100.96.188.16   {"queryType":"groupBy","dataSource":{"type":"table","name":"telemetry_prod"},"intervals":{"type":"segments","segments":[{"itvl":"2020-04-15T16:10:59.000Z/2020-04-15T16:11:59.000Z","ver":"2020-04-15T16:00:00.074Z","part":37},{"itvl":"2020-04-15T16:10:59.000Z/2020-04-15T16:11:59.000Z","ver":"2020-04-15T16:00:00.074Z","part":76}]},"virtualColumns":[],"filter":{"type":"and","fields":[{"type":"selector","dimension":"name","value":"io.confluent.kafka.server/tenant/active_authenticated_connection_count","extractionFn":null},{"type":"selector","dimension":"tenant","value":"lkc-****","extractionFn":null}]},"granularity":"MINUTE","dimensions":[{"type":"default","dimension":"tenant","outputName":"metric.label.cluster_id","outputType":"STRING"},{"type":"default","dimension":"broker_id","outputName":"_broker_id","outputType":"STRING"},{"type":"default","dimension":"user","outputName":"_user","outputType":"STRING"}],"aggregations":[{"type":"doubleSum","name":"_sum","fieldName":"value_sum","expression":null},{"type":"doubleSum","name":"_count","fieldName":"count","expression":null}],"postAggregations":[{"type":"arithmetic","name":"value_mean","fn":"/","fields":[{"type":"fieldAccess","name":null,"fieldName":"_sum"},{"type":"fieldAccess","name":null,"fieldName":"_count"}],"ordering":null}],"having":null,"limitSpec":{"type":"NoopLimitSpec"},"context":{"applyLimitPushDown":false,"defaultTimeout":300000,"finalize":false,"groupByOutermost":false,"groupByStrategy":"v2","maxQueuedBytes":151515,"maxScatterGatherBytes":9223372036854775807,"queryFailTime":1586967240006,"queryId":"d6f55df1-508d-4776-8a44-5e1339f30d14","resultAsArray":true,"sortByDimsFirst":false,"sortResults":false,"timeout":59996},"descending":false}    {"query/time":1,"query/bytes":56,"success":true,"identity":"druid_system"}
2020-04-15T16:13:00,116 INFO [qtp930569137-116] org.apache.druid.server.log.LoggingRequestLogger - 2020-04-15T16:13:00.112Z 100.97.92.6 {"queryType":"groupBy","dataSource":{"type":"table","name":"telemetry_prod"},"intervals":{"type":"segments","segments":[{"itvl":"2020-04-15T16:10:57.000Z/2020-04-15T16:11:57.000Z","ver":"2020-04-15T16:00:00.074Z","part":37},{"itvl":"2020-04-15T16:10:57.000Z/2020-04-15T16:11:57.000Z","ver":"2020-04-15T16:00:00.074Z","part":76}]},"virtualColumns":[],"filter":{"type":"and","fields":[{"type":"selector","dimension":"name","value":"io.confluent.kafka.server/log/size","extractionFn":null},{"type":"selector","dimension":"tenant","value":"lkc-****","extractionFn":null}]},"granularity":"MINUTE","dimensions":[{"type":"default","dimension":"tenant","outputName":"metric.label.cluster_id","outputType":"STRING"},{"type":"default","dimension":"foobar_topic","outputName":"metric.label.topic","outputType":"STRING"},{"type":"default","dimension":"partition","outputName":"_partition","outputType":"STRING"}],"aggregations":[{"type":"doubleMax","name":"value_max","fieldName":"value_max","expression":null}],"postAggregations":[],"having":null,"limitSpec":{"type":"NoopLimitSpec"},"context":{"applyLimitPushDown":false,"defaultTimeout":300000,"finalize":false,"groupByOutermost":false,"groupByStrategy":"v2","maxQueuedBytes":147058,"maxScatterGatherBytes":9223372036854775807,"queryFailTime":1586967240059,"queryId":"e1abd63e-b6a4-477a-9d78-18e2fb9b28af","resultAsArray":true,"sortByDimsFirst":false,"sortResults":false,"timeout":59996},"descending":false}    {"query/time":3,"query/bytes":3953,"success":true,"identity":"druid_system"}
2020-04-15T16:13:00,987 INFO [[index_kafka_telemetry_prod_61587c805b44382_blimaiia]-appenderator-persist] org.apache.druid.segment.realtime.appenderator.AppenderatorImpl - Flushed in-memory data for segment[telemetry_prod_2020-04-15T16:00:00.000Z_2020-04-15T16:15:00.000Z_2020-04-15T16:00:00.074Z_76] spill[2] to disk in [10,103] ms (549,899 rows).
2020-04-15T16:13:01,072 INFO [qtp930569137-72] org.apache.druid.server.log.LoggingRequestLogger - 2020-04-15T16:13:01.037Z  100.96.188.16   {"queryType":"groupBy","dataSource":{"type":"table","name":"telemetry_prod"},"intervals":{"type":"segments","segments":[{"itvl":"2020-04-15T16:03:00.000Z/2020-04-15T16:13:59.999Z","ver":"2020-04-15T16:00:00.074Z","part":37},{"itvl":"2020-04-15T16:03:00.000Z/2020-04-15T16:13:59.999Z","ver":"2020-04-15T16:00:00.074Z","part":76}]},"virtualColumns":[],"filter":{"type":"and","fields":[{"type":"selector","dimension":"tenant","value":"lkc-****","extractionFn":null},{"type":"selector","dimension":"name","value":"io.confluent.kafka.server/tenant/partition_bytes_out_total/delta","extractionFn":null}]},"granularity":{"type":"all"},"dimensions":[{"type":"default","dimension":"foobar_topic","outputName":"topic","outputType":"STRING"},{"type":"default","dimension":"partition","outputName":"partition","outputType":"STRING"},{"type":"default","dimension":"client_id","outputName":"client_id","outputType":"STRING"}],"aggregations":[{"type":"doubleSum","name":"value","fieldName":"value_sum","expression":null}],"postAggregations":[],"having":null,"limitSpec":{"type":"default","columns":[{"dimension":"value","direction":"descending","dimensionOrder":{"type":"numeric"}}],"limit":24000},"context":{"applyLimitPushDown":false,"defaultTimeout":300000,"finalize":false,"fudgeTimestamp":"1586966580000","groupByOutermost":false,"groupByStrategy":"v2","maxQueuedBytes":144927,"maxScatterGatherBytes":9223372036854775807,"queryFailTime":1586967240990,"queryId":"e887aa30-1c57-43fe-9380-9db5780fc1f6","resultAsArray":true,"sortByDimsFirst":true,"timeout":59996},"descending":false}  {"query/time":34,"query/bytes":163,"success":true,"identity":"druid_system"}
2020-04-15T16:13:01,200 INFO [qtp930569137-116] org.apache.druid.server.log.LoggingRequestLogger - 2020-04-15T16:13:01.173Z 100.96.188.16   {"queryType":"groupBy","dataSource":{"type":"table","name":"telemetry_prod"},"intervals":{"type":"segments","segments":[{"itvl":"2020-04-15T16:03:00.000Z/2020-04-15T16:13:59.999Z","ver":"2020-04-15T16:00:00.074Z","part":37},{"itvl":"2020-04-15T16:03:00.000Z/2020-04-15T16:13:59.999Z","ver":"2020-04-15T16:00:00.074Z","part":76}]},"virtualColumns":[],"filter":{"type":"and","fields":[{"type":"selector","dimension":"tenant","value":"lkc-****","extractionFn":null},{"type":"selector","dimension":"name","value":"io.confluent.kafka.server/tenant/partition_bytes_out_total/delta","extractionFn":null}]},"granularity":{"type":"all"},"dimensions":[{"type":"default","dimension":"foobar_topic","outputName":"topic","outputType":"STRING"},{"type":"default","dimension":"client_id","outputName":"client_id","outputType":"STRING"}],"aggregations":[{"type":"doubleSum","name":"value","fieldName":"value_sum","expression":null}],"postAggregations":[],"having":null,"limitSpec":{"type":"default","columns":[{"dimension":"value","direction":"descending","dimensionOrder":{"type":"numeric"}}],"limit":24000},"context":{"applyLimitPushDown":false,"defaultTimeout":300000,"finalize":false,"fudgeTimestamp":"1586966580000","groupByOutermost":false,"groupByStrategy":"v2","maxQueuedBytes":144927,"maxScatterGatherBytes":9223372036854775807,"queryFailTime":1586967241070,"queryId":"914b8122-e63c-4eaa-8989-0728df4fb61b","resultAsArray":true,"sortByDimsFirst":true,"timeout":59995},"descending":false}    {"query/time":27,"query/bytes":89,"success":true,"identity":"druid_system"}
2020-04-15T16:13:01,271 INFO [qtp930569137-72] org.apache.druid.server.log.LoggingRequestLogger - 2020-04-15T16:13:01.202Z  100.97.92.6 {"queryType":"groupBy","dataSource":{"type":"table","name":"telemetry_prod"},"intervals":{"type":"segments","segments":[{"itvl":"2020-04-15T16:03:00.000Z/2020-04-15T16:13:59.999Z","ver":"2020-04-15T16:00:00.074Z","part":37},{"itvl":"2020-04-15T16:03:00.000Z/2020-04-15T16:13:59.999Z","ver":"2020-04-15T16:00:00.074Z","part":76}]},"virtualColumns":[],"filter":{"type":"and","fields":[{"type":"selector","dimension":"tenant","value":"lkc-****","extractionFn":null},{"type":"selector","dimension":"name","value":"io.confluent.kafka.server/tenant/partition_bytes_in_total/delta","extractionFn":null}]},"granularity":{"type":"all"},"dimensions":[{"type":"default","dimension":"foobar_topic","outputName":"topic","outputType":"STRING"},{"type":"default","dimension":"partition","outputName":"partition","outputType":"STRING"},{"type":"default","dimension":"client_id","outputName":"client_id","outputType":"STRING"}],"aggregations":[{"type":"doubleSum","name":"value","fieldName":"value_sum","expression":null}],"postAggregations":[],"having":null,"limitSpec":{"type":"default","columns":[{"dimension":"value","direction":"descending","dimensionOrder":{"type":"numeric"}}],"limit":24000},"context":{"applyLimitPushDown":false,"defaultTimeout":300000,"finalize":false,"fudgeTimestamp":"1586966580000","groupByOutermost":false,"groupByStrategy":"v2","maxQueuedBytes":147058,"maxScatterGatherBytes":9223372036854775807,"queryFailTime":1586967241163,"queryId":"f4da5557-3beb-4059-80ec-d8302cfb9e96","resultAsArray":true,"sortByDimsFirst":true,"timeout":59996},"descending":false}   {"query/time":68,"query/bytes":6,"success":true,"identity":"druid_system"}
2020-04-15T16:13:02,045 INFO [coordinator_handoff_scheduled_0] org.apache.druid.segment.realtime.plumber.CoordinatorBasedSegmentHandoffNotifier - Segment Handoff complete for dataSource[telemetry_prod] Segment[SegmentDescriptor{interval=2020-04-15T16:00:00.000Z/2020-04-15T16:15:00.000Z, version='2020-04-15T16:00:00.074Z', partitionNumber=37}]
2020-04-15T16:13:02,150 INFO [qtp930569137-80] org.apache.druid.server.log.LoggingRequestLogger - 2020-04-15T16:13:02.095Z  100.97.92.6 {"queryType":"groupBy","dataSource":{"type":"table","name":"telemetry_prod"},"intervals":{"type":"segments","segments":[{"itvl":"2020-04-15T16:11:01.000Z/2020-04-15T16:12:01.000Z","ver":"2020-04-15T16:00:00.074Z","part":37},{"itvl":"2020-04-15T16:11:01.000Z/2020-04-15T16:12:01.000Z","ver":"2020-04-15T16:00:00.074Z","part":76}]},"virtualColumns":[],"filter":{"type":"and","fields":[{"type":"selector","dimension":"name","value":"io.confluent.kafka.server/tenant/partition_bytes_in_total/delta","extractionFn":null},{"type":"selector","dimension":"tenant","value":"lkc-****","extractionFn":null}]},"granularity":"MINUTE","dimensions":[{"type":"default","dimension":"foobar_topic","outputName":"metric.label.topic","outputType":"STRING"}],"aggregations":[{"type":"doubleSum","name":"value","fieldName":"value_sum","expression":null}],"postAggregations":[],"having":null,"limitSpec":{"type":"default","columns":[{"dimension":"metric.label.topic","direction":"ascending","dimensionOrder":{"type":"lexicographic"}}],"limit":1000},"context":{"applyLimitPushDown":true,"defaultTimeout":300000,"finalize":false,"groupByOutermost":false,"groupByStrategy":"v2","maxQueuedBytes":140845,"maxScatterGatherBytes":9223372036854775807,"queryFailTime":1586967241372,"queryId":"e19fe2bb-e505-41bb-989c-0d0e405c9da4","resultAsArray":true,"sortByDimsFirst":true,"timeout":59996},"descending":false} {"query/time":55,"query/bytes":773,"success":true,"identity":"druid_system"}
2020-04-15T16:13:02,185 INFO [[index_kafka_telemetry_prod_61587c805b44382_blimaiia]-appenderator-persist] org.apache.druid.segment.realtime.appenderator.AppenderatorImpl - Flushed in-memory data with commit metadata [AppenderatorDriverMetadata{segments={index_kafka_telemetry_prod_61587c805b44382_7=[SegmentWithState{segmentIdentifier=telemetry_prod_2020-04-15T16:00:00.000Z_2020-04-15T16:15:00.000Z_2020-04-15T16:00:00.074Z_76, state=APPENDING}]}, lastSegmentIds={index_kafka_telemetry_prod_61587c805b44382_7=telemetry_prod_2020-04-15T16:00:00.000Z_2020-04-15T16:15:00.000Z_2020-04-15T16:00:00.074Z_76}, callerMetadata={nextPartitions=SeekableStreamEndSequenceNumbers{stream='_confluent-telemetry-metrics', partitionSequenceNumberMap={59=25582939725, 23=60664169986}}}}] for segments: telemetry_prod_2020-04-15T16:00:00.000Z_2020-04-15T16:15:00.000Z_2020-04-15T16:00:00.074Z_76
2020-04-15T16:13:02,193 INFO [[index_kafka_telemetry_prod_61587c805b44382_blimaiia]-appenderator-persist] org.apache.druid.server.coordination.BatchDataSegmentAnnouncer - Unannouncing segment[telemetry_prod_2020-04-15T16:00:00.000Z_2020-04-15T16:15:00.000Z_2020-04-15T16:00:00.074Z_37] at path[/druid/prod/segments/druid-prod-middle-manager-33.druid-prod-middle-manager.druid-prod.svc.cluster.local:8100/druid-prod-middle-manager-33.druid-prod-middle-manager.druid-prod.svc.cluster.local:8100_indexer-executor__default_tier_2020-04-15T15:19:02.716Z_546e2e822ed54e71a59e7e8f917c86f90]
2020-04-15T16:13:02,193 INFO [qtp930569137-72] org.apache.druid.server.log.LoggingRequestLogger - 2020-04-15T16:13:02.112Z  100.97.92.6 {"queryType":"groupBy","dataSource":{"type":"table","name":"telemetry_prod"},"intervals":{"type":"segments","segments":[{"itvl":"2020-04-15T16:11:01.000Z/2020-04-15T16:12:01.000Z","ver":"2020-04-15T16:00:00.074Z","part":37},{"itvl":"2020-04-15T16:11:01.000Z/2020-04-15T16:12:01.000Z","ver":"2020-04-15T16:00:00.074Z","part":76}]},"virtualColumns":[],"filter":{"type":"and","fields":[{"type":"selector","dimension":"name","value":"io.confluent.kafka.server/tenant/partition_bytes_in_total/delta","extractionFn":null},{"type":"selector","dimension":"tenant","value":"lkc-****","extractionFn":null}]},"granularity":"MINUTE","dimensions":[{"type":"default","dimension":"foobar_topic","outputName":"metric.label.topic","outputType":"STRING"},{"type":"default","dimension":"tenant","outputName":"metric.label.cluster_id","outputType":"STRING"}],"aggregations":[{"type":"doubleSum","name":"value","fieldName":"value_sum","expression":null}],"postAggregations":[],"having":null,"limitSpec":{"type":"default","columns":[{"dimension":"metric.label.topic","direction":"ascending","dimensionOrder":{"type":"lexicographic"}},{"dimension":"metric.label.cluster_id","direction":"ascending","dimensionOrder":{"type":"lexicographic"}}],"limit":1000},"context":{"applyLimitPushDown":true,"defaultTimeout":300000,"finalize":false,"groupByOutermost":false,"groupByStrategy":"v2","maxQueuedBytes":138888,"maxScatterGatherBytes":9223372036854775807,"queryFailTime":1586967241888,"queryId":"86ac3569-fe25-444b-9cd1-77b37c9525fc","resultAsArray":true,"sortByDimsFirst":true,"timeout":59996},"descending":false}  {"query/time":80,"query/bytes":181,"success":true,"identity":"druid_system"}
2020-04-15T16:13:02,273 INFO [[index_kafka_telemetry_prod_61587c805b44382_blimaiia]-appenderator-persist] org.apache.druid.segment.realtime.appenderator.AppenderatorImpl - Dropped segment[telemetry_prod_2020-04-15T16:00:00.000Z_2020-04-15T16:15:00.000Z_2020-04-15T16:00:00.074Z_37].
2020-04-15T16:13:02,290 INFO [qtp930569137-116] org.apache.druid.server.log.LoggingRequestLogger - 2020-04-15T16:13:02.146Z 100.96.188.16   {"queryType":"groupBy","dataSource":{"type":"table","name":"telemetry_prod"},"intervals":{"type":"segments","segments":[{"itvl":"2020-04-15T16:11:01.000Z/2020-04-15T16:12:01.000Z","ver":"2020-04-15T16:00:00.074Z","part":37},{"itvl":"2020-04-15T16:11:01.000Z/2020-04-15T16:12:01.000Z","ver":"2020-04-15T16:00:00.074Z","part":76}]},"virtualColumns":[],"filter":{"type":"and","fields":[{"type":"selector","dimension":"name","value":"io.confluent.kafka.server/tenant/partition_bytes_out_total/delta","extractionFn":null},{"type":"selector","dimension":"tenant","value":"lkc-****","extractionFn":null}]},"granularity":"MINUTE","dimensions":[{"type":"default","dimension":"foobar_topic","outputName":"metric.label.topic","outputType":"STRING"},{"type":"default","dimension":"tenant","outputName":"metric.label.cluster_id","outputType":"STRING"}],"aggregations":[{"type":"doubleSum","name":"value","fieldName":"value_sum","expression":null}],"postAggregations":[],"having":null,"limitSpec":{"type":"default","columns":[{"dimension":"metric.label.topic","direction":"ascending","dimensionOrder":{"type":"lexicographic"}},{"dimension":"metric.label.cluster_id","direction":"ascending","dimensionOrder":{"type":"lexicographic"}}],"limit":1000},"context":{"applyLimitPushDown":true,"defaultTimeout":300000,"finalize":false,"groupByOutermost":false,"groupByStrategy":"v2","maxQueuedBytes":138888,"maxScatterGatherBytes":9223372036854775807,"queryFailTime":1586967241791,"queryId":"6c3f1d48-a1a5-4f55-87ab-f570b1454259","resultAsArray":true,"sortByDimsFirst":true,"timeout":59996},"descending":false} {"query/time":144,"query/bytes":66,"success":true,"identity":"druid_system"}
2020-04-15T16:13:02,293 ERROR [processing-0] org.apache.druid.query.groupby.epinephelinae.GroupByMergingQueryRunnerV2 - Exception with one of the sequences!
java.lang.NullPointerException: null
    at org.apache.druid.segment.realtime.FireHydrant.getIncrementedSegment(FireHydrant.java:78) ~[druid-server-0.17.0-confluent-dba5ee2.jar:0.17.0-confluent-dba5ee2]
    at org.apache.druid.segment.realtime.FireHydrant.getAndIncrementSegment(FireHydrant.java:138) ~[druid-server-0.17.0-confluent-dba5ee2.jar:0.17.0-confluent-dba5ee2]
    at org.apache.druid.segment.realtime.appenderator.SinkQuerySegmentWalker.lambda$null$0(SinkQuerySegmentWalker.java:212) ~[druid-server-0.17.0-confluent-dba5ee2.jar:0.17.0-confluent-dba5ee2]
    at com.google.common.collect.Iterators$8.transform(Iterators.java:794) ~[guava-16.0.1.jar:?]
    at com.google.common.collect.TransformedIterator.next(TransformedIterator.java:48) ~[guava-16.0.1.jar:?]
    at org.apache.druid.query.SinkQueryRunners$1.next(SinkQueryRunners.java:56) ~[druid-processing-0.17.0-confluent-dba5ee2.jar:0.17.0-confluent-dba5ee2]
    at org.apache.druid.query.SinkQueryRunners$1.next(SinkQueryRunners.java:46) ~[druid-processing-0.17.0-confluent-dba5ee2.jar:0.17.0-confluent-dba5ee2]
    at com.google.common.collect.Iterators$7.computeNext(Iterators.java:646) ~[guava-16.0.1.jar:?]
    at com.google.common.collect.AbstractIterator.tryToComputeNext(AbstractIterator.java:143) ~[guava-16.0.1.jar:?]
    at com.google.common.collect.AbstractIterator.hasNext(AbstractIterator.java:138) ~[guava-16.0.1.jar:?]
    at com.google.common.collect.TransformedIterator.hasNext(TransformedIterator.java:43) ~[guava-16.0.1.jar:?]
    at com.google.common.collect.Iterators.addAll(Iterators.java:356) ~[guava-16.0.1.jar:?]
    at com.google.common.collect.Lists.newArrayList(Lists.java:147) ~[guava-16.0.1.jar:?]
    at com.google.common.collect.Lists.newArrayList(Lists.java:129) ~[guava-16.0.1.jar:?]
    at org.apache.druid.query.ChainedExecutionQueryRunner$1.make(ChainedExecutionQueryRunner.java:104) ~[druid-processing-0.17.0-confluent-dba5ee2.jar:0.17.0-confluent-dba5ee2]
    at org.apache.druid.java.util.common.guava.BaseSequence.accumulate(BaseSequence.java:39) ~[druid-core-0.17.0-confluent-dba5ee2.jar:0.17.0-confluent-dba5ee2]
    at org.apache.druid.java.util.common.guava.LazySequence.accumulate(LazySequence.java:40) ~[druid-core-0.17.0-confluent-dba5ee2.jar:0.17.0-confluent-dba5ee2]
    at org.apache.druid.java.util.common.guava.WrappingSequence$1.get(WrappingSequence.java:50) ~[druid-core-0.17.0-confluent-dba5ee2.jar:0.17.0-confluent-dba5ee2]
    at org.apache.druid.java.util.common.guava.SequenceWrapper.wrap(SequenceWrapper.java:55) ~[druid-core-0.17.0-confluent-dba5ee2.jar:0.17.0-confluent-dba5ee2]
    at org.apache.druid.java.util.common.guava.WrappingSequence.accumulate(WrappingSequence.java:45) ~[druid-core-0.17.0-confluent-dba5ee2.jar:0.17.0-confluent-dba5ee2]
    at org.apache.druid.java.util.common.guava.LazySequence.accumulate(LazySequence.java:40) ~[druid-core-0.17.0-confluent-dba5ee2.jar:0.17.0-confluent-dba5ee2]
    at org.apache.druid.java.util.common.guava.WrappingSequence$1.get(WrappingSequence.java:50) ~[druid-core-0.17.0-confluent-dba5ee2.jar:0.17.0-confluent-dba5ee2]
    at org.apache.druid.java.util.common.guava.SequenceWrapper.wrap(SequenceWrapper.java:55) ~[druid-core-0.17.0-confluent-dba5ee2.jar:0.17.0-confluent-dba5ee2]
    at org.apache.druid.java.util.common.guava.WrappingSequence.accumulate(WrappingSequence.java:45) ~[druid-core-0.17.0-confluent-dba5ee2.jar:0.17.0-confluent-dba5ee2]
    at org.apache.druid.java.util.common.guava.WrappingSequence$1.get(WrappingSequence.java:50) ~[druid-core-0.17.0-confluent-dba5ee2.jar:0.17.0-confluent-dba5ee2]
    at org.apache.druid.query.CPUTimeMetricQueryRunner$1.wrap(CPUTimeMetricQueryRunner.java:74) ~[druid-processing-0.17.0-confluent-dba5ee2.jar:0.17.0-confluent-dba5ee2]
    at org.apache.druid.java.util.common.guava.WrappingSequence.accumulate(WrappingSequence.java:45) ~[druid-core-0.17.0-confluent-dba5ee2.jar:0.17.0-confluent-dba5ee2]
    at org.apache.druid.query.spec.SpecificSegmentQueryRunner$1.accumulate(SpecificSegmentQueryRunner.java:79) ~[druid-processing-0.17.0-confluent-dba5ee2.jar:0.17.0-confluent-dba5ee2]
    at org.apache.druid.java.util.common.guava.WrappingSequence$1.get(WrappingSequence.java:50) ~[druid-core-0.17.0-confluent-dba5ee2.jar:0.17.0-confluent-dba5ee2]
    at org.apache.druid.query.spec.SpecificSegmentQueryRunner.doNamed(SpecificSegmentQueryRunner.java:163) ~[druid-processing-0.17.0-confluent-dba5ee2.jar:0.17.0-confluent-dba5ee2]
    at org.apache.druid.query.spec.SpecificSegmentQueryRunner.access$100(SpecificSegmentQueryRunner.java:42) ~[druid-processing-0.17.0-confluent-dba5ee2.jar:0.17.0-confluent-dba5ee2]
    at org.apache.druid.query.spec.SpecificSegmentQueryRunner$2.wrap(SpecificSegmentQueryRunner.java:145) ~[druid-processing-0.17.0-confluent-dba5ee2.jar:0.17.0-confluent-dba5ee2]
    at org.apache.druid.java.util.common.guava.WrappingSequence.accumulate(WrappingSequence.java:45) ~[druid-core-0.17.0-confluent-dba5ee2.jar:0.17.0-confluent-dba5ee2]
    at org.apache.druid.query.groupby.epinephelinae.GroupByMergingQueryRunnerV2$1$1$1.call(GroupByMergingQueryRunnerV2.java:246) [druid-processing-0.17.0-confluent-dba5ee2.jar:0.17.0-confluent-dba5ee2]
    at org.apache.druid.query.groupby.epinephelinae.GroupByMergingQueryRunnerV2$1$1$1.call(GroupByMergingQueryRunnerV2.java:233) [druid-processing-0.17.0-confluent-dba5ee2.jar:0.17.0-confluent-dba5ee2]
    at java.util.concurrent.FutureTask.run(FutureTask.java:266) [?:1.8.0_242]
    at org.apache.druid.query.PrioritizedListenableFutureTask.run(PrioritizedExecutorService.java:247) [druid-processing-0.17.0-confluent-dba5ee2.jar:0.17.0-confluent-dba5ee2]
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149) [?:1.8.0_242]
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624) [?:1.8.0_242]
    at java.lang.Thread.run(Thread.java:748) [?:1.8.0_242]
2020-04-15T16:13:02,298 INFO [qtp930569137-114[groupBy_[telemetry_prod]_998b2bca-07cb-4a07-b7d4-b8cac5ce47e0]] org.apache.druid.server.log.LoggingRequestLogger - 2020-04-15T16:13:02.126Z  100.96.188.16   {"queryType":"groupBy","dataSource":{"type":"table","name":"telemetry_prod"},"intervals":{"type":"segments","segments":[{"itvl":"2020-04-15T16:10:57.000Z/2020-04-15T16:11:57.000Z","ver":"2020-04-15T16:00:00.074Z","part":37},{"itvl":"2020-04-15T16:10:57.000Z/2020-04-15T16:11:57.000Z","ver":"2020-04-15T16:00:00.074Z","part":76}]},"virtualColumns":[],"filter":{"type":"and","fields":[{"type":"selector","dimension":"name","value":"io.confluent.kafka.server/tenant/active_authenticated_connection_count","extractionFn":null},{"type":"selector","dimension":"tenant","value":"lkc-****","extractionFn":null}]},"granularity":"MINUTE","dimensions":[{"type":"default","dimension":"tenant","outputName":"metric.label.cluster_id","outputType":"STRING"},{"type":"default","dimension":"broker_id","outputName":"_broker_id","outputType":"STRING"},{"type":"default","dimension":"user","outputName":"_user","outputType":"STRING"}],"aggregations":[{"type":"doubleSum","name":"_sum","fieldName":"value_sum","expression":null},{"type":"doubleSum","name":"_count","fieldName":"count","expression":null}],"postAggregations":[{"type":"arithmetic","name":"value_mean","fn":"/","fields":[{"type":"fieldAccess","name":null,"fieldName":"_sum"},{"type":"fieldAccess","name":null,"fieldName":"_count"}],"ordering":null}],"having":null,"limitSpec":{"type":"NoopLimitSpec"},"context":{"applyLimitPushDown":false,"defaultTimeout":300000,"finalize":false,"groupByOutermost":false,"groupByStrategy":"v2","maxQueuedBytes":135135,"maxScatterGatherBytes":9223372036854775807,"queryFailTime":1586967241550,"queryId":"998b2bca-07cb-4a07-b7d4-b8cac5ce47e0","resultAsArray":true,"sortByDimsFirst":false,"sortResults":false,"timeout":59994},"descending":false}    {"query/time":172,"query/bytes":-1,"success":false,"identity":"druid_system","exception":"java.lang.RuntimeException: java.util.concurrent.ExecutionException: java.lang.RuntimeException: java.lang.NullPointerException"}
2020-04-15T16:13:02,329 ERROR [qtp930569137-114[groupBy_[telemetry_prod]_998b2bca-07cb-4a07-b7d4-b8cac5ce47e0]] org.apache.druid.server.QueryResource - Exception handling request: {class=org.apache.druid.server.QueryResource, exceptionType=class java.lang.RuntimeException, exceptionMessage=java.util.concurrent.ExecutionException: java.lang.RuntimeException: java.lang.NullPointerException, query={"queryType":"groupBy","dataSource":{"type":"table","name":"telemetry_prod"},"intervals":{"type":"segments","segments":[{"itvl":"2020-04-15T16:10:57.000Z/2020-04-15T16:11:57.000Z","ver":"2020-04-15T16:00:00.074Z","part":37},{"itvl":"2020-04-15T16:10:57.000Z/2020-04-15T16:11:57.000Z","ver":"2020-04-15T16:00:00.074Z","part":76}]},"virtualColumns":[],"filter":{"type":"and","fields":[{"type":"selector","dimension":"name","value":"io.confluent.kafka.server/tenant/active_authenticated_connection_count","extractionFn":null},{"type":"selector","dimension":"tenant","value":"lkc-****","extractionFn":null}]},"granularity":"MINUTE","dimensions":[{"type":"default","dimension":"tenant","outputName":"metric.label.cluster_id","outputType":"STRING"},{"type":"default","dimension":"broker_id","outputName":"_broker_id","outputType":"STRING"},{"type":"default","dimension":"user","outputName":"_user","outputType":"STRING"}],"aggregations":[{"type":"doubleSum","name":"_sum","fieldName":"value_sum","expression":null},{"type":"doubleSum","name":"_count","fieldName":"count","expression":null}],"postAggregations":[{"type":"arithmetic","name":"value_mean","fn":"/","fields":[{"type":"fieldAccess","name":null,"fieldName":"_sum"},{"type":"fieldAccess","name":null,"fieldName":"_count"}],"ordering":null}],"having":null,"limitSpec":{"type":"NoopLimitSpec"},"context":{"applyLimitPushDown":false,"defaultTimeout":300000,"finalize":false,"groupByOutermost":false,"groupByStrategy":"v2","maxQueuedBytes":135135,"maxScatterGatherBytes":9223372036854775807,"queryFailTime":1586967241550,"queryId":"998b2bca-07cb-4a07-b7d4-b8cac5ce47e0","resultAsArray":true,"sortByDimsFirst":false,"sortResults":false,"timeout":59994},"descending":false}, peer=100.96.188.16} (java.lang.RuntimeException: java.util.concurrent.ExecutionException: java.lang.RuntimeException: java.lang.NullPointerException)
2020-04-15T16:13:02,376 INFO [qtp930569137-130] org.apache.druid.server.log.LoggingRequestLogger - 2020-04-15T16:13:02.171Z 100.97.92.6 {"queryType":"groupBy","dataSource":{"type":"table","name":"telemetry_prod"},"intervals":{"type":"segments","segments":[{"itvl":"2020-04-15T16:00:00.000Z/2020-04-15T16:13:59.999Z","ver":"2020-04-15T16:00:00.074Z","part":76}]},"virtualColumns":[],"filter":{"type":"and","fields":[{"type":"selector","dimension":"tenant","value":"lkc-****","extractionFn":null},{"type":"selector","dimension":"name","value":"io.confluent.kafka.server/tenant/partition_bytes_out_total/delta","extractionFn":null},{"type":"selector","dimension":"foobar_topic","value":"CUST_V360","extractionFn":null}]},"granularity":"MINUTE","dimensions":[{"type":"default","dimension":"foobar_topic","outputName":"topic","outputType":"STRING"}],"aggregations":[{"type":"doubleSum","name":"value","fieldName":"value_sum","expression":null}],"postAggregations":[],"having":null,"limitSpec":{"type":"default","columns":[{"dimension":"topic","direction":"ascending","dimensionOrder":{"type":"lexicographic"}}],"limit":5760000},"context":{"applyLimitPushDown":true,"defaultTimeout":300000,"finalize":false,"groupByOutermost":false,"groupByStrategy":"v2","maxQueuedBytes":119047,"maxScatterGatherBytes":9223372036854775807,"queryFailTime":1586967242109,"queryId":"e33fca53-8b50-4784-91bf-908907fb5f14","resultAsArray":true,"sortByDimsFirst":true,"timeout":59990},"descending":false}    {"query/time":204,"query/bytes":37,"success":true,"identity":"druid_system"}
2020-04-15T16:13:02,418 INFO [qtp930569137-80] org.apache.druid.server.log.LoggingRequestLogger - 2020-04-15T16:13:02.319Z  100.96.188.16   {"queryType":"groupBy","dataSource":{"type":"table","name":"telemetry_prod"},"intervals":{"type":"segments","segments":[{"itvl":"2020-04-15T16:11:01.000Z/2020-04-15T16:12:01.000Z","ver":"2020-04-15T16:00:00.074Z","part":76}]},"virtualColumns":[],"filter":{"type":"and","fields":[{"type":"selector","dimension":"name","value":"io.confluent.kafka.server/tenant/request_total/delta","extractionFn":null},{"type":"selector","dimension":"tenant","value":"lkc-****","extractionFn":null}]},"granularity":"MINUTE","dimensions":[{"type":"default","dimension":"request","outputName":"metric.label.type","outputType":"STRING"},{"type":"default","dimension":"tenant","outputName":"metric.label.cluster_id","outputType":"STRING"}],"aggregations":[{"type":"doubleSum","name":"value","fieldName":"value_sum","expression":null}],"postAggregations":[],"having":null,"limitSpec":{"type":"default","columns":[{"dimension":"metric.label.type","direction":"ascending","dimensionOrder":{"type":"lexicographic"}},{"dimension":"metric.label.cluster_id","direction":"ascending","dimensionOrder":{"type":"lexicographic"}}],"limit":1000},"context":{"applyLimitPushDown":true,"defaultTimeout":300000,"finalize":false,"groupByOutermost":false,"groupByStrategy":"v2","maxQueuedBytes":138888,"maxScatterGatherBytes":9223372036854775807,"queryFailTime":1586967242242,"queryId":"11fb6b3b-86c7-4def-bcc1-f3c28d8829f8","resultAsArray":true,"sortByDimsFirst":true,"timeout":59995},"descending":false}    {"query/time":98,"query/bytes":261,"success":true,"identity":"druid_system"}
2020-04-15T16:13:02,449 INFO [qtp930569137-72] org.apache.druid.server.log.LoggingRequestLogger - 2020-04-15T16:13:02.346Z  100.97.92.6 {"queryType":"groupBy","dataSource":{"type":"table","name":"telemetry_prod"},"intervals":{"type":"segments","segments":[{"itvl":"2020-04-15T16:11:01.000Z/2020-04-15T16:12:01.000Z","ver":"2020-04-15T16:00:00.074Z","part":76}]},"virtualColumns":[],"filter":{"type":"and","fields":[{"type":"selector","dimension":"name","value":"io.confluent.kafka.server/log/size","extractionFn":null},{"type":"selector","dimension":"tenant","value":"lkc-****","extractionFn":null}]},"granularity":"MINUTE","dimensions":[{"type":"default","dimension":"tenant","outputName":"metric.label.cluster_id","outputType":"STRING"},{"type":"default","dimension":"foobar_topic","outputName":"metric.label.topic","outputType":"STRING"},{"type":"default","dimension":"partition","outputName":"_partition","outputType":"STRING"}],"aggregations":[{"type":"doubleMax","name":"value_max","fieldName":"value_max","expression":null}],"postAggregations":[],"having":null,"limitSpec":{"type":"NoopLimitSpec"},"context":{"applyLimitPushDown":false,"defaultTimeout":300000,"finalize":false,"groupByOutermost":false,"groupByStrategy":"v2","maxQueuedBytes":135135,"maxScatterGatherBytes":9223372036854775807,"queryFailTime":1586967242210,"queryId":"0da9c4d5-56ce-4862-a226-d0a081d82ca8","resultAsArray":true,"sortByDimsFirst":false,"sortResults":false,"timeout":59995},"descending":false}    {"query/time":102,"query/bytes":6599,"success":true,"identity":"druid_system"}
2020-04-15T16:13:02,474 INFO [qtp930569137-103] org.apache.druid.server.log.LoggingRequestLogger - 2020-04-15T16:13:02.350Z 100.96.188.16   {"queryType":"groupBy","dataSource":{"type":"table","name":"telemetry_prod"},"intervals":{"type":"segments","segments":[{"itvl":"2020-04-15T16:11:01.000Z/2020-04-15T16:12:01.000Z","ver":"2020-04-15T16:00:00.074Z","part":76}]},"virtualColumns":[],"filter":{"type":"and","fields":[{"type":"selector","dimension":"name","value":"io.confluent.kafka.server/tenant/active_authenticated_connection_count","extractionFn":null},{"type":"selector","dimension":"tenant","value":"lkc-****","extractionFn":null}]},"granularity":"MINUTE","dimensions":[{"type":"default","dimension":"tenant","outputName":"metric.label.cluster_id","outputType":"STRING"},{"type":"default","dimension":"broker_id","outputName":"_broker_id","outputType":"STRING"},{"type":"default","dimension":"user","outputName":"_user","outputType":"STRING"}],"aggregations":[{"type":"doubleSum","name":"_sum","fieldName":"value_sum","expression":null},{"type":"doubleSum","name":"_count","fieldName":"count","expression":null}],"postAggregations":[{"type":"arithmetic","name":"value_mean","fn":"/","fields":[{"type":"fieldAccess","name":null,"fieldName":"_sum"},{"type":"fieldAccess","name":null,"fieldName":"_count"}],"ordering":null}],"having":null,"limitSpec":{"type":"NoopLimitSpec"},"context":{"applyLimitPushDown":false,"defaultTimeout":300000,"finalize":false,"groupByOutermost":false,"groupByStrategy":"v2","maxQueuedBytes":138888,"maxScatterGatherBytes":9223372036854775807,"queryFailTime":1586967242210,"queryId":"8773b837-6f42-4595-a625-02ee0ea6c1cf","resultAsArray":true,"sortByDimsFirst":false,"sortResults":false,"timeout":59996},"descending":false}    {"query/time":123,"query/bytes":6,"success":true,"identity":"druid_system"}
licl2014 commented 4 years ago

Interesting. AFAIT, this shouldn't happen if the task successfully handed off all created segments. Also, this NPE doesn't do anything with the task status. It should be just a query failure. I suspect the task failed to hand off segments for some reason which made the task failed. Do you see anything interesting in the task logs?

yes, this NPE could cause query failed, how to fixed it ? @jihoonson

jihoonson commented 4 years ago

Hmm, seems like null checks are missing. I'm not sure if there is a workaround other than fixing the code.

saifat29 commented 4 years ago

I too faced the same issue on Druid v0.17.0 - Data is ingested into Druid through Kafka realtime ingestion with rollups enabled. Suddenly the task failed and segments got lost.

From the below logs it is evident that there was a null exception in the group by query.

Attaching the surrounding relevant logs-

2020-08-21T03:14:31,451 INFO [[index_kafka_aggregations_95b07d728be26d3_iacokiha]-publish] org.apache.druid.indexing.seekablestream.SeekableStreamIndexTaskRunner - Saved sequence metadata to disk: [SequenceMetadata{sequenceId=2, sequenceName='index_kafka_aggregations_95b07d728be26d3_2', assignments=[], startOffsets={0=2073149116, 1=2073148653}, exclusiveStartPartitions=[], endOffsets={0=2077740467, 1=2077737653}, sentinel=false, checkpointed=true}]
2020-08-21T03:14:31,451 INFO [[index_kafka_aggregations_95b07d728be26d3_iacokiha]-publish] org.apache.druid.segment.realtime.plumber.CoordinatorBasedSegmentHandoffNotifier - Adding SegmentHandoffCallback for dataSource[aggregations] Segment[SegmentDescriptor{interval=2020-08-21T02:00:00.000Z/2020-08-21T03:00:00.000Z, version='2020-08-21T02:00:10.526Z', partitionNumber=1}]
2020-08-21T03:15:00,801 INFO [coordinator_handoff_scheduled_0] org.apache.druid.segment.realtime.plumber.CoordinatorBasedSegmentHandoffNotifier - Still waiting for Handoff for Segments : [[SegmentDescriptor{interval=2020-08-21T02:00:00.000Z/2020-08-21T03:00:00.000Z, version='2020-08-21T02:00:10.526Z', partitionNumber=1}]]
2020-08-21T03:16:00,784 INFO [coordinator_handoff_scheduled_0] org.apache.druid.segment.realtime.plumber.CoordinatorBasedSegmentHandoffNotifier - Segment Handoff complete for dataSource[aggregations] Segment[SegmentDescriptor{interval=2020-08-21T02:00:00.000Z/2020-08-21T03:00:00.000Z, version='2020-08-21T02:00:10.526Z', partitionNumber=1}]
2020-08-21T03:16:00,787 INFO [[index_kafka_aggregations_95b07d728be26d3_iacokiha]-appenderator-persist] org.apache.druid.server.coordination.BatchDataSegmentAnnouncer - Unannouncing segment[aggregations_2020-08-21T02:00:00.000Z_2020-08-21T03:00:00.000Z_2020-08-21T02:00:10.526Z_1] at path[/druid/segments/middlemanager:8102/middlemanager:8102_indexer-executor__default_tier_2020-08-21T01:46:10.992Z_72a1a95bc07543efaa1fe266685d72e40]
2020-08-21T03:16:00,832 ERROR [processing-0] org.apache.druid.query.groupby.epinephelinae.GroupByMergingQueryRunnerV2 - Exception with one of the sequences!
java.lang.NullPointerException: null
    at org.apache.druid.segment.realtime.FireHydrant.getIncrementedSegment(FireHydrant.java:78) ~[druid-server-0.17.0.jar:0.17.0]
    at org.apache.druid.segment.realtime.FireHydrant.getAndIncrementSegment(FireHydrant.java:138) ~[druid-server-0.17.0.jar:0.17.0]
    at org.apache.druid.segment.realtime.appenderator.SinkQuerySegmentWalker.lambda$null$0(SinkQuerySegmentWalker.java:212) ~[druid-server-0.17.0.jar:0.17.0]
    at com.google.common.collect.Iterators$8.transform(Iterators.java:794) ~[guava-16.0.1.jar:?]
    at com.google.common.collect.TransformedIterator.next(TransformedIterator.java:48) ~[guava-16.0.1.jar:?]
    at org.apache.druid.query.SinkQueryRunners$1.next(SinkQueryRunners.java:56) ~[druid-processing-0.17.0.jar:0.17.0]
    at org.apache.druid.query.SinkQueryRunners$1.next(SinkQueryRunners.java:46) ~[druid-processing-0.17.0.jar:0.17.0]
    at com.google.common.collect.Iterators$7.computeNext(Iterators.java:646) ~[guava-16.0.1.jar:?]
    at com.google.common.collect.AbstractIterator.tryToComputeNext(AbstractIterator.java:143) ~[guava-16.0.1.jar:?]
    at com.google.common.collect.AbstractIterator.hasNext(AbstractIterator.java:138) ~[guava-16.0.1.jar:?]
    at com.google.common.collect.TransformedIterator.hasNext(TransformedIterator.java:43) ~[guava-16.0.1.jar:?]
    at com.google.common.collect.Iterators.addAll(Iterators.java:356) ~[guava-16.0.1.jar:?]
    at com.google.common.collect.Lists.newArrayList(Lists.java:147) ~[guava-16.0.1.jar:?]
    at com.google.common.collect.Lists.newArrayList(Lists.java:129) ~[guava-16.0.1.jar:?]
    at org.apache.druid.query.ChainedExecutionQueryRunner$1.make(ChainedExecutionQueryRunner.java:104) ~[druid-processing-0.17.0.jar:0.17.0]
    at org.apache.druid.java.util.common.guava.BaseSequence.accumulate(BaseSequence.java:39) ~[druid-core-0.17.0.jar:0.17.0]
    at org.apache.druid.java.util.common.guava.LazySequence.accumulate(LazySequence.java:40) ~[druid-core-0.17.0.jar:0.17.0]
    at org.apache.druid.java.util.common.guava.WrappingSequence$1.get(WrappingSequence.java:50) ~[druid-core-0.17.0.jar:0.17.0]
    at org.apache.druid.java.util.common.guava.SequenceWrapper.wrap(SequenceWrapper.java:55) ~[druid-core-0.17.0.jar:0.17.0]
    at org.apache.druid.java.util.common.guava.WrappingSequence.accumulate(WrappingSequence.java:45) ~[druid-core-0.17.0.jar:0.17.0]
    at org.apache.druid.java.util.common.guava.LazySequence.accumulate(LazySequence.java:40) ~[druid-core-0.17.0.jar:0.17.0]
    at org.apache.druid.java.util.common.guava.WrappingSequence$1.get(WrappingSequence.java:50) ~[druid-core-0.17.0.jar:0.17.0]
    at org.apache.druid.java.util.common.guava.SequenceWrapper.wrap(SequenceWrapper.java:55) ~[druid-core-0.17.0.jar:0.17.0]
    at org.apache.druid.java.util.common.guava.WrappingSequence.accumulate(WrappingSequence.java:45) ~[druid-core-0.17.0.jar:0.17.0]
    at org.apache.druid.java.util.common.guava.WrappingSequence$1.get(WrappingSequence.java:50) ~[druid-core-0.17.0.jar:0.17.0]
    at org.apache.druid.query.CPUTimeMetricQueryRunner$1.wrap(CPUTimeMetricQueryRunner.java:74) ~[druid-processing-0.17.0.jar:0.17.0]
    at org.apache.druid.java.util.common.guava.WrappingSequence.accumulate(WrappingSequence.java:45) ~[druid-core-0.17.0.jar:0.17.0]
    at org.apache.druid.query.spec.SpecificSegmentQueryRunner$1.accumulate(SpecificSegmentQueryRunner.java:79) ~[druid-processing-0.17.0.jar:0.17.0]
    at org.apache.druid.java.util.common.guava.WrappingSequence$1.get(WrappingSequence.java:50) ~[druid-core-0.17.0.jar:0.17.0]
    at org.apache.druid.query.spec.SpecificSegmentQueryRunner.doNamed(SpecificSegmentQueryRunner.java:163) ~[druid-processing-0.17.0.jar:0.17.0]
    at org.apache.druid.query.spec.SpecificSegmentQueryRunner.access$100(SpecificSegmentQueryRunner.java:42) ~[druid-processing-0.17.0.jar:0.17.0]
    at org.apache.druid.query.spec.SpecificSegmentQueryRunner$2.wrap(SpecificSegmentQueryRunner.java:145) ~[druid-processing-0.17.0.jar:0.17.0]
    at org.apache.druid.java.util.common.guava.WrappingSequence.accumulate(WrappingSequence.java:45) ~[druid-core-0.17.0.jar:0.17.0]
    at org.apache.druid.query.groupby.epinephelinae.GroupByMergingQueryRunnerV2$1$1$1.call(GroupByMergingQueryRunnerV2.java:246) [druid-processing-0.17.0.jar:0.17.0]
    at org.apache.druid.query.groupby.epinephelinae.GroupByMergingQueryRunnerV2$1$1$1.call(GroupByMergingQueryRunnerV2.java:233) [druid-processing-0.17.0.jar:0.17.0]
    at java.util.concurrent.FutureTask.run(FutureTask.java:266) [?:1.8.0_232]
    at org.apache.druid.query.PrioritizedListenableFutureTask.run(PrioritizedExecutorService.java:247) [druid-processing-0.17.0.jar:0.17.0]
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149) [?:1.8.0_232]
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624) [?:1.8.0_232]
    at java.lang.Thread.run(Thread.java:748) [?:1.8.0_232]
2020-08-21T03:16:00,873 ERROR [qtp1136890657-129[groupBy_[aggregations]_17bc90cf-a1d9-4380-bcc5-a1b565b5c91a]] org.apache.druid.server.QueryResource - Exception handling request: {class=org.apache.druid.server.QueryResource, exceptionType=class java.lang.RuntimeException, exceptionMessage=java.util.concurrent.ExecutionException: java.lang.RuntimeException: java.lang.NullPointerException, query={"queryType":"groupBy","dataSource":{"type":"table","name":"aggregations"},"intervals":{"type":"segments","segments":[{"itvl":"2020-08-21T02:00:00.000Z/2020-08-21T03:00:00.000Z","ver":"2020-08-21T02:00:10.526Z","part":1},{"itvl":"2020-08-21T02:00:00.000Z/2020-08-21T03:00:00.000Z","ver":"2020-08-21T02:00:10.526Z","part":2}]},"virtualColumns":[],"filter":null,"granularity":{"type":"all"},"dimensions":[{"type":"default","dimension":"d_id","outputName":"d0","outputType":"LONG"}],"aggregations":[{"type":"doubleSum","name":"a0","fieldName":"s_dr","expression":null}],"postAggregations":[],"having":null,"limitSpec":{"type":"NoopLimitSpec"},"context":{"applyLimitPushDown":false,"defaultTimeout":300000,"finalize":false,"fudgeTimestamp":"1597968000000","groupByOutermost":false,"groupByStrategy":"v2","maxQueuedBytes":3333333,"maxScatterGatherBytes":9223372036854775807,"queryFailTime":1597980040147,"queryId":"17bc90cf-a1d9-4380-bcc5-a1b565b5c91a","resultAsArray":true,"sqlQueryId":"a371d117-bffa-4a5f-9115-d8812eb489b0","timeout":299998},"descending":false}, peer=10.0.7.4} (java.lang.RuntimeException: java.util.concurrent.ExecutionException: java.lang.RuntimeException: java.lang.NullPointerException)
2020-08-21T03:16:01,000 INFO [[index_kafka_aggregations_95b07d728be26d3_iacokiha]-appenderator-persist] org.apache.druid.segment.realtime.appenderator.AppenderatorImpl - Dropped segment[aggregations_2020-08-21T02:00:00.000Z_2020-08-21T03:00:00.000Z_2020-08-21T02:00:10.526Z_1].
2020-08-21T03:16:16,707 INFO [parent-monitor-0] org.apache.druid.indexing.worker.executor.ExecutorLifecycle - Triggering JVM shutdown.
2020-08-21T03:16:16,708 INFO [Thread-70] org.apache.druid.cli.CliPeon - Running shutdown hook
2020-08-21T03:16:16,708 INFO [Thread-70] org.apache.druid.java.util.common.lifecycle.Lifecycle - Stopping lifecycle [module] stage [ANNOUNCEMENTS]
2020-08-21T03:16:16,709 INFO [Thread-70] org.apache.druid.curator.announcement.Announcer - Unannouncing [/druid/announcements/middlemanager:8102]
2020-08-21T03:16:16,722 INFO [Thread-70] org.apache.druid.curator.announcement.Announcer - Unannouncing [/druid/listeners/lookups/__default/http:middlemanager:8102]
2020-08-21T03:16:16,723 INFO [Thread-70] org.apache.druid.curator.announcement.Announcer - Unannouncing [/druid/segments/middlemanager:8102/middlemanager:8102_indexer-executor__default_tier_2020-08-21T01:46:10.992Z_72a1a95bc07543efaa1fe266685d72e40]
2020-08-21T03:16:16,724 INFO [Thread-70] org.apache.druid.curator.announcement.Announcer - Unannouncing [/druid/internal-discovery/PEON/middlemanager:8102]
2020-08-21T03:16:16,725 INFO [Thread-70] org.apache.druid.java.util.common.lifecycle.Lifecycle - Stopping lifecycle [module] stage [SERVER]
2020-08-21T03:16:16,733 INFO [Thread-70] org.eclipse.jetty.server.AbstractConnector - Stopped ServerConnector@b2da3a5{HTTP/1.1,[http/1.1]}{0.0.0.0:8102}
2020-08-21T03:16:16,734 INFO [Thread-70] org.eclipse.jetty.server.session - node0 Stopped scavenging
2020-08-21T03:16:16,736 INFO [Thread-70] org.eclipse.jetty.server.handler.ContextHandler - Stopped o.e.j.s.ServletContextHandler@1b485c8b{/,null,UNAVAILABLE}
2020-08-21T03:16:16,740 INFO [Thread-70] org.apache.druid.java.util.common.lifecycle.Lifecycle - Stopping lifecycle [module] stage [NORMAL]
2020-08-21T03:16:16,741 INFO [Thread-70] org.apache.druid.server.listener.announcer.ListenerResourceAnnouncer - Unannouncing start time on [/druid/listeners/lookups/__default/http:middlemanager:8102]
2020-08-21T03:16:16,743 INFO [Thread-70] org.apache.druid.indexing.overlord.SingleTaskBackgroundRunner - Starting graceful shutdown of task[index_kafka_aggregations_95b07d728be26d3_iacokiha].
2020-08-21T03:16:16,743 INFO [Thread-70] org.apache.druid.indexing.seekablestream.SeekableStreamIndexTaskRunner - Stopping forcefully (status: [PUBLISHING])
2020-08-21T03:16:16,750 ERROR [task-runner-0-priority-0] org.apache.druid.indexing.seekablestream.SeekableStreamIndexTaskRunner - Error while publishing segments for sequenceNumber[SequenceMetadata{sequenceId=2, sequenceName='index_kafka_aggregations_95b07d728be26d3_2', assignments=[], startOffsets={0=2073149116, 1=2073148653}, exclusiveStartPartitions=[], endOffsets={0=2077740467, 1=2077737653}, sentinel=false, checkpointed=true}]
java.util.concurrent.CancellationException: Task was cancelled.
    at com.google.common.util.concurrent.AbstractFuture.cancellationExceptionWithCause(AbstractFuture.java:392) ~[guava-16.0.1.jar:?]
    at com.google.common.util.concurrent.AbstractFuture$Sync.getValue(AbstractFuture.java:306) ~[guava-16.0.1.jar:?]
    at com.google.common.util.concurrent.AbstractFuture$Sync.get(AbstractFuture.java:286) ~[guava-16.0.1.jar:?]
    at com.google.common.util.concurrent.AbstractFuture.get(AbstractFuture.java:116) ~[guava-16.0.1.jar:?]
    at com.google.common.util.concurrent.Uninterruptibles.getUninterruptibly(Uninterruptibles.java:135) ~[guava-16.0.1.jar:?]
    at com.google.common.util.concurrent.Futures$4.run(Futures.java:1170) [guava-16.0.1.jar:?]
    at com.google.common.util.concurrent.MoreExecutors$SameThreadExecutorService.execute(MoreExecutors.java:297) [guava-16.0.1.jar:?]
    at com.google.common.util.concurrent.ExecutionList.executeListener(ExecutionList.java:156) [guava-16.0.1.jar:?]
    at com.google.common.util.concurrent.ExecutionList.execute(ExecutionList.java:145) [guava-16.0.1.jar:?]
    at com.google.common.util.concurrent.AbstractFuture.cancel(AbstractFuture.java:134) [guava-16.0.1.jar:?]
    at com.google.common.util.concurrent.Futures$ChainingListenableFuture.cancel(Futures.java:826) [guava-16.0.1.jar:?]
    at com.google.common.util.concurrent.Futures$CombinedFuture$1.run(Futures.java:1505) [guava-16.0.1.jar:?]
    at com.google.common.util.concurrent.MoreExecutors$SameThreadExecutorService.execute(MoreExecutors.java:297) [guava-16.0.1.jar:?]
    at com.google.common.util.concurrent.ExecutionList.executeListener(ExecutionList.java:156) [guava-16.0.1.jar:?]
    at com.google.common.util.concurrent.ExecutionList.execute(ExecutionList.java:145) [guava-16.0.1.jar:?]
    at com.google.common.util.concurrent.AbstractFuture.cancel(AbstractFuture.java:134) [guava-16.0.1.jar:?]
    at org.apache.druid.indexing.seekablestream.SeekableStreamIndexTaskRunner.runInternal(SeekableStreamIndexTaskRunner.java:891) [druid-indexing-service-0.17.0.jar:0.17.0]
    at org.apache.druid.indexing.seekablestream.SeekableStreamIndexTaskRunner.run(SeekableStreamIndexTaskRunner.java:278) [druid-indexing-service-0.17.0.jar:0.17.0]
    at org.apache.druid.indexing.seekablestream.SeekableStreamIndexTask.run(SeekableStreamIndexTask.java:164) [druid-indexing-service-0.17.0.jar:0.17.0]
    at org.apache.druid.indexing.overlord.SingleTaskBackgroundRunner$SingleTaskBackgroundRunnerCallable.call(SingleTaskBackgroundRunner.java:419) [druid-indexing-service-0.17.0.jar:0.17.0]
    at org.apache.druid.indexing.overlord.SingleTaskBackgroundRunner$SingleTaskBackgroundRunnerCallable.call(SingleTaskBackgroundRunner.java:391) [druid-indexing-service-0.17.0.jar:0.17.0]
    at java.util.concurrent.FutureTask.run(FutureTask.java:266) [?:1.8.0_232]
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149) [?:1.8.0_232]
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624) [?:1.8.0_232]
    at java.lang.Thread.run(Thread.java:748) [?:1.8.0_232]
Caused by: java.util.concurrent.CancellationException: Future.cancel() was called.
    at com.google.common.util.concurrent.AbstractFuture$Sync.complete(AbstractFuture.java:378) ~[guava-16.0.1.jar:?]
    at com.google.common.util.concurrent.AbstractFuture$Sync.cancel(AbstractFuture.java:355) ~[guava-16.0.1.jar:?]
    at com.google.common.util.concurrent.AbstractFuture.cancel(AbstractFuture.java:131) ~[guava-16.0.1.jar:?]
    ... 15 more
2020-08-21T03:16:16,752 INFO [Thread-70] org.apache.druid.security.basic.authentication.db.cache.CoordinatorPollingBasicAuthenticatorCacheManager - CoordinatorPollingBasicAuthenticatorCacheManager is stopping.
2020-08-21T03:16:16,752 INFO [Thread-70] org.apache.druid.security.basic.authentication.db.cache.CoordinatorPollingBasicAuthenticatorCacheManager - CoordinatorPollingBasicAuthenticatorCacheManager is stopped.
2020-08-21T03:16:16,752 INFO [Thread-70] org.apache.druid.security.basic.authorization.db.cache.CoordinatorPollingBasicAuthorizerCacheManager - CoordinatorPollingBasicAuthorizerCacheManager is stopping.
2020-08-21T03:16:16,752 INFO [task-runner-0-priority-0] org.apache.druid.server.coordination.BatchDataSegmentAnnouncer - Unannouncing segment[aggregations_2020-08-21T02:00:00.000Z_2020-08-21T03:00:00.000Z_2020-08-21T02:00:10.526Z_2] at path[/druid/segments/middlemanager:8102/middlemanager:8102_indexer-executor__default_tier_2020-08-21T01:46:10.992Z_72a1a95bc07543efaa1fe266685d72e40]
2020-08-21T03:16:16,752 INFO [Thread-70] org.apache.druid.security.basic.authorization.db.cache.CoordinatorPollingBasicAuthorizerCacheManager - CoordinatorPollingBasicAuthorizerCacheManager is stopped.
2020-08-21T03:16:16,752 INFO [LookupExtractorFactoryContainerProvider-MainThread] org.apache.druid.query.lookup.LookupReferencesManager - Lookup Management loop exited. Lookup notices are not handled anymore.
2020-08-21T03:16:16,753 INFO [task-runner-0-priority-0] org.apache.druid.curator.discovery.CuratorDruidNodeAnnouncer - Unannounced self [{"druidNode":{"service":"druid/middleManager","host":"middlemanager","bindOnHost":false,"plaintextPort":8102,"port":-1,"tlsPort":-1,"enablePlaintextPort":true,"enableTlsPort":false},"nodeType":"peon","services":{"dataNodeService":{"type":"dataNodeService","tier":"_default_tier","maxSize":0,"type":"indexer-executor","priority":0},"lookupNodeService":{"type":"lookupNodeService","lookupTier":"__default"}}}].
2020-08-21T03:16:16,771 INFO [task-runner-0-priority-0] org.apache.druid.indexing.worker.executor.ExecutorLifecycle - Task completed with status: {
  "id" : "index_kafka_aggregations_95b07d728be26d3_iacokiha",
  "status" : "SUCCESS",
  "duration" : 5416234,
  "errorMsg" : null,
  "location" : {
    "host" : null,
    "port" : -1,
    "tlsPort" : -1
  }
}
2020-08-21T03:16:16,774 INFO [main] org.apache.druid.cli.CliPeon - Thread [Thread[Thread-70,5,main]] is non daemon.
2020-08-21T03:16:16,774 INFO [main] org.apache.druid.java.util.common.lifecycle.Lifecycle - Lifecycle [module] already stopped and stop was called. Silently skipping
Cannot remove shutdown hook, already shutting down!
Finished peon task
2020-08-21T03:16:16,801 INFO [Curator-Framework-0] org.apache.curator.framework.imps.CuratorFrameworkImpl - backgroundOperationsLoop exiting
2020-08-21T03:16:16,804 INFO [Thread-70] org.apache.zookeeper.ZooKeeper - Session: 0x1013a00d03800a8 closed
2020-08-21T03:16:16,804 INFO [main-EventThread] org.apache.zookeeper.ClientCnxn - EventThread shut down for session: 0x1013a00d03800a8
2020-08-21T03:16:16,804 INFO [Thread-70] org.apache.druid.java.util.common.lifecycle.Lifecycle - Stopping lifecycle [module] stage [INIT]

This issue is very critical as ingestion tasks have become a gamble.

github-actions[bot] commented 1 year ago

This issue has been marked as stale due to 280 days of inactivity. It will be closed in 4 weeks if no further activity occurs. If this issue is still relevant, please simply write any comment. Even if closed, you can still revive the issue at any time or discuss it on the dev@druid.apache.org list. Thank you for your contributions.

github-actions[bot] commented 1 year ago

This issue has been closed due to lack of activity. If you think that is incorrect, or the issue requires additional review, you can revive the issue at any time.