apache / drill

Apache Drill is a distributed MPP query layer for self-describing data
https://drill.apache.org/
Apache License 2.0

NPE on DeltaRowGroupScan #2810

Closed: kmatt closed this issue 1 year ago

kmatt commented 1 year ago

Describe the bug

SELECT * on a Delta table (Parquet) throws a NullPointerException.

To Reproduce

Steps to reproduce the behavior:

  1. Create a Delta table as local files with PySpark and query it from a single Drillbit
  2. SELECT COUNT(*) succeeds
  3. SELECT * throws NPE
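
The steps above can be sketched roughly as follows. This is a minimal sketch, not the reporter's actual script: the `delta.root` plugin/workspace name and the table path are taken from the query text in the log below, while the helper names, the sample rows, and the exact COUNT(*) query text are assumptions made for illustration. `write_delta_table` expects an already configured SparkSession with delta-spark available.

```python
# Hypothetical helpers for reproducing the report. Only the SELECT * query
# shape comes from the log; everything else is an illustrative assumption.


def delta_table_ref(path: str, plugin: str = "delta.root") -> str:
    """Build the Drill table-function reference for a Delta table path."""
    return f"table({plugin}.`{path}` (type => 'delta'))"


def repro_queries(path: str) -> dict:
    """Return the two queries from the steps: COUNT(*) and SELECT *."""
    ref = delta_table_ref(path)
    return {
        # Reported to succeed (query text assumed, not shown in the log).
        "count": f"select count(*) from {ref}",
        # Reported to fail with the NullPointerException.
        "select_all": f"select * from {ref} limit 5",
    }


def write_delta_table(spark, path: str) -> None:
    """Write a tiny Delta table with PySpark (sample rows are made up)."""
    df = spark.createDataFrame([(1, "a"), (2, "b")], ["id", "val"])
    df.write.format("delta").mode("overwrite").save(path)


if __name__ == "__main__":
    for name, sql in repro_queries("Warehouse/dbo/DeltaTestTable").items():
        print(f"{name}: {sql}")
```

The generated statements can be pasted into the Drill web UI or sqlline; the path is relative to the workspace root, as in the logged query.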

Error detail, log output or screenshots

2023-06-20 18:58:19,058 [1b6e0933-dd1c-f16b-f6af-dd466d5d94f2:foreman] INFO  o.a.drill.exec.work.foreman.Foreman - Query text for query with id 1b6e0933-dd1c-f16b-f6af-dd466d5d94f2 issued by mattk: ALTER SESSION SET `exec.query.max_rows`=1000
2023-06-20 18:58:19,068 [1b6e0933-dd1c-f16b-f6af-dd466d5d94f2:frag:0:0] INFO  o.a.d.e.w.fragment.FragmentExecutor - 1b6e0933-dd1c-f16b-f6af-dd466d5d94f2:0:0: State change requested AWAITING_ALLOCATION --> RUNNING
2023-06-20 18:58:19,068 [1b6e0933-dd1c-f16b-f6af-dd466d5d94f2:frag:0:0] INFO  o.a.d.e.w.f.FragmentStatusReporter - 1b6e0933-dd1c-f16b-f6af-dd466d5d94f2:0:0: State to report: RUNNING
2023-06-20 18:58:19,118 [1b6e0933-dd1c-f16b-f6af-dd466d5d94f2:frag:0:0] INFO  o.a.d.e.w.fragment.FragmentExecutor - 1b6e0933-dd1c-f16b-f6af-dd466d5d94f2:0:0: State change requested RUNNING --> FINISHED
2023-06-20 18:58:19,118 [1b6e0933-dd1c-f16b-f6af-dd466d5d94f2:frag:0:0] INFO  o.a.d.e.w.f.FragmentStatusReporter - 1b6e0933-dd1c-f16b-f6af-dd466d5d94f2:0:0: State to report: FINISHED
2023-06-20 18:58:19,137 [1b6e0933-c599-8d17-8971-5b0c2ecefac7:foreman] INFO  o.a.drill.exec.work.foreman.Foreman - Query text for query with id 1b6e0933-c599-8d17-8971-5b0c2ecefac7 issued by mattk: select *
from table(delta.root.`Warehouse/dbo/DeltaTestTable` (type => 'delta'))
limit 5
2023-06-20 18:58:23,037 [1b6e0933-c599-8d17-8971-5b0c2ecefac7:frag:1:1] INFO  o.a.d.e.w.fragment.FragmentExecutor - 1b6e0933-c599-8d17-8971-5b0c2ecefac7:1:1: State change requested AWAITING_ALLOCATION --> FAILED
2023-06-20 18:58:23,037 [1b6e0933-c599-8d17-8971-5b0c2ecefac7:frag:1:0] INFO  o.a.d.e.w.fragment.FragmentExecutor - 1b6e0933-c599-8d17-8971-5b0c2ecefac7:1:0: State change requested AWAITING_ALLOCATION --> FAILED
2023-06-20 18:58:23,037 [1b6e0933-c599-8d17-8971-5b0c2ecefac7:frag:1:1] INFO  o.a.d.e.w.fragment.FragmentExecutor - 1b6e0933-c599-8d17-8971-5b0c2ecefac7:1:1: State change requested FAILED --> FINISHED
2023-06-20 18:58:23,037 [1b6e0933-c599-8d17-8971-5b0c2ecefac7:frag:1:0] INFO  o.a.d.e.w.fragment.FragmentExecutor - 1b6e0933-c599-8d17-8971-5b0c2ecefac7:1:0: State change requested FAILED --> FINISHED
2023-06-20 18:58:23,038 [1b6e0933-c599-8d17-8971-5b0c2ecefac7:frag:1:3] INFO  o.a.d.e.w.fragment.FragmentExecutor - 1b6e0933-c599-8d17-8971-5b0c2ecefac7:1:3: State change requested AWAITING_ALLOCATION --> FAILED
2023-06-20 18:58:23,037 [1b6e0933-c599-8d17-8971-5b0c2ecefac7:frag:1:1] ERROR o.a.d.e.w.fragment.FragmentExecutor - SYSTEM ERROR: NullPointerException

Fragment: 1:1

Please, refer to logs for more information.

[Error Id: c6b09027-199a-46e1-abb8-f37576c50382 on vm-etl-01:31010]
org.apache.drill.common.exceptions.UserException: SYSTEM ERROR: NullPointerException

Fragment: 1:1

Please, refer to logs for more information.

[Error Id: c6b09027-199a-46e1-abb8-f37576c50382 on vm-etl-01:31010]
    at org.apache.drill.common.exceptions.UserException$Builder.build(UserException.java:688)
    at org.apache.drill.exec.work.fragment.FragmentExecutor.sendFinalState(FragmentExecutor.java:392)
    at org.apache.drill.exec.work.fragment.FragmentExecutor.cleanup(FragmentExecutor.java:244)
    at org.apache.drill.exec.work.fragment.FragmentExecutor.run(FragmentExecutor.java:359)
    at org.apache.drill.common.SelfCleaningRunnable.run(SelfCleaningRunnable.java:38)
    at java.base/java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1136)
    at java.base/java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:635)
    at java.base/java.lang.Thread.run(Thread.java:833)
Caused by: com.fasterxml.jackson.databind.exc.ValueInstantiationException: Cannot construct instance of `org.apache.drill.exec.store.delta.DeltaRowGroupScan`, problem: `java.lang.NullPointerException`
 at [Source: (String)"{
  "pop" : "single-sender",
  "@id" : 0,
  "receiver-major-fragment" : 0,
  "receiver-minor-fragment" : 0,
  "child" : {
    "pop" : "selection-vector-remover",
    "@id" : 1,
    "child" : {
      "pop" : "limit",
      "@id" : 2,
      "child" : {
        "pop" : "delta-row-group-scan",
        "@id" : 3,
        "userName" : "mattk",
        "formatPluginConfig" : {
          "type" : "delta",
          "version" : null,
          "timestamp" : null
        },
        "rowGroupReadEntries"[truncated 18683 chars]; line: 467, column: 7] (through reference chain: org.apache.drill.exec.physical.config.SingleSender["child"]->org.apache.drill.exec.physical.config.SelectionVectorRemover["child"]->org.apache.drill.exec.physical.config.Limit["child"])
    at com.fasterxml.jackson.databind.exc.ValueInstantiationException.from(ValueInstantiationException.java:47)
    at com.fasterxml.jackson.databind.DeserializationContext.instantiationException(DeserializationContext.java:2052)
    at com.fasterxml.jackson.databind.deser.std.StdValueInstantiator.wrapAsJsonMappingException(StdValueInstantiator.java:587)
    at com.fasterxml.jackson.databind.deser.std.StdValueInstantiator.rewrapCtorProblem(StdValueInstantiator.java:610)
    at com.fasterxml.jackson.databind.deser.std.StdValueInstantiator.createFromObjectWith(StdValueInstantiator.java:293)
    at com.fasterxml.jackson.databind.deser.ValueInstantiator.createFromObjectWith(ValueInstantiator.java:288)
    at com.fasterxml.jackson.databind.deser.impl.PropertyBasedCreator.build(PropertyBasedCreator.java:202)
    at com.fasterxml.jackson.databind.deser.BeanDeserializer._deserializeUsingPropertyBased(BeanDeserializer.java:519)
    at com.fasterxml.jackson.databind.deser.BeanDeserializerBase.deserializeFromObjectUsingNonDefault(BeanDeserializerBase.java:1405)
    at com.fasterxml.jackson.databind.deser.BeanDeserializer.deserializeFromObject(BeanDeserializer.java:352)
    at com.fasterxml.jackson.databind.deser.BeanDeserializerBase.deserializeWithObjectId(BeanDeserializerBase.java:1371)
    at com.fasterxml.jackson.databind.deser.BeanDeserializer._deserializeOther(BeanDeserializer.java:218)
    at com.fasterxml.jackson.databind.deser.BeanDeserializer.deserialize(BeanDeserializer.java:187)
    at com.fasterxml.jackson.databind.jsontype.impl.AsPropertyTypeDeserializer._deserializeTypedForId(AsPropertyTypeDeserializer.java:144)
    at com.fasterxml.jackson.databind.jsontype.impl.AsPropertyTypeDeserializer.deserializeTypedFromObject(AsPropertyTypeDeserializer.java:110)
    at com.fasterxml.jackson.databind.deser.AbstractDeserializer.deserializeWithType(AbstractDeserializer.java:263)
    at com.fasterxml.jackson.databind.deser.SettableBeanProperty.deserialize(SettableBeanProperty.java:539)
    at com.fasterxml.jackson.databind.deser.BeanDeserializer._deserializeWithErrorWrapping(BeanDeserializer.java:564)
    at com.fasterxml.jackson.databind.deser.BeanDeserializer._deserializeUsingPropertyBased(BeanDeserializer.java:439)
    at com.fasterxml.jackson.databind.deser.BeanDeserializerBase.deserializeFromObjectUsingNonDefault(BeanDeserializerBase.java:1405)
    at com.fasterxml.jackson.databind.deser.BeanDeserializer.deserializeFromObject(BeanDeserializer.java:352)
    at com.fasterxml.jackson.databind.deser.BeanDeserializerBase.deserializeWithObjectId(BeanDeserializerBase.java:1371)
    at com.fasterxml.jackson.databind.deser.BeanDeserializer._deserializeOther(BeanDeserializer.java:218)
    at com.fasterxml.jackson.databind.deser.BeanDeserializer.deserialize(BeanDeserializer.java:187)
    at com.fasterxml.jackson.databind.jsontype.impl.AsPropertyTypeDeserializer._deserializeTypedForId(AsPropertyTypeDeserializer.java:144)
    at com.fasterxml.jackson.databind.jsontype.impl.AsPropertyTypeDeserializer.deserializeTypedFromObject(AsPropertyTypeDeserializer.java:110)
    at com.fasterxml.jackson.databind.deser.AbstractDeserializer.deserializeWithType(AbstractDeserializer.java:263)
    at com.fasterxml.jackson.databind.deser.SettableBeanProperty.deserialize(SettableBeanProperty.java:539)
    at com.fasterxml.jackson.databind.deser.BeanDeserializer._deserializeWithErrorWrapping(BeanDeserializer.java:564)
    at com.fasterxml.jackson.databind.deser.BeanDeserializer._deserializeUsingPropertyBased(BeanDeserializer.java:439)
    at com.fasterxml.jackson.databind.deser.BeanDeserializerBase.deserializeFromObjectUsingNonDefault(BeanDeserializerBase.java:1405)
    at com.fasterxml.jackson.databind.deser.BeanDeserializer.deserializeFromObject(BeanDeserializer.java:352)
    at com.fasterxml.jackson.databind.deser.BeanDeserializerBase.deserializeWithObjectId(BeanDeserializerBase.java:1371)
    at com.fasterxml.jackson.databind.deser.BeanDeserializer._deserializeOther(BeanDeserializer.java:218)
    at com.fasterxml.jackson.databind.deser.BeanDeserializer.deserialize(BeanDeserializer.java:187)
    at com.fasterxml.jackson.databind.jsontype.impl.AsPropertyTypeDeserializer._deserializeTypedForId(AsPropertyTypeDeserializer.java:144)
    at com.fasterxml.jackson.databind.jsontype.impl.AsPropertyTypeDeserializer.deserializeTypedFromObject(AsPropertyTypeDeserializer.java:110)
    at com.fasterxml.jackson.databind.deser.AbstractDeserializer.deserializeWithType(AbstractDeserializer.java:263)
    at com.fasterxml.jackson.databind.deser.SettableBeanProperty.deserialize(SettableBeanProperty.java:539)
    at com.fasterxml.jackson.databind.deser.BeanDeserializer._deserializeWithErrorWrapping(BeanDeserializer.java:564)
    at com.fasterxml.jackson.databind.deser.BeanDeserializer._deserializeUsingPropertyBased(BeanDeserializer.java:439)
    at com.fasterxml.jackson.databind.deser.BeanDeserializerBase.deserializeFromObjectUsingNonDefault(BeanDeserializerBase.java:1405)
    at com.fasterxml.jackson.databind.deser.BeanDeserializer.deserializeFromObject(BeanDeserializer.java:352)
    at com.fasterxml.jackson.databind.deser.BeanDeserializerBase.deserializeWithObjectId(BeanDeserializerBase.java:1371)
    at com.fasterxml.jackson.databind.deser.BeanDeserializer._deserializeOther(BeanDeserializer.java:218)
    at com.fasterxml.jackson.databind.deser.BeanDeserializer.deserialize(BeanDeserializer.java:187)
    at com.fasterxml.jackson.databind.jsontype.impl.AsPropertyTypeDeserializer._deserializeTypedForId(AsPropertyTypeDeserializer.java:144)
    at com.fasterxml.jackson.databind.jsontype.impl.AsPropertyTypeDeserializer.deserializeTypedFromObject(AsPropertyTypeDeserializer.java:110)
    at com.fasterxml.jackson.databind.deser.AbstractDeserializer.deserializeWithType(AbstractDeserializer.java:263)
    at com.fasterxml.jackson.databind.deser.impl.TypeWrappedDeserializer.deserialize(TypeWrappedDeserializer.java:74)
    at com.fasterxml.jackson.databind.deser.DefaultDeserializationContext.readRootValue(DefaultDeserializationContext.java:323)
    at com.fasterxml.jackson.databind.ObjectReader._bindAndClose(ObjectReader.java:2105)
    at com.fasterxml.jackson.databind.ObjectReader.readValue(ObjectReader.java:1546)
    at org.apache.drill.exec.planner.PhysicalPlanReader.readFragmentRoot(PhysicalPlanReader.java:103)
    at org.apache.drill.exec.work.fragment.FragmentExecutor.run(FragmentExecutor.java:288)
    ... 4 common frames omitted
Caused by: java.lang.NullPointerException: null
    at org.apache.drill.shaded.guava.com.google.common.base.Preconditions.checkNotNull(Preconditions.java:878)
    at org.apache.drill.shaded.guava.com.google.common.cache.LocalCache.get(LocalCache.java:3950)
    at org.apache.drill.shaded.guava.com.google.common.cache.LocalCache.getOrLoad(LocalCache.java:3974)
    at org.apache.drill.shaded.guava.com.google.common.cache.LocalCache$LocalLoadingCache.get(LocalCache.java:4958)
    at org.apache.drill.exec.store.StoragePluginRegistryImpl.getPluginByConfig(StoragePluginRegistryImpl.java:701)
    at org.apache.drill.exec.store.StoragePluginRegistryImpl.getFormatPluginByConfig(StoragePluginRegistryImpl.java:845)
    at org.apache.drill.exec.store.StoragePluginRegistryImpl.resolveFormat(StoragePluginRegistryImpl.java:968)
    at org.apache.drill.exec.store.delta.DeltaRowGroupScan.<init>(DeltaRowGroupScan.java:65)
    at java.base/jdk.internal.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method)
    at java.base/jdk.internal.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:77)
    at java.base/jdk.internal.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45)
    at java.base/java.lang.reflect.Constructor.newInstanceWithCaller(Constructor.java:499)
    at java.base/java.lang.reflect.Constructor.newInstance(Constructor.java:480)
    at com.fasterxml.jackson.databind.introspect.AnnotatedConstructor.call(AnnotatedConstructor.java:128)
    at com.fasterxml.jackson.databind.deser.std.StdValueInstantiator.createFromObjectWith(StdValueInstantiator.java:291)
    ... 54 common frames omitted
2023-06-20 18:58:23,038 [1b6e0933-c599-8d17-8971-5b0c2ecefac7:frag:1:3] INFO  o.a.d.e.w.fragment.FragmentExecutor - 1b6e0933-c599-8d17-8971-5b0c2ecefac7:1:3: State change requested FAILED --> FINISHED
2023-06-20 18:58:23,038 [1b6e0933-c599-8d17-8971-5b0c2ecefac7:frag:1:4] INFO  o.a.d.e.w.fragment.FragmentExecutor - 1b6e0933-c599-8d17-8971-5b0c2ecefac7:1:4: State change requested AWAITING_ALLOCATION --> FAILED
2023-06-20 18:58:23,037 [1b6e0933-c599-8d17-8971-5b0c2ecefac7:frag:1:0] ERROR o.a.d.e.w.fragment.FragmentExecutor - SYSTEM ERROR: NullPointerException

Fragment: 1:0
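
The root cause in the log above sits below roughly fifty Jackson frames: the final "Caused by" is a NullPointerException raised by `Preconditions.checkNotNull` inside the Guava cache lookup that `StoragePluginRegistryImpl.getPluginByConfig` performs while the `DeltaRowGroupScan` constructor runs. A small helper like the one below can pull that chain out of a drillbit log. It is only a sketch: the function name and regular expressions are assumptions, and it handles only the "Caused by:" / "at ..." layout seen in this report.

```python
import re

# Matches "Caused by: <exception class>" at the start of a line.
CAUSED_BY = re.compile(r"^Caused by: ([\w.$]+)")
# Matches a stack frame such as "    at pkg.Class.method(File.java:12)".
# Lines like ' at [Source: (String)"{' do not match and are skipped.
FRAME = re.compile(r"^\s+at ([\w.$<>/]+)\(")


def cause_chain(log_text: str) -> list:
    """Return (exception_class, first_frame) for every 'Caused by:' entry."""
    lines = log_text.splitlines()
    chain = []
    for i, line in enumerate(lines):
        cause = CAUSED_BY.match(line)
        if cause is None:
            continue
        first_frame = None
        # Scan forward to the first real frame of this cause.
        for follow in lines[i + 1:]:
            if follow.startswith("Caused by:"):
                break
            frame = FRAME.match(follow)
            if frame:
                first_frame = frame.group(1)
                break
        chain.append((cause.group(1), first_frame))
    return chain


if __name__ == "__main__":
    import sys

    # Usage: python cause_chain.py drillbit.log
    with open(sys.argv[1], encoding="utf-8") as handle:
        for exc, frame in cause_chain(handle.read()):
            print(f"{exc}\n    first frame: {frame}")
```

The last tuple in the returned list is the innermost cause, which is usually the most useful line to quote in a report.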

Drill version

1.21.1

Additional context

pyspark 3.4.0
delta-spark 2.4.0
Ubuntu 22.04.2 LTS

kmatt commented 1 year ago

@vvysotskyi This is also reported in JIRA: https://issues.apache.org/jira/projects/DRILL/issues/DRILL-8442

kmatt commented 1 year ago

Built drill-1.21.1 with #2811 applied; the SELECT * query now runs successfully.

anthonyhungnguyen commented 3 months ago

I still get this issue with drill-1.21.1.