ExpediaGroup / waggle-dance

Hive federation service. Enables disparate tables to be concurrently accessed across multiple Hive deployments.
Apache License 2.0

Presto query on table with Bloom Filter #122

Closed rwcrooks closed 6 years ago

rwcrooks commented 6 years ago

When running a Presto query on a table that uses a Bloom Filter, Waggle Dance throws an error:

BLOOM_FILTER stream type not implemented yet

See the attached BloomFilter.txt for further details.


```
Query 20180810_131848_00003_efzwa failed: com.facebook.presto.spi.PrestoException
BLOOM_FILTER stream type not implemented yet
com.facebook.presto.hive.orc.OrcPageSource.getNextPage(OrcPageSource.java:243)
com.facebook.presto.operator.ScanFilterAndProjectOperator.getOutput(ScanFilterAndProjectOperator.java:231)
com.facebook.presto.operator.Driver.processInternal(Driver.java:380)
com.facebook.presto.operator.Driver.processFor(Driver.java:303)
com.facebook.presto.execution.SqlTaskExecution$DriverSplitRunner.processFor(SqlTaskExecution.java:577)
com.facebook.presto.execution.TaskExecutor$PrioritizedSplitRunner.process(TaskExecutor.java:529)
com.facebook.presto.execution.TaskExecutor$Runner.run(TaskExecutor.java:665)
java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
java.lang.Thread.run(Thread.java:748)
Caused by: java.lang.IllegalStateException
BLOOM_FILTER stream type not implemented yet
com.facebook.presto.orc.metadata.OrcMetadataReader.toStreamKind(OrcMetadataReader.java:397)
com.facebook.presto.orc.metadata.OrcMetadataReader.toStream(OrcMetadataReader.java:118)
com.google.common.collect.Iterators$8.transform(Iterators.java:799)
com.google.common.collect.TransformedIterator.next(TransformedIterator.java:48)
com.google.common.collect.ImmutableCollection$Builder.addAll(ImmutableCollection.java:301)
com.google.common.collect.ImmutableList$Builder.addAll(ImmutableList.java:691)
com.google.common.collect.ImmutableList.copyOf(ImmutableList.java:275)
com.google.common.collect.ImmutableList.copyOf(ImmutableList.java:226)
com.facebook.presto.orc.metadata.OrcMetadataReader.toStream(OrcMetadataReader.java:123)
com.facebook.presto.orc.metadata.OrcMetadataReader.readStripeFooter(OrcMetadataReader.java:113)
com.facebook.presto.orc.StripeReader.readStripeFooter(StripeReader.java:325)
com.facebook.presto.orc.StripeReader.readStripe(StripeReader.java:102)
com.facebook.presto.orc.OrcRecordReader.advanceToNextStripe(OrcRecordReader.java:369)
com.facebook.presto.orc.OrcRecordReader.advanceToNextRowGroup(OrcRecordReader.java:326)
com.facebook.presto.orc.OrcRecordReader.nextBatch(OrcRecordReader.java:290)
com.facebook.presto.hive.orc.OrcPageSource.getNextPage(OrcPageSource.java:219)
com.facebook.presto.operator.ScanFilterAndProjectOperator.getOutput(ScanFilterAndProjectOperator.java:231)
com.facebook.presto.operator.Driver.processInternal(Driver.java:380)
com.facebook.presto.operator.Driver.processFor(Driver.java:303)
com.facebook.presto.execution.SqlTaskExecution$DriverSplitRunner.processFor(SqlTaskExecution.java:577)
com.facebook.presto.execution.TaskExecutor$PrioritizedSplitRunner.process(TaskExecutor.java:529)
com.facebook.presto.execution.TaskExecutor$Runner.run(TaskExecutor.java:665)
```

teabot commented 6 years ago

To confirm, if you go directly to the metastore (and not via WD), does this query work on the same execution platform, with the same dataset? From the stack trace, it looks like this is happening long after metadata has been fetched via WD and actually when the task is reading the underlying data.
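
The stack-trace reading above can be sketched in a few lines of Python. This is only a rough heuristic, not a tool from Waggle Dance or Presto: the function name `classify_trace` and the two prefix lists are hypothetical, chosen for illustration. The idea is that the first recognisable frame tells you whether the failure sits in the ORC data-reading path or in a metastore client call.

```python
# Hypothetical package prefixes, picked for illustration only.
# The ORC prefixes appear in the trace above; the metastore prefixes are assumptions.
DATA_READ_PREFIXES = (
    "com.facebook.presto.orc.",
    "com.facebook.presto.hive.orc.",
)
METASTORE_PREFIXES = (
    "com.facebook.presto.hive.metastore.",
    "org.apache.hadoop.hive.metastore.",
)


def classify_trace(trace: str) -> str:
    """Return 'data-read', 'metastore', or 'unknown' based on the first matching frame."""
    for line in trace.splitlines():
        frame = line.strip()
        # Tolerate the usual Java "at " prefix if it is present.
        if frame.startswith("at "):
            frame = frame[3:]
        if frame.startswith(METASTORE_PREFIXES):
            return "metastore"
        if frame.startswith(DATA_READ_PREFIXES):
            return "data-read"
    return "unknown"


if __name__ == "__main__":
    sample = "\n".join([
        "BLOOM_FILTER stream type not implemented yet",
        "com.facebook.presto.hive.orc.OrcPageSource.getNextPage(OrcPageSource.java:243)",
        "com.facebook.presto.operator.Driver.processInternal(Driver.java:380)",
    ])
    print(classify_trace(sample))  # prints "data-read"
```

A "data-read" result points at the query engine reading the files, which is consistent with the failure happening after the metadata was already fetched.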

rwcrooks commented 6 years ago

@teabot Yes, that's correct. If I go to the source metastore the query runs without issue (same execution platform and dataset).

teabot commented 6 years ago

Intriguing! Could you possibly provide us with a minimal set of scripts to reproduce (set up table of correct format, query data, etc.)? Also, as much of the following would be useful:

teabot commented 6 years ago

Relevant perhaps: https://github.com/prestodb/presto/pull/5998

rwcrooks commented 6 years ago

@teabot helped me figure this one out: the issue is due to the version of Presto we are using on Qubole (0.143), which doesn't support Bloom Filters. Upgrading to version 0.180 resolved the problem!
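
For anyone wanting to guard against the same mismatch, here is a minimal sketch of a version check. It assumes plain numeric version strings such as `0.143` or the shorthand `.143`; the helper names are hypothetical. The `0.180` threshold is simply the version reported to work in this thread, not necessarily the first Presto release that handles Bloom Filter streams.

```python
from typing import Tuple


def parse_presto_version(text: str) -> Tuple[int, ...]:
    """Parse '0.143' or the shorthand '.143' into a tuple of ints."""
    cleaned = text.strip()
    if cleaned.startswith("."):
        cleaned = "0" + cleaned
    return tuple(int(part) for part in cleaned.split("."))


# Assumption: 0.180 is the version reported as working in this thread.
KNOWN_GOOD = parse_presto_version("0.180")


def at_least_known_good(version: str) -> bool:
    """True if the given version is the known-good version or newer."""
    return parse_presto_version(version) >= KNOWN_GOOD


if __name__ == "__main__":
    for candidate in (".143", "0.180"):
        print(candidate, at_least_known_good(candidate))
```

Vendor builds with suffixes in the version string would need extra parsing that this sketch does not attempt.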