trinodb / trino

Official repository of Trino, the distributed SQL query engine for big data, formerly known as PrestoSQL (https://trino.io)
https://trino.io
Apache License 2.0
10.34k stars 2.98k forks source link

Support Hive VIEW with common table expression (WITH CTE) #5977

Closed findepi closed 3 years ago

findepi commented 3 years ago
0: jdbc:hive2://localhost:10000/default> CREATE VIEW v AS WITH nat AS (SELECT * FROM nation) SELECT * FROM nat;
No rows affected (0.338 seconds)

0: jdbc:hive2://localhost:10000/default> SELECT * FROM v WHERE v.n_nationkey < 3;
+----------------+------------+----------------+--------------------------------------------------------------------------------------------------------------+--+
| v.n_nationkey  |  v.n_name  | v.n_regionkey  |                                                 v.n_comment                                                  |
+----------------+------------+----------------+--------------------------------------------------------------------------------------------------------------+--+
| 0              | ALGERIA    | 0              |  haggle. carefully final deposits detect slyly agai                                                          |
| 1              | ARGENTINA  | 1              | al foxes promise slyly according to the regular accounts. bold requests alon                                 |
| 2              | BRAZIL     | 1              | y alongside of the pending deposits. carefully special packages are about the ironic forges. slyly special   |
+----------------+------------+----------------+--------------------------------------------------------------------------------------------------------------+--+
presto:default> SELECT * FROM v WHERE v.n_nationkey < 3;
Query 20201116_162909_00039_jfdaf failed: Failed to translate Hive view 'default.v': Unhandled Hive AST token TOK_CTE, tree:
TOK_CTE
   TOK_SUBQUERY
      TOK_QUERY
         TOK_FROM
            TOK_TABREF
               TOK_TABNAME
                  default
                  nation
         TOK_INSERT
            TOK_DESTINATION
               TOK_DIR
                  TOK_TMP_FILE
            TOK_SELECT
               TOK_SELEXPR
                  .
                     TOK_TABLE_OR_COL
                        nation
                     n_nationkey
               TOK_SELEXPR
                  .
                     TOK_TABLE_OR_COL
                        nation
                     n_name
               TOK_SELEXPR
                  .
                     TOK_TABLE_OR_COL
                        nation
                     n_regionkey
               TOK_SELEXPR
                  .
                     TOK_TABLE_OR_COL
                        nation
                     n_comment
      nat

io.prestosql.spi.PrestoException: Failed to translate Hive view 'default.v': Unhandled Hive AST token TOK_CTE, tree:
TOK_CTE
   TOK_SUBQUERY
      TOK_QUERY
         TOK_FROM
            TOK_TABREF
               TOK_TABNAME
                  default
                  nation
         TOK_INSERT
            TOK_DESTINATION
               TOK_DIR
                  TOK_TMP_FILE
            TOK_SELECT
               TOK_SELEXPR
                  .
                     TOK_TABLE_OR_COL
                        nation
                     n_nationkey
               TOK_SELEXPR
                  .
                     TOK_TABLE_OR_COL
                        nation
                     n_name
               TOK_SELEXPR
                  .
                     TOK_TABLE_OR_COL
                        nation
                     n_regionkey
               TOK_SELEXPR
                  .
                     TOK_TABLE_OR_COL
                        nation
                     n_comment
      nat

    at io.prestosql.plugin.hive.ViewReaderUtil$HiveViewReader.decodeViewData(ViewReaderUtil.java:180)
    at io.prestosql.plugin.hive.HiveMetadata.lambda$getView$57(HiveMetadata.java:1934)
    at java.base/java.util.Optional.map(Optional.java:265)
    at io.prestosql.plugin.hive.HiveMetadata.getView(HiveMetadata.java:1928)
    at io.prestosql.plugin.base.classloader.ClassLoaderSafeConnectorMetadata.getView(ClassLoaderSafeConnectorMetadata.java:522)
    at io.prestosql.metadata.MetadataManager.getView(MetadataManager.java:1095)
    at io.prestosql.sql.analyzer.StatementAnalyzer$Visitor.visitTable(StatementAnalyzer.java:1209)
    at io.prestosql.sql.analyzer.StatementAnalyzer$Visitor.visitTable(StatementAnalyzer.java:324)
    at io.prestosql.sql.tree.Table.accept(Table.java:53)
    at io.prestosql.sql.tree.AstVisitor.process(AstVisitor.java:27)
    at io.prestosql.sql.analyzer.StatementAnalyzer$Visitor.process(StatementAnalyzer.java:341)
    at io.prestosql.sql.analyzer.StatementAnalyzer$Visitor.analyzeFrom(StatementAnalyzer.java:2510)
    at io.prestosql.sql.analyzer.StatementAnalyzer$Visitor.visitQuerySpecification(StatementAnalyzer.java:1532)
    at io.prestosql.sql.analyzer.StatementAnalyzer$Visitor.visitQuerySpecification(StatementAnalyzer.java:324)
    at io.prestosql.sql.tree.QuerySpecification.accept(QuerySpecification.java:144)
    at io.prestosql.sql.tree.AstVisitor.process(AstVisitor.java:27)
    at io.prestosql.sql.analyzer.StatementAnalyzer$Visitor.process(StatementAnalyzer.java:341)
    at io.prestosql.sql.analyzer.StatementAnalyzer$Visitor.process(StatementAnalyzer.java:351)
    at io.prestosql.sql.analyzer.StatementAnalyzer$Visitor.visitQuery(StatementAnalyzer.java:1055)
    at io.prestosql.sql.analyzer.StatementAnalyzer$Visitor.visitQuery(StatementAnalyzer.java:324)
    at io.prestosql.sql.tree.Query.accept(Query.java:107)
    at io.prestosql.sql.tree.AstVisitor.process(AstVisitor.java:27)
    at io.prestosql.sql.analyzer.StatementAnalyzer$Visitor.process(StatementAnalyzer.java:341)
    at io.prestosql.sql.analyzer.StatementAnalyzer.analyze(StatementAnalyzer.java:310)
    at io.prestosql.sql.analyzer.Analyzer.analyze(Analyzer.java:83)
    at io.prestosql.sql.analyzer.Analyzer.analyze(Analyzer.java:75)
    at io.prestosql.execution.SqlQueryExecution.analyze(SqlQueryExecution.java:257)
    at io.prestosql.execution.SqlQueryExecution.<init>(SqlQueryExecution.java:183)
    at io.prestosql.execution.SqlQueryExecution$SqlQueryExecutionFactory.createQueryExecution(SqlQueryExecution.java:759)
    at io.prestosql.dispatcher.LocalDispatchQueryFactory.lambda$createDispatchQuery$0(LocalDispatchQueryFactory.java:122)
    at io.prestosql.$gen.Presto_346_65_g5309ce1____20201116_154441_2.call(Unknown Source)
    at com.google.common.util.concurrent.TrustedListenableFutureTask$TrustedFutureInterruptibleTask.runInterruptibly(TrustedListenableFutureTask.java:125)
    at com.google.common.util.concurrent.InterruptibleTask.run(InterruptibleTask.java:69)
    at com.google.common.util.concurrent.TrustedListenableFutureTask.run(TrustedListenableFutureTask.java:78)
    at java.base/java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1128)
    at java.base/java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:628)
    at java.base/java.lang.Thread.run(Thread.java:834)
Caused by: com.linkedin.coral.hive.hive2rel.parsetree.UnhandledASTTokenException: Unhandled Hive AST token TOK_CTE, tree:
TOK_CTE
   TOK_SUBQUERY
      TOK_QUERY
         TOK_FROM
            TOK_TABREF
               TOK_TABNAME
                  default
                  nation
         TOK_INSERT
            TOK_DESTINATION
               TOK_DIR
                  TOK_TMP_FILE
            TOK_SELECT
               TOK_SELEXPR
                  .
                     TOK_TABLE_OR_COL
                        nation
                     n_nationkey
               TOK_SELEXPR
                  .
                     TOK_TABLE_OR_COL
                        nation
                     n_name
               TOK_SELEXPR
                  .
                     TOK_TABLE_OR_COL
                        nation
                     n_regionkey
               TOK_SELEXPR
                  .
                     TOK_TABLE_OR_COL
                        nation
                     n_comment
      nat

    at com.linkedin.coral.hive.hive2rel.parsetree.AbstractASTVisitor.visit(AbstractASTVisitor.java:265)
    at com.linkedin.coral.hive.hive2rel.parsetree.AbstractASTVisitor.lambda$visitChildren$0(AbstractASTVisitor.java:284)
    at java.base/java.util.stream.ReferencePipeline$3$1.accept(ReferencePipeline.java:195)
    at java.base/java.util.ArrayList$ArrayListSpliterator.forEachRemaining(ArrayList.java:1654)
    at java.base/java.util.stream.AbstractPipeline.copyInto(AbstractPipeline.java:484)
    at java.base/java.util.stream.AbstractPipeline.wrapAndCopyInto(AbstractPipeline.java:474)
    at java.base/java.util.stream.ReduceOps$ReduceOp.evaluateSequential(ReduceOps.java:913)
    at java.base/java.util.stream.AbstractPipeline.evaluate(AbstractPipeline.java:234)
    at java.base/java.util.stream.ReferencePipeline.collect(ReferencePipeline.java:578)
    at com.linkedin.coral.hive.hive2rel.parsetree.AbstractASTVisitor.visitChildren(AbstractASTVisitor.java:285)
    at com.linkedin.coral.hive.hive2rel.parsetree.AbstractASTVisitor.visitChildren(AbstractASTVisitor.java:279)
    at com.linkedin.coral.hive.hive2rel.parsetree.ParseTreeBuilder.visitQueryNode(ParseTreeBuilder.java:611)
    at com.linkedin.coral.hive.hive2rel.parsetree.ParseTreeBuilder.visitQueryNode(ParseTreeBuilder.java:70)
    at com.linkedin.coral.hive.hive2rel.parsetree.AbstractASTVisitor.visit(AbstractASTVisitor.java:64)
    at com.linkedin.coral.hive.hive2rel.parsetree.AbstractASTVisitor.lambda$visitChildren$0(AbstractASTVisitor.java:284)
    at java.base/java.util.stream.ReferencePipeline$3$1.accept(ReferencePipeline.java:195)
    at java.base/java.util.ArrayList$ArrayListSpliterator.forEachRemaining(ArrayList.java:1654)
    at java.base/java.util.stream.AbstractPipeline.copyInto(AbstractPipeline.java:484)
    at java.base/java.util.stream.AbstractPipeline.wrapAndCopyInto(AbstractPipeline.java:474)
    at java.base/java.util.stream.ReduceOps$ReduceOp.evaluateSequential(ReduceOps.java:913)
    at java.base/java.util.stream.AbstractPipeline.evaluate(AbstractPipeline.java:234)
    at java.base/java.util.stream.ReferencePipeline.collect(ReferencePipeline.java:578)
    at com.linkedin.coral.hive.hive2rel.parsetree.AbstractASTVisitor.visitChildren(AbstractASTVisitor.java:285)
    at com.linkedin.coral.hive.hive2rel.parsetree.AbstractASTVisitor.visitChildren(AbstractASTVisitor.java:279)
    at com.linkedin.coral.hive.hive2rel.parsetree.ParseTreeBuilder.visitNil(ParseTreeBuilder.java:617)
    at com.linkedin.coral.hive.hive2rel.parsetree.ParseTreeBuilder.visitNil(ParseTreeBuilder.java:70)
    at com.linkedin.coral.hive.hive2rel.parsetree.AbstractASTVisitor.visit(AbstractASTVisitor.java:45)
    at com.linkedin.coral.hive.hive2rel.parsetree.ParseTreeBuilder.processAST(ParseTreeBuilder.java:167)
    at com.linkedin.coral.hive.hive2rel.parsetree.ParseTreeBuilder.process(ParseTreeBuilder.java:159)
    at com.linkedin.coral.hive.hive2rel.parsetree.ParseTreeBuilder.processViewOrTable(ParseTreeBuilder.java:127)
    at com.linkedin.coral.hive.hive2rel.parsetree.ParseTreeBuilder.processView(ParseTreeBuilder.java:142)
    at com.linkedin.coral.hive.hive2rel.HiveToRelConverter.convertView(HiveToRelConverter.java:103)
    at io.prestosql.plugin.hive.ViewReaderUtil$HiveViewReader.decodeViewData(ViewReaderUtil.java:161)
    ... 36 more
findepi commented 3 years ago

cc @laurachenyu @phd3 @wmoustafa

findepi commented 3 years ago

Since it worked in Presto 344, i am marking this as a bug.

martint commented 3 years ago

Since it worked in Presto 344, i am marking this as a bug.

Note that in 344, it was best effort. There was not guarantee about correctness or semantics, so I wouldn't consider it as "working".

findepi commented 3 years ago

I agree, i wouldn't call the overall complex hive view support as "working" in 344. CTE were working though, even if incidentally only. (from end-user perspective it may not matter why something is working, only whether it's working). Admittedly, the regression is failure on our side -- we did not document nor test what's actually working and what is not, so we couldn't see any regressions when doing https://github.com/prestosql/presto/pull/4661.

sajjoseph commented 3 years ago

We are impacted by this issue and we ended up rolling back our recent upgrade (hive views support is a key requirement in our environment). Really appreciate any quick resolution to this challenge.

wmoustafa commented 3 years ago

@rzhang10 started looking into this last week. We are encountering a couple of issues, but hopefully the fix will be out soon.

findepi commented 3 years ago

In the meantime, we're providing legacy behavior as an escape hatch for this and couple other regressions encountered with the introduction of Coral: https://github.com/prestosql/presto/pull/6195 This should remove blockers and help users in the short term, allowing us to focus on building better Hive VIEW support for the long term.

cc @losipiuk

wmoustafa commented 3 years ago

This is addressed in Coral now https://github.com/linkedin/coral/pull/125. Thanks to @uzshao for the patch!

martint commented 3 years ago

@wmoustafa, thanks! Is that in a release build, yet?

wmoustafa commented 3 years ago

@martint, yes, in version 1.0.83.

findepi commented 3 years ago

@uzshao that's awesome!

would you be willing to bump the coral version used in trino, along with adding a regression test in io.trino.tests.product.hive.AbstractTestHiveViews?

cc @losipiuk

hashhar commented 3 years ago

@findepi There's a PR open at https://github.com/trinodb/trino/pull/8935 which updates this all the way to 1.0.89.

findepi commented 3 years ago

@findepi There's a PR open at #8935 which updates this all the way to 1.0.89.

thanks @hashhar @ebyhr !

we should also add a test for a view with a WITH. that would close this issue.