trinodb / trino

Official repository of Trino, the distributed SQL query engine for big data, formerly known as PrestoSQL (https://trino.io)
Apache License 2.0

Max partition issue occurs when projection pushdown is disabled #4684

Open ebyhr opened 3 years ago

ebyhr commented 3 years ago

This issue happens in version >= 337. Steps to reproduce:

  1. Add these properties

    hive.projection-pushdown-enabled=false
    hive.max-partitions-per-scan=3
  2. Run the following queries

    create table test_part (c1 int, c2 row(a int), c3 int, c4 int) WITH (
        format = 'ORC',
        partitioned_by = ARRAY['c3','c4']
    );

    insert into test_part values (1, row(1), 1, 1);
    insert into test_part values (2, row(2), 2, 2);
    insert into test_part values (3, row(3), 3, 3);
    insert into test_part values (4, row(2), 3, 4);

    set session hive.projection_pushdown_enabled = false;

    with t1(c2) as (
        values (1)
    ),
    t2 as (
        select c1, c2.a as user_id
        from test_part
        where c3 = 1 and c4 = 1
    ),
    t3 as (
        select t2.user_id
        from t1
        left outer join t2 on (t2.c1 = t1.c2)
    )
    select * from t3;

    Query 20200804_144919_00007_ds5qm failed: Query over table 'default.test_part' can potentially read more than 3 partitions
    io.prestosql.spi.PrestoException: Query over table 'default.test_part' can potentially read more than 3 partitions
        at io.prestosql.plugin.hive.HivePartitionManager.getPartitionsAsList(HivePartitionManager.java:203)
        at io.prestosql.plugin.hive.HivePartitionManager.lambda$getOrLoadPartitions$12(HivePartitionManager.java:235)
        at java.base/java.util.Optional.orElseGet(Optional.java:369)
        at io.prestosql.plugin.hive.HivePartitionManager.getOrLoadPartitions(HivePartitionManager.java:234)
        at io.prestosql.plugin.hive.HiveMetadata.getTableProperties(HiveMetadata.java:1937)
        at io.prestosql.plugin.base.classloader.ClassLoaderSafeConnectorMetadata.getTableProperties(ClassLoaderSafeConnectorMetadata.java:672)
        at io.prestosql.metadata.MetadataManager.getTableProperties(MetadataManager.java:434)
        at io.prestosql.sql.planner.EffectivePredicateExtractor$Visitor.visitTableScan(EffectivePredicateExtractor.java:247)
        at io.prestosql.sql.planner.EffectivePredicateExtractor$Visitor.visitTableScan(EffectivePredicateExtractor.java:108)
        at io.prestosql.sql.planner.plan.TableScanNode.accept(TableScanNode.java:131)
        at io.prestosql.sql.planner.EffectivePredicateExtractor$Visitor.visitProject(EffectivePredicateExtractor.java:188)
        at io.prestosql.sql.planner.EffectivePredicateExtractor$Visitor.visitProject(EffectivePredicateExtractor.java:108)
        at io.prestosql.sql.planner.plan.ProjectNode.accept(ProjectNode.java:82)
        at io.prestosql.sql.planner.EffectivePredicateExtractor$Visitor.visitFilter(EffectivePredicateExtractor.java:154)
        at io.prestosql.sql.planner.EffectivePredicateExtractor$Visitor.visitFilter(EffectivePredicateExtractor.java:108)
        at io.prestosql.sql.planner.plan.FilterNode.accept(FilterNode.java:72)
        at io.prestosql.sql.planner.EffectivePredicateExtractor$Visitor.visitProject(EffectivePredicateExtractor.java:188)
        at io.prestosql.sql.planner.EffectivePredicateExtractor$Visitor.visitProject(EffectivePredicateExtractor.java:108)
        at io.prestosql.sql.planner.plan.ProjectNode.accept(ProjectNode.java:82)
        at io.prestosql.sql.planner.EffectivePredicateExtractor.extract(EffectivePredicateExtractor.java:105)
        at io.prestosql.sql.planner.optimizations.PredicatePushDown$Rewriter.visitJoin(PredicatePushDown.java:401)
        at io.prestosql.sql.planner.optimizations.PredicatePushDown$Rewriter.visitJoin(PredicatePushDown.java:135)
        at io.prestosql.sql.planner.plan.JoinNode.accept(JoinNode.java:320)
        at io.prestosql.sql.planner.plan.SimplePlanRewriter$RewriteContext.rewrite(SimplePlanRewriter.java:84)
        at io.prestosql.sql.planner.plan.SimplePlanRewriter$RewriteContext.lambda$defaultRewrite$0(SimplePlanRewriter.java:73)
        at java.base/java.util.stream.ReferencePipeline$3$1.accept(ReferencePipeline.java:195)
        at java.base/java.util.Collections$2.tryAdvance(Collections.java:4747)
        at java.base/java.util.Collections$2.forEachRemaining(Collections.java:4755)
        at java.base/java.util.stream.AbstractPipeline.copyInto(AbstractPipeline.java:484)
        at java.base/java.util.stream.AbstractPipeline.wrapAndCopyInto(AbstractPipeline.java:474)
        at java.base/java.util.stream.ReduceOps$ReduceOp.evaluateSequential(ReduceOps.java:913)
        at java.base/java.util.stream.AbstractPipeline.evaluate(AbstractPipeline.java:234)
        at java.base/java.util.stream.ReferencePipeline.collect(ReferencePipeline.java:578)
        at io.prestosql.sql.planner.plan.SimplePlanRewriter$RewriteContext.defaultRewrite(SimplePlanRewriter.java:74)
        at io.prestosql.sql.planner.optimizations.PredicatePushDown$Rewriter.visitPlan(PredicatePushDown.java:175)
        at io.prestosql.sql.planner.optimizations.PredicatePushDown$Rewriter.visitPlan(PredicatePushDown.java:135)
        at io.prestosql.sql.planner.plan.PlanVisitor.visitOutput(PlanVisitor.java:49)
        at io.prestosql.sql.planner.plan.OutputNode.accept(OutputNode.java:83)
        at io.prestosql.sql.planner.plan.SimplePlanRewriter.rewriteWith(SimplePlanRewriter.java:32)
        at io.prestosql.sql.planner.optimizations.PredicatePushDown.optimize(PredicatePushDown.java:129)
        at io.prestosql.sql.planner.optimizations.StatsRecordingPlanOptimizer.optimize(StatsRecordingPlanOptimizer.java:52)
        at io.prestosql.sql.planner.LogicalPlanner.plan(LogicalPlanner.java:200)
        at io.prestosql.sql.planner.LogicalPlanner.plan(LogicalPlanner.java:189)
        at io.prestosql.sql.planner.LogicalPlanner.plan(LogicalPlanner.java:184)
        at io.prestosql.execution.SqlQueryExecution.doPlanQuery(SqlQueryExecution.java:425)
        at io.prestosql.execution.SqlQueryExecution.planQuery(SqlQueryExecution.java:413)
        at io.prestosql.execution.SqlQueryExecution.start(SqlQueryExecution.java:365)
        at io.prestosql.execution.SqlQueryManager.createQuery(SqlQueryManager.java:245)
        at io.prestosql.dispatcher.LocalDispatchQuery.lambda$startExecution$7(LocalDispatchQuery.java:132)
        at io.prestosql.$gen.Presto_339____20200804_144828_2.run(Unknown Source)
        at java.base/java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1128)
        at java.base/java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:628)
        at java.base/java.lang.Thread.run(Thread.java:834)
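
As a sanity check on the reproduction, the sketch below counts how many partitions the predicate in `t2` actually selects. It assumes the Hive connector's `$partitions` metadata table is available for `test_part` in the current catalog and schema; that assumption is not part of the original report.

```sql
-- Assumption: the Hive connector exposes "test_part$partitions".
-- Total number of partitions created by the four inserts.
select count(*) as total_partitions
from "test_part$partitions";

-- Partitions matching the filter used in the failing query.
select count(*) as matching_partitions
from "test_part$partitions"
where c3 = 1 and c4 = 1;
```

The four inserts create four distinct partitions, (1,1), (2,2), (3,3) and (3,4), and only (1,1) matches `c3 = 1 and c4 = 1`. Since one partition is well under the limit of 3, the error suggests that the partition filter was not taken into account at the point in planning where the partitions were loaded.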



Confirmed that reverting https://github.com/prestosql/presto/pull/4156 solves the error.

findepi commented 3 years ago

hive.projection-pushdown-enabled=false

why would you want to do that?

ebyhr commented 3 years ago

@findepi If I remember correctly, we disabled it to avoid the issue described in #3967. Let me confirm upstream.

ebyhr commented 3 years ago

Confirmed this issue doesn't happen in version 357. We can close after adding a test.