prestodb / presto

The official home of the Presto distributed SQL query engine for big data
http://prestodb.io
Apache License 2.0
15.75k stars 5.29k forks source link

Negative array subscript being pushdown as subfield pruning and worker is not checking against it #22690

Open Yuhta opened 1 month ago

Yuhta commented 1 month ago

We are pushing down negative array subscript as subfield pruning (https://github.com/prestodb/presto/blob/master/presto-main/src/main/java/com/facebook/presto/sql/planner/optimizations/PushdownSubfields.java) and worker is not sanity checking against it (https://github.com/prestodb/presto/blob/master/presto-orc/src/main/java/com/facebook/presto/orc/reader/ListSelectiveStreamReader.java ). With -1 pushdown this happens to work because ELEMENT_LENGTH_UNBOUNDED = -1 but -2 is resulting in GENERIC_INTERNAL_ERROR:

java.lang.IllegalArgumentException: Offset is not monotonically ascending. offsets[0]=0, offsets[1]=-2
    at com.facebook.presto.common.block.ArrayBlock.fromElementBlock(ArrayBlock.java:58)
    at com.facebook.presto.orc.reader.ListSelectiveStreamReader.getBlock(ListSelectiveStreamReader.java:530)
    at com.facebook.presto.orc.OrcSelectiveRecordReader$OrcBlockLoader.load(OrcSelectiveRecordReader.java:941)
    at com.facebook.presto.orc.OrcSelectiveRecordReader$OrcBlockLoader.load(OrcSelectiveRecordReader.java:900)
    at com.facebook.presto.common.block.LazyBlock.assureLoaded(LazyBlock.java:313)
    at com.facebook.presto.common.block.LazyBlock.getLoadedBlock(LazyBlock.java:304)
    at com.facebook.presto.operator.ScanFilterAndProjectOperator$RecordingLazyBlockLoader.load(ScanFilterAndProjectOperator.java:335)
    at com.facebook.presto.operator.ScanFilterAndProjectOperator$RecordingLazyBlockLoader.load(ScanFilterAndProjectOperator.java:321)
    at com.facebook.presto.common.block.LazyBlock.assureLoaded(LazyBlock.java:313)
    at com.facebook.presto.common.block.LazyBlock.getLoadedBlock(LazyBlock.java:304)
    at com.facebook.presto.operator.project.DictionaryAwarePageProjection$DictionaryAwarePageProjectionWork.<init>(DictionaryAwarePageProjection.java:92)
    at com.facebook.presto.operator.project.DictionaryAwarePageProjection.project(DictionaryAwarePageProjection.java:70)
    at com.facebook.presto.operator.project.PageProjectionWithOutputs.project(PageProjectionWithOutputs.java:56)
    at com.facebook.presto.operator.project.PageProcessor$ProjectSelectedPositions.processBatch(PageProcessor.java:327)
    at com.facebook.presto.operator.project.PageProcessor$ProjectSelectedPositions.process(PageProcessor.java:201)
    at com.facebook.presto.operator.WorkProcessorUtils$ProcessWorkProcessor.process(WorkProcessorUtils.java:315)
    at com.facebook.presto.operator.WorkProcessorUtils$YieldingIterator.computeNext(WorkProcessorUtils.java:79)
    at com.facebook.presto.operator.WorkProcessorUtils$YieldingIterator.computeNext(WorkProcessorUtils.java:65)
    at com.google.common.collect.AbstractIterator.tryToComputeNext(AbstractIterator.java:141)
    at com.google.common.collect.AbstractIterator.hasNext(AbstractIterator.java:136)
    at com.facebook.presto.operator.project.MergingPageOutput.getOutput(MergingPageOutput.java:128)
    at com.facebook.presto.operator.ScanFilterAndProjectOperator.processPageSource(ScanFilterAndProjectOperator.java:316)
    at com.facebook.presto.operator.ScanFilterAndProjectOperator.getOutput(ScanFilterAndProjectOperator.java:260)
    at com.facebook.presto.operator.Driver.processInternal(Driver.java:441)
    at com.facebook.presto.operator.Driver.lambda$processFor$10(Driver.java:324)
    at com.facebook.presto.operator.Driver.tryWithLock(Driver.java:750)
    at com.facebook.presto.operator.Driver.processFor(Driver.java:317)
    at com.facebook.presto.execution.SqlTaskExecution$DriverSplitRunner.processFor(SqlTaskExecution.java:1079)
    at com.facebook.presto.execution.executor.PrioritizedSplitRunner.process(PrioritizedSplitRunner.java:165)
    at com.facebook.presto.execution.executor.TaskExecutor$TaskRunner.run(TaskExecutor.java:621)
    at com.facebook.presto.$gen.Presto_0_288_edge1_2_SNAPSHOT_$_git_commit_id_abbrev___0_288_edge1_2____20240504_213846_1.run(Unknown Source)
    at java.base/java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1128)
    at java.base/java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:628)
    at java.base/java.lang.Thread.run(Thread.java:829)

We should stop push down non-positive array subscripts.

Yuhta commented 1 month ago

CC: @mbasmanova @amitkdutta

mbasmanova commented 1 month ago

CC: @yingsu00 @rschlussel @tdcmeehan @aditi-pandit

tdcmeehan commented 1 month ago

@Yuhta do you have a query to reproduce this issue?

Yuhta commented 1 month ago

@tdcmeehan Any query containing element_at(c0, -2) where c0 is array type and only referenced in this function would repro