trinodb / trino

Official repository of Trino, the distributed SQL query engine for big data, formerly known as PrestoSQL (https://trino.io)
Apache License 2.0

Read big length column will cause NegativeArraySizeException. #21443

Open Heltman opened 7 months ago

Heltman commented 7 months ago

We received an error when reading a table in which one column holds very long values. After inspection, we found that the total length exceeded the 2 GB size limit of a Slice.

After investigation, we found the cause: Trino reads files in batches. The batch size is generally 4096 rows, and each value in our column is about 500 KB, so a single batch holds roughly 2 GB of data and exceeds the limit of a Slice.
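To make the arithmetic concrete, here is a minimal Java sketch of how a byte count above `Integer.MAX_VALUE` wraps to a negative number once it is narrowed to an `int`, which is the kind of value that `NegativeArraySizeException` reports. The class and method names are hypothetical, and the inputs (5000 rows of 500392 single-byte characters) are taken from the repro queries in this issue. This only illustrates the wrap-around; it is not Trino's actual allocation code, and the exact negative value in a real stack trace depends on how many rows and bytes end up in one block.

```java
// Hypothetical illustration, not Trino code.
final class BatchSizeOverflow {
    // Total bytes computed in 64-bit arithmetic, so it cannot wrap.
    static long totalBytes(int rows, int bytesPerRow) {
        return (long) rows * bytesPerRow;
    }

    // The same total narrowed to a 32-bit int, as an array length must be.
    static int wrappedToInt(int rows, int bytesPerRow) {
        return (int) totalBytes(rows, bytesPerRow);
    }

    public static void main(String[] args) {
        int rows = 5000;            // rows produced by UNNEST in the repro query
        int bytesPerRow = 500_392;  // string length reported by the first query

        System.out.println(totalBytes(rows, bytesPerRow));    // 2501960000
        System.out.println(wrappedToInt(rows, bytesPerRow));  // -1793007296

        // Math.toIntExact throws instead of silently wrapping.
        try {
            Math.toIntExact(totalBytes(rows, bytesPerRow));
        } catch (ArithmeticException e) {
            System.out.println("overflow detected: " + e.getMessage());
        }
    }
}
```

Doing the multiplication in `long` and converting with `Math.toIntExact` turns the silent wrap-around into an explicit `ArithmeticException`, which is easier to diagnose than a negative array size.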

I am using Trino version 421, but I suspect this is a general problem. We have discussed it on Slack, @wendigo @electrum.

I built a minimal reproducible case as follows:

-- create a string of about 500 KB
select length(concat_ws(repeat('a', 1000), sequence(1,500))); -- 500392

-- use cross join unnest to expand to 5000 rows, so Trino reads with a batch size of 4096 and the error occurs
select * from (select concat_ws(repeat('a', 1000), sequence(1,500)) aa) x cross join unnest (sequence(1,5000));

error:

java.lang.NegativeArraySizeException: -1734967296
    at io.airlift.slice.Slices.allocate(Slices.java:89)
    at io.trino.spi.block.VariableWidthBlock.copyPositions(VariableWidthBlock.java:204)
    at io.trino.spi.block.DictionaryBlockEncoding.readBlock(DictionaryBlockEncoding.java:69)
    at io.trino.metadata.InternalBlockEncodingSerde.readBlock(InternalBlockEncodingSerde.java:62)
    at io.trino.block.BlockSerdeUtil.readBlock(BlockSerdeUtil.java:43)
    at io.trino.execution.buffer.PagesSerdeUtil.readRawPage(PagesSerdeUtil.java:78)
    at io.trino.execution.buffer.PageDeserializer.deserialize(PageDeserializer.java:80)
    at io.trino.server.protocol.Query.removePagesFromExchange(Query.java:545)
    at io.trino.server.protocol.Query.getNextResult(Query.java:416)
    at io.trino.server.protocol.Query.lambda$waitForResults$2(Query.java:292)
    at com.google.common.util.concurrent.AbstractTransformFuture$TransformFuture.doTransform(AbstractTransformFuture.java:252)
    at com.google.common.util.concurrent.AbstractTransformFuture$TransformFuture.doTransform(AbstractTransformFuture.java:242)
    at com.google.common.util.concurrent.AbstractTransformFuture.run(AbstractTransformFuture.java:123)
    at io.airlift.concurrent.BoundedExecutor.drainQueue(BoundedExecutor.java:79)
    at java.base/java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1136)
    at java.base/java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:635)
    at java.base/java.lang.Thread.run(Thread.java:840)

Heltman commented 7 months ago

The error is reported at the following location, but the fix may need to be made elsewhere: https://github.com/trinodb/trino/blob/4bd0f8de39512ddd6f0b52b601c48b7a9106bac3/lib/trino-parquet/src/main/java/io/trino/parquet/reader/flat/BinaryBuffer.java#L81-L99

wendigo commented 7 months ago

@raunaqmorarka can you take a look?

martint commented 7 months ago

The example query above doesn't run, but here's one that does:

SELECT *
FROM (
    SELECT concat_ws('', repeat(concat_ws('', repeat('a', 1000)), 500))
) CROSS JOIN UNNEST(sequence(1, 5000));

raunaqmorarka commented 7 months ago

@Heltman are you able to lower parquet.max-read-block-row-count to avoid hitting this problem?

wendigo commented 7 months ago

@raunaqmorarka I think this is unrelated to Parquet. The example that Martin shared does not use Parquet files.

jshmchenxi commented 2 months ago

We also hit this issue with Trino version 451. Lowering the parquet.max-read-block-row-count config helped as a workaround. Our exception stack trace was a bit different:

Caused by: java.lang.NegativeArraySizeException: -2139450066
    at io.airlift.slice.Slices.allocate(Slices.java:91)
    at io.trino.parquet.reader.flat.BinaryBuffer.asSlice(BinaryBuffer.java:90)
    at io.trino.parquet.reader.flat.BinaryColumnAdapter.createNonNullBlock(BinaryColumnAdapter.java:76)
    at io.trino.parquet.reader.flat.BinaryColumnAdapter.createNonNullBlock(BinaryColumnAdapter.java:27)
    at io.trino.parquet.reader.flat.FlatColumnReader$DataValuesBuffer.createNonNullBlock(FlatColumnReader.java:392)
    at io.trino.parquet.reader.flat.FlatColumnReader.readNonNull(FlatColumnReader.java:193)
    at io.trino.parquet.reader.flat.FlatColumnReader.readPrimitive(FlatColumnReader.java:90)
    at io.trino.parquet.reader.ParquetReader.readPrimitive(ParquetReader.java:463)
    at io.trino.parquet.reader.ParquetReader.readColumnChunk(ParquetReader.java:554)
    at io.trino.parquet.reader.ParquetReader.readBlock(ParquetReader.java:537)
    at io.trino.parquet.reader.ParquetReader.lambda$nextPage$3(ParquetReader.java:251)
    at io.trino.parquet.reader.ParquetBlockFactory$ParquetBlockLoader.load(ParquetBlockFactory.java:72)
    ... 43 more
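
For anyone applying the parquet.max-read-block-row-count workaround discussed above, a rough way to pick a value is to divide the largest `int` by the largest value size expected in the column. The Java sketch below does only that. The class and method names are hypothetical, and it ignores any per-block overhead and the fact that JVMs may cap array lengths slightly below `Integer.MAX_VALUE`, so treat the result as an upper bound and choose something comfortably lower.

```java
// Hypothetical helper, not part of Trino.
final class SafeRowCount {
    // Largest row count whose total byte size still fits in a signed 32-bit int,
    // assuming every row is at most maxBytesPerRow bytes.
    static int maxRowsPerBatch(long maxBytesPerRow) {
        if (maxBytesPerRow <= 0) {
            throw new IllegalArgumentException("maxBytesPerRow must be positive");
        }
        // The quotient is at most Integer.MAX_VALUE, so the narrowing cast is safe.
        return (int) (Integer.MAX_VALUE / maxBytesPerRow);
    }

    public static void main(String[] args) {
        // 500392 is the string length from the repro query in this issue.
        System.out.println(maxRowsPerBatch(500_392)); // 4291
    }
}
```

With 500392-byte values the bound is 4291 rows, which is well below the 5000 rows produced by the repro query.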