trinodb / trino

Official repository of Trino, the distributed SQL query engine for big data, formerly known as PrestoSQL (https://trino.io)
Apache License 2.0

Max row size of 16MB exceeded - where is the max row size set? #22897

Open skirunshoot opened 1 month ago

skirunshoot commented 1 month ago

Creating an Iceberg table with CTAS:
create table iceberg.schema.table as select * from hive.schema_table. We are writing Parquet files on the backend.
The input table has around 800 columns.

It's failing with:

`SQL Error [13]: Query failed (#20240731_174447_00251_vhqki): Max row size of 16MB exceeded: 18.80MB
org.jkiss.dbeaver.model.sql.DBSQLException: SQL Error [13]: Query failed (#20240731_174447_00251_vhqki): Max row size of 16MB exceeded: 18.80MB
    at org.jkiss.dbeaver.model.impl.jdbc.exec.JDBCStatementImpl.executeStatement(JDBCStatementImpl.java:133)
    at org.jkiss.dbeaver.ui.editors.sql.execute.SQLQueryJob.executeStatement(SQLQueryJob.java:582)
    at org.jkiss.dbeaver.ui.editors.sql.execute.SQLQueryJob.lambda$1(SQLQueryJob.java:491)
    at org.jkiss.dbeaver.model.exec.DBExecUtils.tryExecuteRecover(DBExecUtils.java:190)
    at org.jkiss.dbeaver.ui.editors.sql.execute.SQLQueryJob.executeSingleQuery(SQLQueryJob.java:498)
    at org.jkiss.dbeaver.ui.editors.sql.execute.SQLQueryJob.extractData(SQLQueryJob.java:934)
    at org.jkiss.dbeaver.ui.editors.sql.SQLEditor$QueryResultsContainer.readData(SQLEditor.java:3937)
    at org.jkiss.dbeaver.ui.controls.resultset.ResultSetJobDataRead.lambda$0(ResultSetJobDataRead.java:123)
    at org.jkiss.dbeaver.model.exec.DBExecUtils.tryExecuteRecover(DBExecUtils.java:190)
    at org.jkiss.dbeaver.ui.controls.resultset.ResultSetJobDataRead.run(ResultSetJobDataRead.java:121)
    at org.jkiss.dbeaver.ui.controls.resultset.ResultSetViewer$ResultSetDataPumpJob.run(ResultSetViewer.java:5142)
    at org.jkiss.dbeaver.model.runtime.AbstractJob.run(AbstractJob.java:105)
    at org.eclipse.core.internal.jobs.Worker.run(Worker.java:63)
Caused by: java.sql.SQLException: Query failed (#20240731_174447_00251_vhqki): Max row size of 16MB exceeded: 18.80MB
    at io.trino.jdbc.AbstractTrinoResultSet.resultsException(AbstractTrinoResultSet.java:1937)
    at io.trino.jdbc.TrinoResultSet$ResultsPageIterator.computeNext(TrinoResultSet.java:294)
    at io.trino.jdbc.TrinoResultSet$ResultsPageIterator.computeNext(TrinoResultSet.java:254)
    at io.trino.jdbc.$internal.guava.collect.AbstractIterator.tryToComputeNext(AbstractIterator.java:145)
    at io.trino.jdbc.$internal.guava.collect.AbstractIterator.hasNext(AbstractIterator.java:140)
    at java.base/java.util.Spliterators$IteratorSpliterator.tryAdvance(Unknown Source)
    at java.base/java.util.stream.StreamSpliterators$WrappingSpliterator.lambda$initPartialTraversalState$0(Unknown Source)
    at java.base/java.util.stream.StreamSpliterators$AbstractWrappingSpliterator.fillBuffer(Unknown Source)
    at java.base/java.util.stream.StreamSpliterators$AbstractWrappingSpliterator.doAdvance(Unknown Source)
    at java.base/java.util.stream.StreamSpliterators$WrappingSpliterator.tryAdvance(Unknown Source)
    at java.base/java.util.Spliterators$1Adapter.hasNext(Unknown Source)
    at io.trino.jdbc.TrinoResultSet$AsyncIterator.lambda$new$1(TrinoResultSet.java:179)
    at java.base/java.util.concurrent.Executors$RunnableAdapter.call(Unknown Source)
    at java.base/java.util.concurrent.FutureTask.run(Unknown Source)
    at java.base/java.util.concurrent.ThreadPoolExecutor.runWorker(Unknown Source)
    at java.base/java.util.concurrent.ThreadPoolExecutor$Worker.run(Unknown Source)
    at java.base/java.lang.Thread.run(Unknown Source)
Caused by: io.trino.spi.TrinoException: Max row size of 16MB exceeded: 18.80MB
    at io.trino.plugin.exchange.filesystem.FileSystemExchangeSink$BufferedStorageWriter.write(FileSystemExchangeSink.java:291)
    at io.trino.plugin.exchange.filesystem.FileSystemExchangeSink.add(FileSystemExchangeSink.java:143)
    at io.trino.execution.buffer.SpoolingExchangeOutputBuffer.enqueue(SpoolingExchangeOutputBuffer.java:197)
    at io.trino.execution.buffer.SpoolingExchangeOutputBuffer.enqueue(SpoolingExchangeOutputBuffer.java:178)
    at io.trino.execution.buffer.LazyOutputBuffer.enqueue(LazyOutputBuffer.java:262)
    at io.trino.operator.output.TaskOutputOperator.addInput(TaskOutputOperator.java:163)
    at io.trino.operator.Driver.processInternal(Driver.java:408)
    at io.trino.operator.Driver.lambda$process$8(Driver.java:306)
    at io.trino.operator.Driver.tryWithLock(Driver.java:709)
    at io.trino.operator.Driver.process(Driver.java:298)
    at io.trino.operator.Driver.processForDuration(Driver.java:269)
    at io.trino.execution.SqlTaskExecution$DriverSplitRunner.processFor(SqlTaskExecution.java:890)
    at io.trino.execution.executor.dedicated.SplitProcessor.run(SplitProcessor.java:77)
    at io.trino.execution.executor.dedicated.TaskEntry$VersionEmbedderBridge.lambda$run$0(TaskEntry.java:191)
    at io.trino.$gen.Trino_448____20240731_171911_2.run(Unknown Source)
    at io.trino.execution.executor.dedicated.TaskEntry$VersionEmbedderBridge.run(TaskEntry.java:192)
    at io.trino.execution.executor.scheduler.FairScheduler.runTask(FairScheduler.java:168)
    at io.trino.execution.executor.scheduler.FairScheduler.lambda$submit$0(FairScheduler.java:155)
    at java.base/java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:572)
    at com.google.common.util.concurrent.TrustedListenableFutureTask$TrustedFutureInterruptibleTask.runInterruptibly(TrustedListenableFutureTask.java:131)
    at com.google.common.util.concurrent.InterruptibleTask.run(InterruptibleTask.java:76)
    at com.google.common.util.concurrent.TrustedListenableFutureTask.run(TrustedListenableFutureTask.java:82)
    at java.base/java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1144)
    at java.base/java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:642)
    at java.base/java.lang.Thread.run(Thread.java:1570)
`

raunaqmorarka commented 1 month ago

You can try raising the config property `exchange.max-page-storage-size`. cc: @losipiuk
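
For illustration, a minimal sketch of where that property might go. It assumes the filesystem exchange manager is in use (the stack trace goes through `FileSystemExchangeSink`) and that it is configured in `etc/exchange-manager.properties`. The base directory and the 32MB value are hypothetical placeholders, with the value chosen only because it exceeds the 18.80MB row reported in the error. Please check this against the documentation for your Trino version before relying on it.

```properties
# etc/exchange-manager.properties (assumed location; adjust to your deployment)
exchange-manager.name=filesystem

# Hypothetical placeholder; keep whatever base directory you already use.
exchange.base-directories=s3://example-bucket/trino-exchange

# Raise the per-page storage limit above the failing row size (18.80MB here).
# 32MB is an illustrative value, not a recommendation.
exchange.max-page-storage-size=32MB
```

A change like this would normally need to be applied on the coordinator and all workers, followed by a restart.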

losipiuk commented 1 month ago

Yeah, you can try, but there is a chance it will still not work, as there may be HTTP client/server limits set to 16MB in places.