WINDOW Function OVER PARTITION BY Query on a fairly small dataset (# of docs post filter scan = 433) runs forever until it gets killed/timed out after configured timeout of 200s #12391
The following query which uses OVER PARTITION BY on a fairly small dataset (# of docs post filter scan = 433) runs forever until it gets killed/timed out after configured timeout of 200s.
`SET useMultistageEngine=true;
SET maxRowsInJoin=100000000;
SET numGroupsLimit=2000000;
SET timeoutMs=200000;
SELECT
s.,
s.visits / SUM(visits) OVER(PARTITION BY s.ali_universe_id) 100 as own_contrib_percent
FROM
rei_thasos_stat_pai_v2 s
WHERE
s.ali_universe_id IN (SELECT ali_universe_id FROM ali_universe_v1 WHERE str_ali IN ('1068339987-000000') AND is_traffic_hidden = false)
AND s.date_type = 'month'
AND s.start_date_string = '2023-09-01'`
Stack-trace:
`Error Code: 200
QueryExecutionError:
Received error query execution result block: {250=ExecutionTimeoutError
ProcessingException(errorCode:250, message:ExecutionTimeoutError)
at org.apache.pinot.common.exception.QueryException.(QueryException.java:113)
at org.apache.pinot.common.datablock.DataBlockUtils.extractErrorMsg(DataBlockUtils.java:50)
at org.apache.pinot.common.datablock.DataBlockUtils.getErrorDataBlock(DataBlockUtils.java:42)
at org.apache.pinot.query.runtime.blocks.TransferableBlockUtils.getErrorTransferableBlock(TransferableBlockUtils.java:47)}
org.apache.pinot.query.service.dispatch.QueryDispatcher.getResultTable(QueryDispatcher.java:266)
org.apache.pinot.query.service.dispatch.QueryDispatcher.runReducer(QueryDispatcher.java:206)
org.apache.pinot.query.service.dispatch.QueryDispatcher.submitAndReduce(QueryDispatcher.java:95)
org.apache.pinot.broker.requesthandler.MultiStageBrokerRequestHandler.handleRequest(MultiStageBrokerRequestHandler.java:191)`
@Jackie-Jiang
The following query which uses OVER PARTITION BY on a fairly small dataset (# of docs post filter scan = 433) runs forever until it gets killed/timed out after configured timeout of 200s.
`SET useMultistageEngine=true; SET maxRowsInJoin=100000000; SET numGroupsLimit=2000000; SET timeoutMs=200000;
SELECT s., s.visits / SUM(visits) OVER(PARTITION BY s.ali_universe_id) 100 as own_contrib_percent FROM rei_thasos_stat_pai_v2 s WHERE s.ali_universe_id IN (SELECT ali_universe_id FROM ali_universe_v1 WHERE str_ali IN ('1068339987-000000') AND is_traffic_hidden = false) AND s.date_type = 'month' AND s.start_date_string = '2023-09-01'`
Stack-trace: `Error Code: 200
QueryExecutionError: Received error query execution result block: {250=ExecutionTimeoutError ProcessingException(errorCode:250, message:ExecutionTimeoutError) at org.apache.pinot.common.exception.QueryException.(QueryException.java:113)
at org.apache.pinot.common.datablock.DataBlockUtils.extractErrorMsg(DataBlockUtils.java:50)
at org.apache.pinot.common.datablock.DataBlockUtils.getErrorDataBlock(DataBlockUtils.java:42)
at org.apache.pinot.query.runtime.blocks.TransferableBlockUtils.getErrorTransferableBlock(TransferableBlockUtils.java:47)}
org.apache.pinot.query.service.dispatch.QueryDispatcher.getResultTable(QueryDispatcher.java:266)
org.apache.pinot.query.service.dispatch.QueryDispatcher.runReducer(QueryDispatcher.java:206)
org.apache.pinot.query.service.dispatch.QueryDispatcher.submitAndReduce(QueryDispatcher.java:95)
org.apache.pinot.broker.requesthandler.MultiStageBrokerRequestHandler.handleRequest(MultiStageBrokerRequestHandler.java:191)`
JFR: downloadable here. JFR screenshot attached.