StarRocks / starrocks

The world's fastest open query engine for sub-second analytics both on and off the data lakehouse. With the flexibility to support nearly any scenario, StarRocks provides best-in-class performance for multi-dimensional analytics, real-time analytics, and ad-hoc queries. A Linux Foundation project.
https://starrocks.io
Apache License 2.0
9.03k stars, 1.82k forks

[Enhancement] fix connector mem scan limit adjustment when no chunk source #53112

Open dirtysalt opened 6 hours ago

dirtysalt commented 6 hours ago

Why I'm doing:

This is a further improvement on PR https://github.com/StarRocks/starrocks/pull/50686

We find a bad case that

This case can be reproduced with the following SQL:

set cbo_cte_reuse = false;
with ICE as (
  select lo_orderkey from iceberg.zz_iceberg_ssb_sf100_iceberg_parquet_lz4.lineorder_flat
),
MYSQL(id_int) as (
  select id_int from default_catalog.zya.ext_mysql where id_varchar = 'USA'
),
RESULT(x) as (
  select lo_orderkey from ICE
  UNION ALL (select lo_orderkey from ICE inner join [broadcast] MYSQL on ICE.lo_orderkey = MYSQL.id_int)
  UNION ALL (select lo_orderkey from ICE inner join [broadcast] MYSQL on ICE.lo_orderkey + 1 = MYSQL.id_int)
  UNION ALL (select lo_orderkey from ICE inner join [broadcast] MYSQL on ICE.lo_orderkey + 2 = MYSQL.id_int)
  UNION ALL (select lo_orderkey from ICE inner join [broadcast] MYSQL on ICE.lo_orderkey + 3 = MYSQL.id_int)
  UNION ALL (select lo_orderkey from ICE inner join [broadcast] MYSQL on ICE.lo_orderkey + 4 = MYSQL.id_int)
)
select count(distinct x) from RESULT;

So the execution profile looks like this

[image: execution profile screenshot]

If you look at PeakIOTasks for the ICE table scans, it is very low, around 3-4, which is bad.


The root cause traces back to the connector mem scan limit adjustment added in this PR: https://github.com/StarRocks/starrocks/pull/50686

However, there is a corner case: if a scan operator does not create any chunk source, it never gets a chance to adjust its chunk usage back to 0. That stale usage reduces the memory limit available to other scan operators, which leads to a low number of IO tasks.
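To make the corner case concrete, here is a minimal C++ sketch of the accounting problem. It is not the StarRocks code: `SharedScanMemBudget`, `ScanOperatorSketch`, `max_io_tasks`, and the idea that an operator reserves an estimate up front are all hypothetical names and assumptions used only for illustration. It shows how a reservation that is only released from the chunk source path stays charged forever when no chunk source is created, and how releasing it on `close()` restores the budget for other operators.

```cpp
#include <algorithm>
#include <cstdint>

// Hypothetical shared memory budget for all scan operators of one query.
class SharedScanMemBudget {
public:
    explicit SharedScanMemBudget(int64_t limit_bytes) : _limit(limit_bytes) {}

    // Swap an operator's old reservation for a new one.
    void update_usage(int64_t old_bytes, int64_t new_bytes) { _used += new_bytes - old_bytes; }

    int64_t available() const { return std::max<int64_t>(0, _limit - _used); }

private:
    int64_t _limit;
    int64_t _used = 0;
};

// Hypothetical scan operator that charges an estimated chunk usage up front.
class ScanOperatorSketch {
public:
    ScanOperatorSketch(SharedScanMemBudget* budget, int64_t estimated_bytes) : _budget(budget) {
        _adjust(estimated_bytes);
    }

    // Before the fix (as assumed here), this was the only place where the
    // usage was lowered. It is never reached if no chunk source is created.
    void on_chunk_source_finished() { _adjust(0); }

    // Sketch of the fix: always give the reservation back when closing.
    // Calling it more than once is harmless because _reserved is tracked.
    void close() { _adjust(0); }

private:
    void _adjust(int64_t new_bytes) {
        _budget->update_usage(_reserved, new_bytes);
        _reserved = new_bytes;
    }

    SharedScanMemBudget* _budget;
    int64_t _reserved = 0;
};

// IO task concurrency derived from the remaining budget (at least one task).
inline int64_t max_io_tasks(const SharedScanMemBudget& budget, int64_t bytes_per_task, int64_t hard_cap) {
    return std::max<int64_t>(1, std::min<int64_t>(hard_cap, budget.available() / bytes_per_task));
}
```

With a 1000-byte budget and an idle operator holding 700 bytes, `max_io_tasks(budget, 100, 16)` yields 3, which mirrors the low PeakIOTasks described above; after the idle operator calls `close()`, the same call yields 10.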

What I'm doing:

This PR is to:

Fixes #50686

What type of PR is this:

Does this PR entail a change in behavior?

If yes, please specify the type of change:

Checklist:

Bugfix cherry-pick branch check:

sonarcloud[bot] commented 6 hours ago

Quality Gate failed

Failed conditions
7.4% Duplication on New Code (required ≤ 3%)

See analysis details on SonarQube Cloud

github-actions[bot] commented 5 hours ago

[BE Incremental Coverage Report]

:x: fail : 8 / 11 (72.73%)

file detail

| path | covered_line | new_line | coverage | not_covered_line_detail |
|---|---|---|---|---|
| :large_blue_circle: be/src/exec/pipeline/fragment_executor.cpp | 0 | 3 | 00.00% | [241, 242, 243] |
| :large_blue_circle: be/src/exec/pipeline/scan/connector_scan_operator.cpp | 7 | 7 | 100.00% | [] |
| :large_blue_circle: be/src/exec/pipeline/query_context.cpp | 1 | 1 | 100.00% | [] |

github-actions[bot] commented 3 hours ago

[Java-Extensions Incremental Coverage Report]

:white_check_mark: pass : 0 / 0 (0%)