StarRocks / starrocks

StarRocks, a Linux Foundation project, is a next-generation sub-second MPP OLAP database for full analytics scenarios, including multi-dimensional analytics, real-time analytics, and ad-hoc queries.
https://starrocks.io
Apache License 2.0

[Enhancement] optimize the performance of refreshExternalTable stage of MV refresh #47809

Closed murphyatwork closed 2 months ago

murphyatwork commented 3 months ago

Why I'm doing:

For an MV on a data lake, the MetadataMgr::refreshTable call can be costly when the external table has many partitions.

To reduce that cost, we can change the call to MetadataMgr::refreshTable(partitionNames), so that only the named partitions are refreshed instead of all partitions on every refresh.
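
To illustrate the idea, here is a minimal sketch of a full refresh versus a partition-scoped refresh. `ToyMetadataCache`, its counter, and its version map are hypothetical stand-ins invented for this example; they are not the actual `MetadataMgr` code or signatures in StarRocks, and the "remote call" is only simulated.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

/** Toy stand-in for an external-table metadata cache (hypothetical, not StarRocks code). */
class ToyMetadataCache {
    // partition name -> cached metadata version
    private final Map<String, Integer> partitionVersions = new LinkedHashMap<>();
    // number of simulated remote metadata fetches
    private int remoteCalls = 0;

    ToyMetadataCache(List<String> partitionNames) {
        for (String name : partitionNames) {
            partitionVersions.put(name, 0);
        }
    }

    /** Full refresh: one simulated remote fetch per partition of the table. */
    void refreshTable() {
        refreshTable(new ArrayList<>(partitionVersions.keySet()));
    }

    /** Scoped refresh: only the named partitions are re-fetched; unknown names are skipped. */
    void refreshTable(List<String> partitionNames) {
        for (String name : partitionNames) {
            if (partitionVersions.containsKey(name)) {
                partitionVersions.merge(name, 1, Integer::sum);
                remoteCalls++;
            }
        }
    }

    int remoteCalls() {
        return remoteCalls;
    }

    int version(String partitionName) {
        return partitionVersions.get(partitionName);
    }
}
```

In this toy model the cost of a refresh grows with the number of partitions touched, so passing only the partitions an MV refresh actually needs keeps the work proportional to that subset rather than to the whole table.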

sonarcloud[bot] commented 2 months ago

Quality Gate passed

Issues
5 New issues
0 Accepted issues

Measures
0 Security Hotspots
0.0% Coverage on New Code
0.0% Duplication on New Code

See analysis details on SonarCloud

github-actions[bot] commented 2 months ago

[FE Incremental Coverage Report]

:white_check_mark: pass : 55 / 61 (90.16%)

file detail

| path | covered_line | new_line | coverage | not_covered_line_detail |
| --- | --- | --- | --- | --- |
| :large_blue_circle: com/starrocks/scheduler/mv/MVPCTRefreshListPartitioner.java | 1 | 3 | 33.33% | [280, 281] |
| :large_blue_circle: com/starrocks/scheduler/TableSnapshotInfo.java | 6 | 8 | 75.00% | [79, 82] |
| :large_blue_circle: com/starrocks/scheduler/PartitionBasedMvRefreshProcessor.java | 39 | 41 | 95.12% | [256, 259] |
| :large_blue_circle: com/starrocks/common/Config.java | 1 | 1 | 100.00% | [] |
| :large_blue_circle: com/starrocks/scheduler/mv/MVPCTRefreshRangePartitioner.java | 8 | 8 | 100.00% | [] |

github-actions[bot] commented 2 months ago

[BE Incremental Coverage Report]

:white_check_mark: pass : 0 / 0 (0%)

github-actions[bot] commented 2 months ago

@Mergifyio backport branch-3.3

github-actions[bot] commented 2 months ago

@Mergifyio backport branch-3.2

github-actions[bot] commented 2 months ago

@Mergifyio backport branch-3.1

mergify[bot] commented 2 months ago

backport branch-3.3

✅ Backports have been created

* [#47989 [Enhancement] optimize the performance of refreshExternalTable stage of MV refresh (backport #47809)](https://github.com/StarRocks/starrocks/pull/47989) has been created for branch `branch-3.3`

mergify[bot] commented 2 months ago

backport branch-3.2

✅ Backports have been created

* [#47990 [Enhancement] optimize the performance of refreshExternalTable stage of MV refresh (backport #47809)](https://github.com/StarRocks/starrocks/pull/47990) has been created for branch `branch-3.2` but encountered conflicts

mergify[bot] commented 2 months ago

backport branch-3.1

✅ Backports have been created

* [#47992 [Enhancement] optimize the performance of refreshExternalTable stage of MV refresh (backport #47809)](https://github.com/StarRocks/starrocks/pull/47992) has been created for branch `branch-3.1` but encountered conflicts