apache / pinot

Apache Pinot - A realtime distributed OLAP datastore
https://pinot.apache.org/
Apache License 2.0

Upsert small segment merger task in minions #14477

Open tibrewalpratik17 opened 4 days ago

tibrewalpratik17 commented 4 days ago

PR related to the PEP request: #14305

This PR adds a new minion task that merges small segments in an upsert table. Implementation details are in the design doc of the linked issue.
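
To make the idea of "merging small segments" concrete, here is a minimal sketch of how small segments could be greedily batched into merge groups. It is not the logic from this PR or the design doc: `SmallSegmentGrouper`, `SegmentInfo`, `smallSegmentThreshold`, and `maxDocsPerGroup` are hypothetical names introduced purely for illustration, and the sketch assumes all input segments belong to the same partition and are ordered by creation time.

```java
import java.util.ArrayList;
import java.util.List;

public class SmallSegmentGrouper {

  /** Hypothetical stand-in for segment metadata; this is NOT a Pinot class. */
  public static final class SegmentInfo {
    final String name;
    final long validDocs;

    public SegmentInfo(String name, long validDocs) {
      this.name = name;
      this.validDocs = validDocs;
    }
  }

  /**
   * Greedily packs small segments into groups whose total valid-doc count stays
   * within maxDocsPerGroup. Segments with validDocs >= smallSegmentThreshold are
   * skipped. Groups that end up with a single segment are dropped, because
   * rewriting one segment on its own would not reduce the segment count.
   */
  public static List<List<SegmentInfo>> group(
      List<SegmentInfo> segments, long smallSegmentThreshold, long maxDocsPerGroup) {
    List<List<SegmentInfo>> groups = new ArrayList<>();
    List<SegmentInfo> current = new ArrayList<>();
    long currentDocs = 0;

    for (SegmentInfo segment : segments) {
      if (segment.validDocs >= smallSegmentThreshold) {
        continue; // not a "small" segment, leave it untouched
      }
      // Close the current group if adding this segment would exceed the cap.
      if (!current.isEmpty() && currentDocs + segment.validDocs > maxDocsPerGroup) {
        if (current.size() > 1) {
          groups.add(current);
        }
        current = new ArrayList<>();
        currentDocs = 0;
      }
      current.add(segment);
      currentDocs += segment.validDocs;
    }
    if (current.size() > 1) {
      groups.add(current);
    }
    return groups;
  }
}
```

In this sketch each returned group would correspond to one merge unit, so a table with many tiny segments collapses into far fewer, larger ones. The actual task presumably also has to account for upsert-specific state such as which documents are still valid, which the sketch does not model.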

Test plan: Enabled this on one of the infinite-retention tables at Uber. The table had ~35k segments initially; after running this task for ~2 days, the count dropped to ~2k segments, and the curve flattens once it reaches ~2k. We are using the task's default configs, and the table generates ~500 segments daily. See the attached screenshot.

Screenshot 2024-11-21 at 4 28 22 PM

Few details:

codecov-commenter commented 4 days ago

Codecov Report

Attention: Patch coverage is 11.37931% with 257 lines in your changes missing coverage. Please review.

Project coverage is 63.67%. Comparing base (59551e4) to head (64cd7d6). Report is 1358 commits behind head on master.

| Files with missing lines | Patch % | Lines |
|---|---|---|
| ...tcompactmerge/UpsertCompactMergeTaskGenerator.java | 12.95% | 164 Missing and 4 partials :warning: |
| ...rtcompactmerge/UpsertCompactMergeTaskExecutor.java | 0.00% | 75 Missing :warning: |
| ...ctmerge/UpsertCompactMergeTaskExecutorFactory.java | 0.00% | 6 Missing :warning: |
| ...t/processing/framework/SegmentProcessorConfig.java | 44.44% | 4 Missing and 1 partial :warning: |
| ...UpsertCompactMergeTaskProgressObserverFactory.java | 0.00% | 2 Missing :warning: |
| .../org/apache/pinot/core/common/MinionConstants.java | 0.00% | 1 Missing :warning: |

Additional details and impacted files

```diff
@@             Coverage Diff              @@
##             master   #14477      +/-   ##
============================================
+ Coverage     61.75%   63.67%   +1.92%
- Complexity      207     1577    +1370
============================================
  Files          2436     2671     +235
  Lines        133233   146730   +13497
  Branches      20636    22508    +1872
============================================
+ Hits          82274    93434   +11160
- Misses        44911    46376    +1465
- Partials       6048     6920     +872
```

| [Flag](https://app.codecov.io/gh/apache/pinot/pull/14477/flags?src=pr&el=flags&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=apache) | Coverage Δ | |
|---|---|---|
| [custom-integration1](https://app.codecov.io/gh/apache/pinot/pull/14477/flags?src=pr&el=flag&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=apache) | `100.00% <ø> (+99.99%)` | :arrow_up: |
| [integration](https://app.codecov.io/gh/apache/pinot/pull/14477/flags?src=pr&el=flag&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=apache) | `100.00% <ø> (+99.99%)` | :arrow_up: |
| [integration1](https://app.codecov.io/gh/apache/pinot/pull/14477/flags?src=pr&el=flag&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=apache) | `100.00% <ø> (+99.99%)` | :arrow_up: |
| [integration2](https://app.codecov.io/gh/apache/pinot/pull/14477/flags?src=pr&el=flag&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=apache) | `0.00% <ø> (ø)` | |
| [java-11](https://app.codecov.io/gh/apache/pinot/pull/14477/flags?src=pr&el=flag&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=apache) | `63.65% <11.37%> (+1.94%)` | :arrow_up: |
| [java-21](https://app.codecov.io/gh/apache/pinot/pull/14477/flags?src=pr&el=flag&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=apache) | `63.53% <11.37%> (+1.91%)` | :arrow_up: |
| [skip-bytebuffers-false](https://app.codecov.io/gh/apache/pinot/pull/14477/flags?src=pr&el=flag&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=apache) | `63.67% <11.37%> (+1.92%)` | :arrow_up: |
| [skip-bytebuffers-true](https://app.codecov.io/gh/apache/pinot/pull/14477/flags?src=pr&el=flag&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=apache) | `63.51% <11.37%> (+35.78%)` | :arrow_up: |
| [temurin](https://app.codecov.io/gh/apache/pinot/pull/14477/flags?src=pr&el=flag&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=apache) | `63.67% <11.37%> (+1.92%)` | :arrow_up: |
| [unittests](https://app.codecov.io/gh/apache/pinot/pull/14477/flags?src=pr&el=flag&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=apache) | `63.67% <11.37%> (+1.92%)` | :arrow_up: |
| [unittests1](https://app.codecov.io/gh/apache/pinot/pull/14477/flags?src=pr&el=flag&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=apache) | `55.56% <57.14%> (+8.67%)` | :arrow_up: |
| [unittests2](https://app.codecov.io/gh/apache/pinot/pull/14477/flags?src=pr&el=flag&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=apache) | `34.01% <8.62%> (+6.28%)` | :arrow_up: |

Flags with carried forward coverage won't be shown. [Click here](https://docs.codecov.io/docs/carryforward-flags?utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=apache#carryforward-flags-in-the-pull-request-comment) to find out more.

tibrewalpratik17 commented 3 days ago

Marking it ready for review for early feedback!