apache / pinot

Apache Pinot - A realtime distributed OLAP datastore
https://pinot.apache.org/
Apache License 2.0
5.2k stars 1.21k forks source link

Custom configuration property reader for segment metadata files #12440

Open abhioncbr opened 3 months ago

abhioncbr commented 3 months ago

This is the first set of changes for the issue.

In this PR, we introduce a new custom property reader for the segment metadata files. This custom property reader is based on the commons-configuration2 custom property reader/writer(here is the link). We are introducing this custom property reader to avoid unescaping operation happen for the key in default implementation.

cc: @klsince @Jackie-Jiang @itschrispeck

codecov-commenter commented 3 months ago

Codecov Report

Attention: Patch coverage is 86.36364% with 9 lines in your changes are missing coverage. Please review.

Project coverage is 62.16%. Comparing base (59551e4) to head (fbd5450). Report is 522 commits behind head on master.

Files Patch % Lines
...pache/pinot/spi/env/CommonsConfigurationUtils.java 85.71% 4 Missing and 1 partial :warning:
...rg/apache/pinot/spi/env/PropertyIOFactoryKind.java 70.00% 3 Missing :warning:
.../apache/pinot/spi/env/VersionedPropertyReader.java 91.66% 0 Missing and 1 partial :warning:
Additional details and impacted files ```diff @@ Coverage Diff @@ ## master #12440 +/- ## ============================================ + Coverage 61.75% 62.16% +0.41% + Complexity 207 198 -9 ============================================ Files 2436 2540 +104 Lines 133233 139455 +6222 Branches 20636 21554 +918 ============================================ + Hits 82274 86697 +4423 - Misses 44911 46265 +1354 - Partials 6048 6493 +445 ``` | [Flag](https://app.codecov.io/gh/apache/pinot/pull/12440/flags?src=pr&el=flags&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=apache) | Coverage Δ | | |---|---|---| | [custom-integration1](https://app.codecov.io/gh/apache/pinot/pull/12440/flags?src=pr&el=flag&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=apache) | `<0.01% <0.00%> (-0.01%)` | :arrow_down: | | [integration](https://app.codecov.io/gh/apache/pinot/pull/12440/flags?src=pr&el=flag&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=apache) | `<0.01% <0.00%> (-0.01%)` | :arrow_down: | | [integration1](https://app.codecov.io/gh/apache/pinot/pull/12440/flags?src=pr&el=flag&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=apache) | `<0.01% <0.00%> (-0.01%)` | :arrow_down: | | [integration2](https://app.codecov.io/gh/apache/pinot/pull/12440/flags?src=pr&el=flag&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=apache) | `0.00% <0.00%> (ø)` | | | [java-11](https://app.codecov.io/gh/apache/pinot/pull/12440/flags?src=pr&el=flag&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=apache) | `62.12% <86.36%> (+0.41%)` | :arrow_up: | | [java-21](https://app.codecov.io/gh/apache/pinot/pull/12440/flags?src=pr&el=flag&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=apache) | `62.05% <86.36%> (+0.42%)` | :arrow_up: | | [skip-bytebuffers-false](https://app.codecov.io/gh/apache/pinot/pull/12440/flags?src=pr&el=flag&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=apache) | `62.15% <86.36%> (+0.41%)` | :arrow_up: | | [skip-bytebuffers-true](https://app.codecov.io/gh/apache/pinot/pull/12440/flags?src=pr&el=flag&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=apache) | `62.02% <86.36%> (+34.29%)` | :arrow_up: | | [temurin](https://app.codecov.io/gh/apache/pinot/pull/12440/flags?src=pr&el=flag&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=apache) | `62.16% <86.36%> (+0.41%)` | :arrow_up: | | [unittests](https://app.codecov.io/gh/apache/pinot/pull/12440/flags?src=pr&el=flag&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=apache) | `62.16% <86.36%> (+0.41%)` | :arrow_up: | | [unittests1](https://app.codecov.io/gh/apache/pinot/pull/12440/flags?src=pr&el=flag&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=apache) | `46.75% <86.36%> (-0.14%)` | :arrow_down: | | [unittests2](https://app.codecov.io/gh/apache/pinot/pull/12440/flags?src=pr&el=flag&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=apache) | `27.72% <0.00%> (-0.01%)` | :arrow_down: | Flags with carried forward coverage won't be shown. [Click here](https://docs.codecov.io/docs/carryforward-flags?utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=apache#carryforward-flags-in-the-pull-request-comment) to find out more.

:umbrella: View full report in Codecov by Sentry.
:loudspeaker: Have feedback on the report? Share it here.

abhioncbr commented 2 months ago

Can we add a test to ensure it can read the old format properly? We can put a old format file into the resource folder, and use the new reader to read it

Added the required tests.

abhioncbr commented 2 months ago

@Jackie-Jiang / @klsince, any further comments on the implementation side. I am going to start work on it's usage based on the cluster configuration or table configuration.

abhioncbr commented 2 months ago

Based on the offline discussion, we decided to have pair of separate custom segment metadata reader/writer. We will decide to use custom reader/writer based on the version of the segment metadata.

abhioncbr commented 1 month ago

@Jackie-Jiang / @klsince, please review it once more. Thanks