apache / pinot

Apache Pinot - A realtime distributed OLAP datastore
https://pinot.apache.org/
Apache License 2.0

Avoid re-stringifying strings in JSON_FORMAT function #13097

Closed yashmayya closed 2 weeks ago

yashmayya commented 3 weeks ago
codecov-commenter commented 3 weeks ago

Codecov Report

All modified and coverable lines are covered by tests :white_check_mark:

Project coverage is 62.16%. Comparing base (59551e4) to head (56f0747). Report is 413 commits behind head on master.

Additional details and impacted files

```diff
@@             Coverage Diff              @@
##             master   #13097      +/-   ##
============================================
+ Coverage     61.75%   62.16%   +0.41%
+ Complexity      207      198       -9
============================================
  Files          2436     2514      +78
  Lines        133233   137790    +4557
  Branches      20636    21319     +683
============================================
+ Hits          82274    85657    +3383
- Misses        44911    45739     +828
- Partials       6048     6394     +346
```

| [Flag](https://app.codecov.io/gh/apache/pinot/pull/13097/flags?src=pr&el=flags&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=apache) | Coverage Δ | |
|---|---|---|
| [custom-integration1](https://app.codecov.io/gh/apache/pinot/pull/13097/flags?src=pr&el=flag&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=apache) | `<0.01% <0.00%> (-0.01%)` | :arrow_down: |
| [integration](https://app.codecov.io/gh/apache/pinot/pull/13097/flags?src=pr&el=flag&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=apache) | `<0.01% <0.00%> (-0.01%)` | :arrow_down: |
| [integration1](https://app.codecov.io/gh/apache/pinot/pull/13097/flags?src=pr&el=flag&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=apache) | `<0.01% <0.00%> (-0.01%)` | :arrow_down: |
| [integration2](https://app.codecov.io/gh/apache/pinot/pull/13097/flags?src=pr&el=flag&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=apache) | `0.00% <0.00%> (ø)` | |
| [java-11](https://app.codecov.io/gh/apache/pinot/pull/13097/flags?src=pr&el=flag&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=apache) | `35.31% <100.00%> (-26.40%)` | :arrow_down: |
| [java-21](https://app.codecov.io/gh/apache/pinot/pull/13097/flags?src=pr&el=flag&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=apache) | `62.05% <100.00%> (+0.42%)` | :arrow_up: |
| [skip-bytebuffers-false](https://app.codecov.io/gh/apache/pinot/pull/13097/flags?src=pr&el=flag&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=apache) | `62.14% <100.00%> (+0.39%)` | :arrow_up: |
| [skip-bytebuffers-true](https://app.codecov.io/gh/apache/pinot/pull/13097/flags?src=pr&el=flag&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=apache) | `62.02% <100.00%> (+34.30%)` | :arrow_up: |
| [temurin](https://app.codecov.io/gh/apache/pinot/pull/13097/flags?src=pr&el=flag&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=apache) | `62.16% <100.00%> (+0.41%)` | :arrow_up: |
| [unittests](https://app.codecov.io/gh/apache/pinot/pull/13097/flags?src=pr&el=flag&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=apache) | `62.16% <100.00%> (+0.41%)` | :arrow_up: |
| [unittests1](https://app.codecov.io/gh/apache/pinot/pull/13097/flags?src=pr&el=flag&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=apache) | `46.84% <100.00%> (-0.05%)` | :arrow_down: |
| [unittests2](https://app.codecov.io/gh/apache/pinot/pull/13097/flags?src=pr&el=flag&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=apache) | `27.74% <0.00%> (+0.01%)` | :arrow_up: |

Flags with carried forward coverage won't be shown. [Click here](https://docs.codecov.io/docs/carryforward-flags?utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=apache#carryforward-flags-in-the-pull-request-comment) to find out more.


Jackie-Jiang commented 3 weeks ago

I don't fully follow the example in the description. If the value is already a JSON string, the user shouldn't call JSON_FORMAT again.

yashmayya commented 3 weeks ago

> I'm not sure if this is the correct behavior. JSON_FORMAT should be able to convert any object into JSON. If the value itself is abc, then the proper JSON version should be "abc".

That makes sense, I hadn't considered that.
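The point about a bare string can be illustrated with a minimal sketch using Python's standard `json` module. This is only an analogy for the behavior being discussed, not Pinot's implementation of JSON_FORMAT, and the helper name `to_json` is hypothetical.

```python
import json


def to_json(value):
    """Return the JSON text for a value (hypothetical stand-in, not Pinot code)."""
    return json.dumps(value)


# A bare string is not JSON text by itself; its JSON form carries quotes.
print(to_json("abc"))
# A structured value is rendered as JSON text in the usual way.
print(to_json({"a": 1}))
```

Under this model, quoting a plain string is the expected result of serialization rather than a bug.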

> If the user wants to generate another level of JSON over a JSON string, we should still allow that, and the new JSON is also valid.

Are there any valid use cases for that?
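As a sketch of what "another level of JSON over a JSON string" means, again using Python's `json` module as an assumed stand-in rather than Pinot's code: each additional encoding still yields valid JSON, but the result is a JSON string that needs one more decoding step to recover the original value. The helpers `encode_times` and `decode_times` are hypothetical names introduced for illustration.

```python
import json


def encode_times(value, n):
    """Apply JSON encoding n times in a row (hypothetical helper)."""
    for _ in range(n):
        value = json.dumps(value)
    return value


def decode_times(text, n):
    """Undo n levels of JSON encoding (hypothetical helper)."""
    for _ in range(n):
        text = json.loads(text)
    return text


original = {"k": "v"}
once = encode_times(original, 1)   # JSON text of the object
twice = encode_times(original, 2)  # JSON text of a string holding that text

# One decode of the doubly encoded value gives back a string, not the object.
assert isinstance(json.loads(twice), str)
# Two decodes are needed to get the original object back.
assert decode_times(twice, 2) == original
```

The doubly encoded value is valid JSON, but a consumer that decodes only once sees a string instead of an object, which is the kind of surprise the thread is debating.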

> I don't fully follow the example in the description. If the value is already a JSON string, the user shouldn't call JSON_FORMAT again.

Yeah, I agree in principle but this leads to the confusing behavior that I tried to document in the PR description. Let me try again below.


Scenario 1

Scenario 2


Is this working as expected and documented somewhere? Or should we solve this issue in a different way to what this PR was attempting?

Jackie-Jiang commented 3 weeks ago

In the given example, the problem actually comes from the recursive call `data = JSON_FORMAT(data)`. I don't think Pinot allows this (though I may be wrong). It should work as expected if we configure the ingestion transform to be: