StarRocks / starrocks

The world's fastest open query engine for sub-second analytics both on and off the data lakehouse. With the flexibility to support nearly any scenario, StarRocks provides best-in-class performance for multi-dimensional analytics, real-time analytics, and ad-hoc queries. A Linux Foundation project.
https://starrocks.io
Apache License 2.0
9.03k stars, 1.82k forks

[Enhancement] Ignore union type tag when converting avro to json (backport #52973) #53100

Closed. mergify[bot] closed this pull request 19 hours ago

mergify[bot] commented 19 hours ago

Why I'm doing:

Schema:

{
    "type": "record",
    "name": "User",
    "fields": [
        {"name": "id", "type": "int"},
        {"name": "name", "type": "string"},
        {
            "name": "email",
            "type": [
                "null",
                {
                    "type": "record",
                    "name": "email2",
                    "fields": [
                        {"name": "x", "type": ["null", "int"]},
                        {"name": "y", "type": ["null", "string"]}
                    ]
                }
            ]
        }
    ]
}

With this schema, Avro's avro_value_to_json wraps every non-null union value in its type tag, so the result is: {"id": 1, "name": "Alice", "email": {"email2": {"x": {"int": 1}, "y": {"string": "alice@example.com"}}}}

What I'm doing:

Add a new function that converts Avro values to JSON strings while ignoring union type tags. For the example above, the result becomes: {"id":1,"name":"Alice","email":{"x":1,"y":"alice@example.com"}}

Add a new config, avro_ignore_union_type_tag, and modify the existing functions to choose between the original conversion and the new one based on this config.
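The effect of dropping union type tags can be illustrated with a minimal Python sketch. This is not the code from this PR (the actual change lives in the StarRocks backend); it is a schema-driven post-processing step under stated assumptions. The names strip_union_tags and branch_name are hypothetical, named types are matched by their bare name without namespace handling, and references to previously defined named types are not resolved.

```python
import json

def branch_name(schema):
    """Tag that Avro's JSON encoding uses for a union branch (simplified:
    primitives by type name, named types by bare name, no namespaces)."""
    if isinstance(schema, str):
        return schema
    return schema.get("name", schema["type"])

def strip_union_tags(value, schema):
    """Hypothetical helper: walk an Avro-JSON value alongside its schema
    and unwrap every {"<branch>": inner} union wrapper."""
    if isinstance(schema, list):  # a union is written as a JSON array of branches
        if value is None:
            return None  # the null branch is encoded as a bare null, no tag
        tag, inner = next(iter(value.items()))
        for branch in schema:
            if branch_name(branch) == tag:
                return strip_union_tags(inner, branch)
        raise ValueError(f"no union branch matches tag {tag!r}")
    if isinstance(schema, dict):
        kind = schema["type"]
        if kind == "record":
            return {f["name"]: strip_union_tags(value[f["name"]], f["type"])
                    for f in schema["fields"]}
        if kind == "array":
            return [strip_union_tags(v, schema["items"]) for v in value]
        if kind == "map":
            return {k: strip_union_tags(v, schema["values"])
                    for k, v in value.items()}
    return value  # primitives, enums, and fixed values pass through unchanged

# Schema and tagged value taken from the PR description above.
SCHEMA = {
    "type": "record", "name": "User",
    "fields": [
        {"name": "id", "type": "int"},
        {"name": "name", "type": "string"},
        {"name": "email", "type": ["null", {
            "type": "record", "name": "email2",
            "fields": [
                {"name": "x", "type": ["null", "int"]},
                {"name": "y", "type": ["null", "string"]},
            ]}]},
    ],
}
tagged = {"id": 1, "name": "Alice",
          "email": {"email2": {"x": {"int": 1},
                               "y": {"string": "alice@example.com"}}}}

untagged = strip_union_tags(tagged, SCHEMA)
print(json.dumps(untagged, separators=(",", ":")))
```

Run on the tagged value from the description, the sketch prints the same untagged JSON shown above. Walking the schema, rather than unwrapping any single-key object, avoids mistaking a genuine one-field record or map for a union wrapper.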

Fixes #issue

What type of PR is this:

Does this PR entail a change in behavior?

If yes, please specify the type of change:

Checklist:

Bugfix cherry-pick branch check:
