provectus / kafka-ui

Open-Source Web UI for Apache Kafka Management
Apache License 2.0

Glue serde avro to json deserialization includes namespaces and union types #3237

Open Ronserruya opened 1 year ago

Ronserruya commented 1 year ago

Originally reported in #3224, split into a separate issue following the discussion in #3235.

When the Glue serde deserializes Avro to JSON, the output includes the record namespaces and, for union fields, the type name of the selected branch. This is the first time I've encountered this behaviour: neither the Python deserializer nor the one used in kafka-connect does this.

Example:

Original msg:

{"name": {"first": "ron", "last": "serruya", "full": "ron serruya"}, "ids1": [5,6], "ids2": ["abc", 123]}

schema used:

{
  "type": "record",
  "name": "generation",
  "namespace": "top_level",
  "fields": [
    {
      "name": "name",
      "type": [
        {
          "type": "record",
          "name": "name",
          "namespace": "top_level.generation",
          "fields": [
            {
              "name": "raw",
              "type": [
                "string",
                "null"
              ]
            },
            {
              "name": "first",
              "type": "string"
            },
            {
              "name": "full",
              "type": "string"
            },
            {
              "name": "last",
              "type": ["string"]
            }
          ]
        },
        "null"
      ]
    },
    {
      "name": "ids1",
      "type": {"type": "array", "items": "int"}
    },
    {
      "name": "ids2",
      "type": {"type": "array", "items": ["string", "int"]}
    }
  ]
}

Base64-encoded Avro msg (just the msg, without the Glue-related bytes at the start):

AAIGcm9uFnJvbiBzZXJydXlhAA5zZXJydXlhBAoMAAQABmFiYwL2AQA=
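For anyone who wants to check the payload independently of any serde, here is a minimal sketch that walks the bytes by hand using only the Python standard library. It is hard-wired to the schema above and is not how kafka-ui or the Glue serde decode messages; the names `AvroReader` and `decode_generation` are made up for this illustration. It shows that the binary form only stores a branch index for each union, so the type names in the JSON output come from the JSON encoder, not from the message.

```python
import base64

PAYLOAD = "AAIGcm9uFnJvbiBzZXJydXlhAA5zZXJydXlhBAoMAAQABmFiYwL2AQA="


class AvroReader:
    """Minimal reader for the few Avro binary primitives this message uses."""

    def __init__(self, data: bytes):
        self.data = data
        self.pos = 0

    def read_long(self) -> int:
        # Avro ints/longs are zigzag-encoded base-128 varints.
        shift = 0
        raw = 0
        while True:
            byte = self.data[self.pos]
            self.pos += 1
            raw |= (byte & 0x7F) << shift
            if not byte & 0x80:
                break
            shift += 7
        return (raw >> 1) ^ -(raw & 1)

    def read_string(self) -> str:
        length = self.read_long()
        text = self.data[self.pos:self.pos + length].decode("utf-8")
        self.pos += length
        return text

    def read_array(self, read_item) -> list:
        # Arrays are written as blocks; a zero count ends the array.
        items = []
        while True:
            count = self.read_long()
            if count == 0:
                return items
            if count < 0:
                count = -count
                self.read_long()  # block size in bytes, not needed here
            for _ in range(count):
                items.append(read_item())


def decode_generation(data: bytes) -> dict:
    """Decode one message written with the 'generation' schema from this issue."""
    r = AvroReader(data)

    # "name" is the union [record, "null"]; the branch index comes first.
    name = None
    if r.read_long() == 0:
        raw = r.read_string() if r.read_long() == 0 else None  # ["string", "null"]
        first = r.read_string()
        full = r.read_string()
        r.read_long()  # "last" is ["string"]; the index is written even for one branch
        last = r.read_string()
        name = {"raw": raw, "first": first, "full": full, "last": last}

    ids1 = r.read_array(r.read_long)

    def read_id():
        # Items are the union ["string", "int"].
        return r.read_string() if r.read_long() == 0 else r.read_long()

    ids2 = r.read_array(read_id)
    return {"name": name, "ids1": ids1, "ids2": ids2}


print(decode_generation(base64.b64decode(PAYLOAD)))
```

The decoded value matches the original msg, plus `raw` set to null because that field was not provided.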

The current glue deserializer shows this msg as:

{
  "name": {
    "top_level.generation.name": {
      "raw": null,
      "first": "ron",
      "full": "ron serruya",
      "last": {
        "string": "serruya"
      }
    }
  },
  "ids1": [
    5,
    6
  ],
  "ids2": [
    {
      "string": "abc"
    },
    {
      "int": 123
    }
  ]
}

As you can see, it wraps union values in an object keyed by the branch type (string, int) or, for records, by the fully qualified record name (top_level.generation.name).
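Until this is fixed in the serde, the wrappers can be removed client-side. The following is a rough sketch, assuming you have both the schema and the JSON as shown above; `strip_union_tags` and `branch_name` are hypothetical helper names, not part of kafka-ui or any Avro library. It walks the schema rather than guessing from the JSON, so a real record with a single field called "string" is not unwrapped by mistake. It does not resolve named types that are referenced by name elsewhere in the schema.

```python
def branch_name(schema, enclosing_ns):
    """Return the key the Avro JSON encoding uses for a union branch."""
    if isinstance(schema, str):
        return schema  # primitive type name
    kind = schema["type"]
    if kind in ("record", "enum", "fixed"):
        ns = schema.get("namespace", enclosing_ns)
        return f"{ns}.{schema['name']}" if ns else schema["name"]
    return kind  # "array" or "map"


def strip_union_tags(schema, value, ns=None):
    """Remove {"<type>": value} union wrappers, guided by the schema."""
    if isinstance(schema, list):  # a union
        if value is None:
            return None  # null is never wrapped
        if not isinstance(value, dict) or len(value) != 1:
            raise ValueError(f"expected a single-key union wrapper, got {value!r}")
        label, inner = next(iter(value.items()))
        for branch in schema:
            if branch_name(branch, ns) == label:
                return strip_union_tags(branch, inner, ns)
        raise ValueError(f"no union branch named {label!r}")
    if isinstance(schema, dict):
        kind = schema["type"]
        if kind == "record":
            ns = schema.get("namespace", ns)
            return {
                f["name"]: strip_union_tags(f["type"], value[f["name"]], ns)
                for f in schema["fields"]
            }
        if kind == "array":
            return [strip_union_tags(schema["items"], v, ns) for v in value]
        if kind == "map":
            return {k: strip_union_tags(schema["values"], v, ns) for k, v in value.items()}
    return value  # primitives pass through unchanged


# The schema from this issue, written compactly.
SCHEMA = {
    "type": "record", "name": "generation", "namespace": "top_level",
    "fields": [
        {"name": "name", "type": [
            {"type": "record", "name": "name", "namespace": "top_level.generation",
             "fields": [
                 {"name": "raw", "type": ["string", "null"]},
                 {"name": "first", "type": "string"},
                 {"name": "full", "type": "string"},
                 {"name": "last", "type": ["string"]},
             ]},
            "null",
        ]},
        {"name": "ids1", "type": {"type": "array", "items": "int"}},
        {"name": "ids2", "type": {"type": "array", "items": ["string", "int"]}},
    ],
}

# The JSON currently shown by the Glue deserializer.
TAGGED = {
    "name": {"top_level.generation.name": {
        "raw": None, "first": "ron", "full": "ron serruya",
        "last": {"string": "serruya"},
    }},
    "ids1": [5, 6],
    "ids2": [{"string": "abc"}, {"int": 123}],
}

print(strip_union_tags(SCHEMA, TAGGED))
```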

I fixed this locally by adding the line encoder.setIncludeNamespace(false); to the avroRecordToJson method.

But according to the comment in #3235, that's not a completely valid fix, since it can break other things?

Before and after the fix:

[Two screenshots: before and after the fix]

Haarolean commented 1 year ago

Hey, thanks, we'll take a look.

S1M0NM commented 1 year ago

This seems to also apply to the default SchemaRegistry Serde.

Is there a way I can fix this for the included SchemaRegistry serde?

frankgrimes97 commented 1 year ago

@Haarolean Any update on this? We're also being affected by this odd display behavior and would like to see it fixed.

Haarolean commented 1 year ago

@frankgrimes97 planned for 0.8

iliax commented 1 year ago

Reopening, since this is only fixed for the Kafka Schema Registry serde, not Glue.

frankgrimes97 commented 8 months ago

@Haarolean Any update on when we might see a fix and 0.8 release?

Haarolean commented 7 months ago

> @Haarolean Any update on when we might see a fix and 0.8 release?

@frankgrimes97 https://github.com/provectus/kafka-ui/discussions/4255 https://github.com/kafbat/kafka-ui/discussions/23