redpanda-data / redpanda


Can't create schema with union of null and record #5291

Open dnahurnyi opened 2 years ago

dnahurnyi commented 2 years ago

Version & Environment

Redpanda version: v21.11.2 and v22.1.4. Redpanda was started as a Docker container using docker-compose to run integration tests.

What went wrong?

When I run a venom test against a consumer of a topic, I use the following schema and message files. schema.avsc:

{
    "name": "SomeObject",
    "type": "record",
    "namespace": "some.namespace",
    "fields": [
        {
            "name": "after",
            "type": [
                "null",
                {
                    "name": "Value",
                    "type": "record",
                    "fields": [
                        {
                            "name": "id",
                            "type": "string"
                        },
                        {
                            "name": "account_id",
                            "type": "string"
                        }
                    ]
                }
            ],
            "default": null
        },
        {
            "name": "op",
            "type": "string"
        }
    ]
}

message.json:

{
  "after": {
    "id": "23",
    "account_id": "23"
  },
  "op": "c"
}

The test fails with this message:

Testcase "Doesn't matter", step #2: Assertion "result.err ShouldBeEmpty" failed. expected 'can't convert value 2 avro with schema: failed to convert value {
  "after": {
    "id": "23",
    "account_id": "23"
  },
  "op": "c"
} 2 native Avro: cannot decode textual record "some.namespace.SomeObject": cannot decode textual union: cannot decode textual map: cannot determine codec: "id" for key: "after"' to be empty but it wasn't
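
A possible reading of the error above, offered as an assumption rather than a confirmed diagnosis: in the JSON encoding defined by the Avro specification, a non-null union value is written as a single-key object whose key names the chosen branch. message.json puts the record fields directly under "after", so a textual decoder that follows that rule would treat "id" as a branch name, which matches the 'cannot determine codec: "id"' part of the message. The sketch below wraps the value accordingly; the helper wrap_union is hypothetical, and the branch name "some.namespace.Value" assumes the nested record inherits the enclosing namespace.

```python
import json


def wrap_union(value, branch_name):
    """Wrap a non-null union value as {branch_name: value}; null stays null."""
    if value is None:
        return None
    return {branch_name: value}


# The message from message.json, unchanged.
message = {
    "after": {"id": "23", "account_id": "23"},
    "op": "c",
}

# Assumption: the full name of the nested record is "some.namespace.Value".
wrapped = dict(message, after=wrap_union(message["after"], "some.namespace.Value"))

print(json.dumps(wrapped, indent=2))
```

Whether venom expects this wrapped form is not confirmed in this thread; the sketch only shows what the union encoding would look like.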

Also, when I query the schema registry in Redpanda, I see no data. Query:

curl --retry 30 --retry-connrefused --retry-max-time 5 --connect-timeout 10 -s http://localhost:8081/subjects/

What should have happened instead?

The schema should appear in Redpanda, and Avro should be able to convert the message using the schema.

How to reproduce the issue?

  1. Start Redpanda
  2. Run venom tests, example: https://github.com/ovh/venom/blob/master/executors/kafka/README.md
  3. Observe the failure described above.

Additional information

Also, when I change the type of the field after from a union to just the record (in which case the object must always be present), everything works fine. In that case, I can also see the new schema using the query mentioned at the end of the "What went wrong?" section. So my guess is that Redpanda doesn't handle unions of null and record.

JIRA Link: CORE-960

BenPope commented 1 year ago

This is a strange one. I tried:

docker run -it -p 8081:8081 vectorized/redpanda:v22.1.4 redpanda start --cpu 1

And then:

curl -X POST "http://127.0.0.1:8081/subjects/test7/versions?normalize=true" -H 'Content-type: application/vnd.schemaregistry.v1+json' -d '{"schemaType":"AVRO","schema":"{\"name\":\"SomeObject\",\"type\":\"record\",\"namespace\":\"some.namespace\",\"fields\":[{\"name\":\"after\",\"type\":[\"null\",{\"name\":\"Value\",\"type\":\"record\",\"fields\":[{\"name\":\"id\",\"type\":\"string\"},{\"name\":\"account_id\",\"type\":\"string\"}]}],\"default\":null},{\"name\":\"op\",\"type\":\"string\"}]}"}'
{"id":1}
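
The hand-escaped schema string in the curl command above is easy to get wrong, so here is a minimal Python sketch that builds the same kind of request body by serializing the schema twice: once into a string, then again as the "schema" field of the payload. The helper names build_payload and register are hypothetical; the URL path and Content-Type header are taken from the curl command, and the normalize query parameter is left out for brevity.

```python
import json
import urllib.request

# The schema from schema.avsc in the report, as a Python dict.
SCHEMA = {
    "name": "SomeObject",
    "type": "record",
    "namespace": "some.namespace",
    "fields": [
        {
            "name": "after",
            "type": [
                "null",
                {
                    "name": "Value",
                    "type": "record",
                    "fields": [
                        {"name": "id", "type": "string"},
                        {"name": "account_id", "type": "string"},
                    ],
                },
            ],
            "default": None,
        },
        {"name": "op", "type": "string"},
    ],
}


def build_payload(schema):
    """Return the request body: the schema is embedded as a JSON-encoded string."""
    body = {"schemaType": "AVRO", "schema": json.dumps(schema)}
    return json.dumps(body).encode("utf-8")


def register(base_url, subject, schema):
    """POST the schema to /subjects/<subject>/versions and return the parsed reply."""
    request = urllib.request.Request(
        f"{base_url}/subjects/{subject}/versions",
        data=build_payload(schema),
        headers={"Content-Type": "application/vnd.schemaregistry.v1+json"},
        method="POST",
    )
    with urllib.request.urlopen(request, timeout=10) as response:
        return json.load(response)


# Example call against a locally running registry (not executed here):
# register("http://127.0.0.1:8081", "test7", SCHEMA)
```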

And then:

curl -s "http://127.0.0.1:8081/subjects"
["test7"]

It seemed to work.

I'll dig into how venom works.