Aiven-Open / karapace

Karapace - Your Apache Kafka® essentials in one tool
https://karapace.io
Apache License 2.0

Regression in record namespace parsing between 2.1.3 and 3.0.0 #422

Closed perj closed 1 year ago

perj commented 2 years ago

What happened?

Our staging cluster in Aiven was upgraded to Karapace 3.1.2 and we suddenly started getting errors from the schema registry. We believe the previous version was 2.1.3, partly because of the steps to reproduce below.

We're using gogen-avro to compile Avro schemas to code. Part of its normalization is that it removes the "namespace" field from the schemas and instead puts the full names in the "name" field. This works up to Karapace 2.1.3 but no longer works in 3.0.0.

Steps to reproduce

cat > test1.avsc <<EOT
{
  "type": "record",
  "name": "baz",
  "namespace": "foo.bar",
  "fields": [
    {
      "name": "dummy",
      "type": "string"
    }
  ]
}
EOT

cat > test2.avsc <<EOT
{
  "type": "record",
  "name": "foo.bar.baz",
  "fields": [
    {
      "name": "dummy",
      "type": "string"
    }
  ]
}
EOT

sed -i -e 's/:develop/:$KARAPACE_VERSION/' container/docker-compose.yml

KARAPACE_VERSION=2.1.3 docker-compose -f container/docker-compose.yml up -d

jq -sR '{"schema":.}' < test1.avsc | curl -s -H "Content-Type: application/vnd.schemaregistry.v1+json" --data @/dev/stdin http://localhost:8081/subjects/test-value/versions|jq .
# ^- prints {"id":1}
jq -sR '{"schema":.}' < test2.avsc | curl -s -H "Content-Type: application/vnd.schemaregistry.v1+json" --data @/dev/stdin http://localhost:8081/subjects/test-value/versions|jq .
# ^- prints {"id":1}

KARAPACE_VERSION=2.1.3 docker-compose -f container/docker-compose.yml down

KARAPACE_VERSION=3.0.0 docker-compose -f container/docker-compose.yml up -d

jq -sR '{"schema":.}' < test1.avsc | curl -s -H "Content-Type: application/vnd.schemaregistry.v1+json" --data @/dev/stdin http://localhost:8081/subjects/test-value/versions|jq .
# ^- prints {"id":1}

jq -sR '{"schema":.}' < test2.avsc | curl -s -H "Content-Type: application/vnd.schemaregistry.v1+json" --data @/dev/stdin http://localhost:8081/subjects/test-value/versions|jq .
# ^- prints {"error_code":409,"message":"Incompatible schema, compatibility_mode=FULL expected: foo.bar.baz"}

What did you expect to happen?

I expect "name": "foo.bar.baz" to be fully compatible with "namespace": "foo.bar", "name": "baz".
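
The Avro specification's name rule makes this expectation concrete: a "name" that contains a dot is taken as the fullname and any "namespace" attribute is ignored; otherwise the fullname is the namespace and the name joined by a dot. Below is a minimal Python sketch of that rule. The `fullname` helper is hypothetical, written only for illustration (it is not Karapace or Avro library code), and it assumes a top-level record with no enclosing namespace.

```python
import json


def fullname(schema: dict) -> str:
    """Resolve the Avro fullname of a top-level named schema (illustrative helper)."""
    name = schema["name"]
    namespace = schema.get("namespace", "")
    # A dotted name is already a fullname, so the namespace attribute is ignored.
    # A name without a namespace is likewise returned as-is.
    if "." in name or not namespace:
        return name
    return f"{namespace}.{name}"


# The two schemas from the steps to reproduce.
test1 = json.loads(
    '{"type": "record", "name": "baz", "namespace": "foo.bar",'
    ' "fields": [{"name": "dummy", "type": "string"}]}'
)
test2 = json.loads(
    '{"type": "record", "name": "foo.bar.baz",'
    ' "fields": [{"name": "dummy", "type": "string"}]}'
)

print(fullname(test1))                     # foo.bar.baz
print(fullname(test2))                     # foo.bar.baz
print(fullname(test1) == fullname(test2))  # True
```

Under this rule both files name the same record, which is why the second upload would be expected to resolve to the existing schema rather than fail the compatibility check.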

What else do we need to know?

Karapace stores whichever schema was uploaded first. In version 2.1.3 they still map to the same id though.

I believe Confluent Schema Registry always normalizes to the form matching test1.avsc above. This is part of why we ran into this problem, since the uploaded version was changed server-side while we were still running CSR.
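
For illustration, rewriting a schema into the test1.avsc form amounts to splitting a dotted name at its last dot. The sketch below is an assumption about what such a normalization looks like, not code taken from Confluent Schema Registry or Karapace; `to_namespace_form` is a hypothetical helper and only handles the top-level record.

```python
def to_namespace_form(schema: dict) -> dict:
    """Return a copy of the schema with a dotted name split into namespace + short name."""
    name = schema["name"]
    if "." not in name:
        # Already in the short-name form; return an unchanged copy.
        return dict(schema)
    # Everything before the last dot is the namespace, the rest is the short name.
    namespace, _, short = name.rpartition(".")
    return {**schema, "name": short, "namespace": namespace}


# Same content as test1.avsc and test2.avsc from the steps to reproduce.
test1 = {
    "type": "record",
    "name": "baz",
    "namespace": "foo.bar",
    "fields": [{"name": "dummy", "type": "string"}],
}
test2 = {
    "type": "record",
    "name": "foo.bar.baz",
    "fields": [{"name": "dummy", "type": "string"}],
}

print(to_namespace_form(test2) == test1)  # True
print(to_namespace_form(test1) == test1)  # True
```

With a normalization like this applied server-side, a client that uploads the test2.avsc form would later read back the test1.avsc form.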

jjaakola-aiven commented 2 years ago

Fix PR is in Avro, https://github.com/apache/avro/pull/1843

ahvargas commented 2 years ago

This one also affects other libraries, e.g. avsc.

perj commented 2 years ago

I know this might be annoying, but I was wondering if there's any chance of a timeline for a Karapace version containing the fix? Our cluster will auto-update on Oct 4th and if that update contains the bug we'll have to start scrambling for a workaround very soon.

jjaakola-aiven commented 2 years ago

@perj The maintenance notice should be available in Aiven Console, along with the option to upgrade to a version with patched Avro.

perj commented 2 years ago

> @perj The maintenance notice should be available in Aiven Console, along with the option to upgrade to a version with patched Avro.

Yes, that worked. Thanks a lot!

jjaakola-aiven commented 1 year ago

Closing this, as Karapace uses Avro from the Aiven fork of the 1.11 branch, which includes the fix.