redpanda-data / redpanda

Redpanda is a streaming data platform for developers. Kafka API compatible. 10x faster. No ZooKeeper. No JVM!
https://redpanda.com

Schema registry creates multiple schema versions that are identical #16213

Open erikvullings opened 9 months ago

erikvullings commented 9 months ago

Version & Environment

Docker images v23.2 and v23.3, at least. The error did not occur in v22.3.25. The startup service is described below. Running in WSL2 on Ubuntu 22.04, as part of a larger docker-compose file.

services:
  redpanda-0:
    command:
      - redpanda
      - start
      - --kafka-addr internal://0.0.0.0:9092,external://0.0.0.0:${BROKER_PORT}
      # Address the broker advertises to clients that connect to the Kafka API.
      # Use the internal addresses to connect to the Redpanda brokers
      # from inside the same Docker network.
      # Use the external addresses to connect to the Redpanda brokers
      # from outside the Docker network.
      - --advertise-kafka-addr internal://redpanda-0:9092,external://localhost:${BROKER_PORT}
      - --pandaproxy-addr internal://0.0.0.0:8082,external://0.0.0.0:18082
      # Address the broker advertises to clients that connect to the HTTP Proxy.
      - --advertise-pandaproxy-addr internal://redpanda-0:8082,external://localhost:18082
      - --schema-registry-addr internal://0.0.0.0:8081,external://0.0.0.0:${SCHEMA_PORT}
      # Redpanda brokers use the RPC API to communicate with each other internally.
      - --rpc-addr redpanda-0:33145
      - --advertise-rpc-addr redpanda-0:33145
      # - --group_min_session_timeout_ms 6000
      # - --group_max_session_timeout_ms 300000
      # Tells Seastar (the framework Redpanda uses under the hood) to use 1 core on the system.
      - --smp 1
      # The amount of memory to make available to Redpanda.
      - --memory 1G
      # Mode dev-container uses well-known configuration properties for development in containers.
      - --mode dev-container
      # Enable debug-level logging.
      - --default-log-level=debug
      # cluster level retention time in ms (90 days, default is 7 days or 604800000):
      - --set redpanda.kafka_batch_max_bytes=${MESSAGE_MAX_BYTES}
      - --set redpanda.kafka_request_max_bytes=${MESSAGE_MAX_BYTES}
      - --set redpanda.log_cleanup_policy=compact,delete
    image: docker.redpanda.com/redpandadata/redpanda:v22.3.25
    volumes:
      - redpanda:/var/lib/redpanda/data
    networks:
      - redpanda_network
    ports:
      - ${BROKER_PORT}:${BROKER_PORT}
      - ${SCHEMA_PORT}:${SCHEMA_PORT}

Docker info

Client:
 Version:    24.0.5
 Context:    default
 Debug Mode: false

Server:
 Containers: 31
  Running: 30
  Paused: 0
  Stopped: 1
 Images: 104
 Server Version: 24.0.5
 Storage Driver: overlay2
  Backing Filesystem: extfs
  Supports d_type: true
  Using metacopy: false
  Native Overlay Diff: true
  userxattr: false
 Logging Driver: json-file
 Cgroup Driver: cgroupfs
 Cgroup Version: 1
 Plugins:
  Volume: local
  Network: bridge host ipvlan macvlan null overlay
  Log: awslogs fluentd gcplogs gelf journald json-file local logentries splunk syslog
 Swarm: inactive
 Runtimes: io.containerd.runc.v2 nvidia runc
 Default Runtime: runc
 Init Binary: docker-init
 containerd version: 
 runc version: 
 init version: 
 Security Options:
  seccomp
   Profile: builtin
 Kernel Version: 5.15.133.1-microsoft-standard-WSL2
 Operating System: Ubuntu 22.04.3 LTS
 OSType: linux
 Architecture: x86_64
 CPUs: 16
 Total Memory: 62.67GiB
 Name: WS-91024
 ID: 9cfb3da8-9046-40a1-aef3-9e6d9646c1c9
 Docker Root Dir: /var/lib/docker
 Debug Mode: false
 Username: erikvullings
 Experimental: false
 Insecure Registries:
  127.0.0.0/8
 Live Restore Enabled: false

WARNING: No blkio throttle.read_bps_device support
WARNING: No blkio throttle.write_bps_device support
WARNING: No blkio throttle.read_iops_device support
WARNING: No blkio throttle.write_iops_device support

Using the latest version of the Python Confluent client.

What went wrong?

My Python services register to certain topics. When registering, the Confluent client apparently also registers the topic's schema with the schema registry, which is admittedly a bit strange, as the Avro schema was just obtained from it. In v22 and earlier this caused no issues: the schema was identical, so nothing changed. In v23, however, the schema is treated as updated and a new version (v2) is assigned. The schema now has two versions, with some clients using v1 and others v2, which leads to warnings (as some services are using the old schema). I downloaded both versions from the schema registry as JSON and there is no difference between them, so they should share the same version.
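
The "no difference" check described above could be made explicit by fetching both versions and comparing them twice: once as raw text and once as parsed JSON. This is only a minimal sketch, not something taken from the report. The registry URL `http://localhost:8081` and the subject name `my-topic-value` are assumptions; the endpoint path follows the Confluent-compatible Schema Registry REST API.

```python
import json
import urllib.parse
import urllib.request

REGISTRY_URL = "http://localhost:8081"  # assumption: adjust to your SCHEMA_PORT
SUBJECT = "my-topic-value"              # hypothetical subject name


def fetch_schema(subject: str, version: int, base_url: str = REGISTRY_URL) -> str:
    """Return the raw schema string stored for one version of a subject."""
    quoted = urllib.parse.quote(subject, safe="")
    url = f"{base_url}/subjects/{quoted}/versions/{version}"
    with urllib.request.urlopen(url) as resp:
        return json.load(resp)["schema"]


def compare_schemas(raw_a: str, raw_b: str) -> dict:
    """Compare two schema strings as raw text and as parsed JSON."""
    return {
        "identical_text": raw_a == raw_b,
        "identical_json": json.loads(raw_a) == json.loads(raw_b),
    }
```

Calling `compare_schemas(fetch_schema(SUBJECT, 1), fetch_schema(SUBJECT, 2))` against a running registry returns both results. If the parsed JSON matches but the raw text does not, the two versions differ only in formatting (whitespace or key order), which may be worth including in the report.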

What should have happened instead?

Schemas that are uploaded to the schema registry again and are identical to the existing one should not create a new version.

How to reproduce the issue?

It occurs in my stack consistently, but I cannot share it with you, unfortunately.
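
A standalone reproduction might look roughly like the sketch below: register the same schema twice for one subject, then list the subject's versions. Everything specific here is an assumption: `EXAMPLE_SCHEMA` is a hypothetical stand-in (the real schemas were not shared), and it is not known whether such a trivial schema triggers the behavior. The `http` parameter is only there so the logic can be exercised without a live registry.

```python
import json
import urllib.parse
import urllib.request
from typing import Callable, Optional

# Hypothetical example schema; the schemas from the report were not shared.
EXAMPLE_SCHEMA = {
    "type": "record",
    "name": "Example",
    "fields": [{"name": "id", "type": "string"}],
}


def http_json(method: str, url: str, body: Optional[dict] = None):
    """Send a JSON request to the registry and return the decoded reply."""
    data = json.dumps(body).encode() if body is not None else None
    req = urllib.request.Request(url, data=data, method=method)
    req.add_header("Content-Type", "application/vnd.schemaregistry.v1+json")
    with urllib.request.urlopen(req) as resp:
        return json.load(resp)


def register_twice(
    base_url: str,
    subject: str,
    schema: dict,
    http: Callable = http_json,
) -> list:
    """Register the same schema twice, then return the subject's version list."""
    quoted = urllib.parse.quote(subject, safe="")
    url = f"{base_url}/subjects/{quoted}/versions"
    payload = {"schema": json.dumps(schema)}
    http("POST", url, payload)  # first registration
    http("POST", url, payload)  # identical re-registration
    return http("GET", url, None)
```

For example, `register_twice("http://localhost:8081", "demo-value", EXAMPLE_SCHEMA)` would be expected to return `[1]` on a registry that deduplicates identical schemas; the behavior reported here would show up as `[1, 2]`.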

JIRA Link: CORE-1731

github-actions[bot] commented 2 weeks ago

This issue hasn't seen activity in 3 months. If you want to keep it open, post a comment or remove the stale label – otherwise this will be closed in two weeks.

erikvullings commented 2 weeks ago

Any updates on this?

rockwotj commented 1 week ago

@erikvullings this seems to be a regression with schema compatibility checks. Are you able to share your schemas?