Closed minionOfZuul closed 7 hours ago
Hi @minionOfZuul, thank you very much for your PR (#432). I agree that it should be possible to use the canonical representation of schemas to detect changes.
However, this cannot be the default behavior, as it would be a breaking change with respect to previous versions.
In addition, the simple string comparison currently in use is also the strategy used by the Confluent Schema Registry (see: https://github.com/confluentinc/schema-registry/issues/1698). One of the main reasons is that an Avro schema can be enriched with arbitrary custom properties, which the canonical form excludes. For example, some projects include metadata such as the version, the owner, or tags directly in the schema.
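To illustrate the point about custom properties, here is a minimal sketch in Python. It is not Jikkou or Avro code: `strip_custom_properties` and the two schemas are hypothetical, and the function covers only the attribute-stripping step of canonicalization.

```python
# Attributes kept by this sketch. The Avro spec's STRIP rule keeps type, name,
# fields, symbols, items, values and size; "namespace" is also kept here
# because this sketch skips the fullname-resolution step.
KEPT = {"type", "name", "namespace", "fields", "symbols", "items", "values", "size"}


def strip_custom_properties(node):
    """Recursively drop every attribute that is not in KEPT."""
    if isinstance(node, dict):
        return {k: strip_custom_properties(v) for k, v in node.items() if k in KEPT}
    if isinstance(node, list):
        return [strip_custom_properties(v) for v in node]
    return node


# Hypothetical schemas that differ only in custom metadata.
v1 = {
    "type": "record",
    "name": "User",
    "owner": "team-a",
    "version": "1.0",
    "fields": [{"name": "id", "type": "long", "doc": "primary key"}],
}
v2 = dict(v1, owner="team-b", version="1.1")

raw_differs = v1 != v2
stripped_differs = strip_custom_properties(v1) != strip_custom_properties(v2)
print(raw_differs, stripped_differs)  # True False
```

A comparison based only on the stripped form would therefore miss a change of owner or version, which is why projects that keep metadata in the schema may prefer the string comparison.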
I would propose updating your PR to support both strategies. To do this, we could use a dedicated new annotation (e.g., schemaregistry.jikkou.io/use-canonical-fingerprint=true), with the default value being false if no annotation is present, so as not to introduce any breaking change. That way, users can switch from one strategy to the other depending on their project's needs.
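As a rough sketch of how such a switch could behave (in Python for brevity, not Jikkou's actual code): the annotation key is the one proposed above, while `schema_changed`, `use_canonical_fingerprint`, and `toy_fingerprint` are hypothetical names. The toy fingerprint only normalizes JSON whitespace and key order and stands in for a real Avro fingerprint.

```python
import json
from typing import Callable, Mapping

ANNOTATION = "schemaregistry.jikkou.io/use-canonical-fingerprint"


def use_canonical_fingerprint(annotations: Mapping[str, str]) -> bool:
    """The annotation is opt-in: absent or anything but 'true' means False."""
    return str(annotations.get(ANNOTATION, "false")).strip().lower() == "true"


def schema_changed(expected: str, actual: str,
                   annotations: Mapping[str, str],
                   fingerprint: Callable[[str], str]) -> bool:
    """Compare by fingerprint when opted in, else by raw string (the default)."""
    if use_canonical_fingerprint(annotations):
        return fingerprint(expected) != fingerprint(actual)
    return expected != actual


def toy_fingerprint(schema_text: str) -> str:
    # Assumption: a stand-in that ignores whitespace and key order only.
    return json.dumps(json.loads(schema_text), sort_keys=True, separators=(",", ":"))


expected = '{"type": "string"}'
actual = '{"type":"string"}'

default_result = schema_changed(expected, actual, {}, toy_fingerprint)
opt_in_result = schema_changed(expected, actual, {ANNOTATION: "true"}, toy_fingerprint)
print(default_result, opt_in_result)  # True False
```

Keeping the string comparison as the fallback means resources without the annotation behave exactly as before.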
Describe the bug
The Jikkou schema registry provider wants to change Avro schemas that don't need to change. It appears to do a string comparison between the expected and actual schemas instead of comparing them by schema fingerprint.
To Reproduce
schemasubject.yml
test-schema.avsc
First, create the subject.
jikkou apply -f schemasubject.yaml
Output:
Next, run another apply, but with --dry-run. Note that Jikkou wants to change the schema, even though the run should result in ok : 0, notaltered : 1.
jikkou apply -f schemasubject.yaml --dry-run
Note that the schema stored in the schema registry has had the namespace property removed from the nested enum. A namespace on a nested enum is valid according to the Avro spec, but it is removed during "canonicalization". The semantic meaning of the schema has not changed, but this is enough to trick Jikkou into thinking it needs to create an UPDATE operation instead of a NONE.
Expected behavior
After the initial subject creation, subsequent invocations should not result in an update operation.
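The nested-enum case can be reproduced with a simplified Python sketch of Avro's Parsing Canonical Form. This is an approximation written for illustration, not the Avro library: `canonicalize`, `to_canonical_json`, and both schemas are hypothetical, and string-escaping and integer-normalization rules are skipped.

```python
import json

PRIMITIVES = {"null", "boolean", "int", "long", "float", "double", "bytes", "string"}


def canonicalize(schema, namespace=None):
    """Resolve fullnames, drop namespace/doc/custom attributes, fix key order."""
    if isinstance(schema, str):
        if schema in PRIMITIVES or "." in schema or not namespace:
            return schema
        return f"{namespace}.{schema}"  # reference to a named type
    if isinstance(schema, list):  # union
        return [canonicalize(s, namespace) for s in schema]
    typ = schema["type"]
    if typ in ("record", "error", "enum", "fixed"):
        name = schema["name"]
        if "." in name:
            ns, fullname = name.rsplit(".", 1)[0], name
        else:
            ns = schema.get("namespace", namespace)
            fullname = f"{ns}.{name}" if ns else name
        out = {"name": fullname, "type": typ}
        if typ in ("record", "error"):
            out["fields"] = [
                {"name": f["name"], "type": canonicalize(f["type"], ns)}
                for f in schema["fields"]
            ]
        elif typ == "enum":
            out["symbols"] = list(schema["symbols"])
        else:
            out["size"] = int(schema["size"])
        return out
    if typ == "array":
        return {"type": "array", "items": canonicalize(schema["items"], namespace)}
    if typ == "map":
        return {"type": "map", "values": canonicalize(schema["values"], namespace)}
    return canonicalize(typ, namespace)  # e.g. {"type": "string", "logicalType": ...}


def to_canonical_json(schema) -> str:
    return json.dumps(canonicalize(schema), separators=(",", ":"), ensure_ascii=False)


# Hypothetical schema: the nested enum repeats the enclosing namespace.
expected = {
    "type": "record", "name": "Order", "namespace": "com.example",
    "fields": [{"name": "status", "type": {
        "type": "enum", "name": "Status", "namespace": "com.example",
        "symbols": ["OPEN", "CLOSED"]}}],
}
# The same schema with the enum's namespace removed.
stored = json.loads(json.dumps(expected))
del stored["fields"][0]["type"]["namespace"]

strings_equal = json.dumps(expected) == json.dumps(stored)
canonical_equal = to_canonical_json(expected) == to_canonical_json(stored)
print(strings_equal, canonical_equal)  # False True
```

Both variants resolve the enum to com.example.Status, so their canonical forms match even though the raw text differs.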
Runtime environment
Additional context
Here's what I believe is happening. On schema registration, the Confluent schema registry converts the schema to "canonical" form so that it can compare the incoming subject version with previous versions and see whether the schema has in fact changed. It has its own logic to compute the "fingerprint" of the canonical form and uses that to check whether the schema already exists in its storage.
Jikkou, on the other hand, seems to do a simple string comparison to determine which operation it needs to perform to go from the actual state to the expected state. See here. If we could teach Jikkou to ignore the schema part of the spec and instead consider a fingerprint (which can be computed using Avro's built-in functions) when determining which operation to build, that would solve the problem.