Closed mberastau closed 4 years ago
The schema registry is no longer part of spring-cloud-stream project and is managed independently
Hi @mberastau, thanks for rising this case!
It appears that the purpose of the readerSchema
property is indeed to override the normal schema lookup process and always use the hardcoded schema value.
spring.cloud.schema.avro.readerSchema Avro compares schema versions by looking at a writer schema (origin payload) and a reader schema (your application payload). See the Avro documentation for more information. If set, this overrides any lookups at the schema server and uses the local schema as the reader schema.
I'm trying to find out the reasons that brought in the readerSchema
. Could you please elaborate your use case? Why would you need hardcoded readerSchema?
Ok, i have much better understanding about the role of setting explicitly the readerSchema.
After further exploring the role of the readerSchema and putting together tests to verify the issue,
I can confirm that this is indeed a miss-behavior and I agree with the @mberastau suggestion:
unwrap the 1st condition code and always try to resolve the writer schema through schema registry first and only then, if not successful, default to the reader one.
The PR is coming soon.
At the moment if a reader schema is defined the converter will always use it as a writer schema, ignoring the actual writer schema. That leads to violation of the Avro schema resolution (https://avro.apache.org/docs/1.7.6/spec.html#Schema+Resolution) and as a result - an exception.
From org.springframework.cloud.stream.schema.avro.AvroSchemaRegistryClientMessageConverter:
Example If the producer changes its schema from v1 to v2 by introducing a new field, then, according to the schema resolution, this field should be ignored (assuming the consumer's schema hasn't changed and stayed v1):
With the current implementation the consumer app won't event reach the schema resolution step - it will fail during deserialization, because it will not be able to correctly read binary data of a message v2 using the schema v1.
Potential fix would be to unwrap the 1st condition code and always try to resolve the writer schema through schema registry first and only then, if not successful, default to the reader one.