Describe the issue
The versionedSchema regex in AvroSchemaRegistryClientMessageConverter expects the subject part of the schema to be alphanumeric. It also does not account for the fact that the Confluent schema registry is case sensitive, whereas MimeType converts its value to lowercase.
We have implemented a custom org.springframework.cloud.stream.schema.avro.SubjectNamingStrategy, driven by enterprise requirements to add a certain prefix to the registered schema name. That prefix contains non-alphanumeric characters and mixed-case formatting.
For example, a schema named "foobar" by default would be registered as SharedKafka_1234.foobar-value. When converted to a MimeType this becomes application/vnd.sharedkafka_1234.foobar-value.v4+avro, which fails the regex check in AvroSchemaRegistryClientMessageConverter,
meaning the schema version is never extracted. Furthermore, even if the schema reference could be extracted, the subsequent lookup in the schemaRegistryClient would fail because of the case sensitivity issue.
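The mismatch can be reproduced in isolation with plain java.util.regex. This is a minimal sketch, assuming a pattern that approximates the converter's alphanumeric-only subject check; the exact expression in AvroSchemaRegistryClientMessageConverter may differ. The class name, STRICT, RELAXED and extractVersion are hypothetical names introduced for illustration.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class VersionedSchemaRegexDemo {

    // Assumption: approximates an alphanumeric-only subject check
    // (letters, digits, '$' and '.'); not copied from the framework.
    public static final Pattern STRICT = Pattern.compile(
            "application/vnd\\.([\\p{Alnum}\\$\\.]+)\\.v(\\p{Digit}+)\\+avro");

    // Hypothetical relaxed variant that also accepts '_' and '-' in the subject.
    public static final Pattern RELAXED = Pattern.compile(
            "application/vnd\\.([\\p{Alnum}\\$\\._-]+)\\.v(\\p{Digit}+)\\+avro");

    // Returns the schema version, or -1 when the content type does not match.
    public static int extractVersion(Pattern pattern, String contentType) {
        Matcher matcher = pattern.matcher(contentType);
        return matcher.matches() ? Integer.parseInt(matcher.group(2)) : -1;
    }

    public static void main(String[] args) {
        String contentType = "application/vnd.sharedkafka_1234.foobar-value.v4+avro";
        // The strict pattern rejects the subject because of '_' and '-'.
        System.out.println("strict:  " + extractVersion(STRICT, contentType));
        // The relaxed pattern accepts it and yields the version.
        System.out.println("relaxed: " + extractVersion(RELAXED, contentType));
    }
}
```

With the strict pattern no version is extracted for the prefixed subject, which matches the behaviour described above; a default subject such as "foobar" still matches.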
This prevents us from evolving our schemas: the local schema is used instead, and it is incompatible with the incoming message.
To Reproduce
Register a schema version "n" using a custom SubjectNamingStrategy whose subject name includes a non-alphanumeric character
Evolve the schema to "n+1" by adding an optional field
Produce a message with schema version "n+1", using the same custom SubjectNamingStrategy
Attempt to consume the message with a consumer using schema version "n"
Observe a stack trace similar to:
Caused by: java.lang.ArrayIndexOutOfBoundsException: 24
at org.apache.avro.io.parsing.Symbol$Alternative.getSymbol(Symbol.java:460) ~[avro-1.9.0.jar:1.9.0]
at org.apache.avro.io.ResolvingDecoder.readIndex(ResolvingDecoder.java:283) ~[avro-1.9.0.jar:1.9.0]
at org.apache.avro.generic.GenericDatumReader.readWithoutConversion(GenericDatumReader.java:178) ~[avro-1.9.0.jar:1.9.0]
at org.apache.avro.specific.SpecificDatumReader.readField(SpecificDatumReader.java:136) ~[avro-1.9.0.jar:1.9.0]
at org.apache.avro.generic.GenericDatumReader.readRecord(GenericDatumReader.java:237) ~[avro-1.9.0.jar:1.9.0]
at org.apache.avro.specific.SpecificDatumReader.readRecord(SpecificDatumReader.java:123) ~[avro-1.9.0.jar:1.9.0]
at org.apache.avro.generic.GenericDatumReader.readWithoutConversion(GenericDatumReader.java:170) ~[avro-1.9.0.jar:1.9.0]
at org.apache.avro.specific.SpecificDatumReader.readField(SpecificDatumReader.java:136) ~[avro-1.9.0.jar:1.9.0]
at org.apache.avro.generic.GenericDatumReader.readRecord(GenericDatumReader.java:237) ~[avro-1.9.0.jar:1.9.0]
at org.apache.avro.specific.SpecificDatumReader.readRecord(SpecificDatumReader.java:123) ~[avro-1.9.0.jar:1.9.0]
at org.apache.avro.generic.GenericDatumReader.readWithoutConversion(GenericDatumReader.java:170) ~[avro-1.9.0.jar:1.9.0]
at org.apache.avro.generic.GenericDatumReader.read(GenericDatumReader.java:151) ~[avro-1.9.0.jar:1.9.0]
at org.apache.avro.generic.GenericDatumReader.read(GenericDatumReader.java:144) ~[avro-1.9.0.jar:1.9.0]
at org.springframework.cloud.stream.schema.avro.AbstractAvroMessageConverter.convertFromInternal(AbstractAvroMessageConverter.java:105) ~[spring-cloud-stream-schema-2.1.3.RELEASE.jar:2.1.3.RELEASE]
Version of the framework
2.1.3.RELEASE
Expected behavior
The regex is overridable, and the schema registry client takes case sensitivity into account (MimeType lowercases the subject, while Confluent's schema registry is case sensitive).
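One possible way to handle the case mismatch is to map the lowercased subject from the MimeType back to the subject as it was registered. The following is only a sketch under that assumption, not the framework's behaviour; SubjectCaseResolver and its methods are hypothetical, and subjects that differ only by case would collide in this approach.

```java
import java.util.HashMap;
import java.util.Locale;
import java.util.Map;
import java.util.Optional;

public class SubjectCaseResolver {

    // Key: lowercased subject (as it appears in the MimeType).
    // Value: subject exactly as registered in the schema registry.
    private final Map<String, String> byLowerCase = new HashMap<>();

    // Remember the registered, case-sensitive subject.
    public void register(String subject) {
        byLowerCase.put(subject.toLowerCase(Locale.ENGLISH), subject);
    }

    // Recover the registered subject from the lowercased MimeType form.
    public Optional<String> resolve(String subjectFromMimeType) {
        String key = subjectFromMimeType.toLowerCase(Locale.ENGLISH);
        return Optional.ofNullable(byLowerCase.get(key));
    }

    public static void main(String[] args) {
        SubjectCaseResolver resolver = new SubjectCaseResolver();
        resolver.register("SharedKafka_1234.foobar-value");
        System.out.println(resolver.resolve("sharedkafka_1234.foobar-value"));
    }
}
```

The resolved subject could then be passed to the case-sensitive registry lookup instead of the lowercased one.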