Closed imaffe closed 1 year ago
Key method: addSchemaIfIdleOrCheckCompatible in the Pulsar schema registry service.
The root cause might be https://github.com/apache/pulsar/issues/17354
When there is a producer, (!producers.isEmpty()) is true and the broker takes the compatibility-check branch instead of adding the schema. I guess BYTES is treated as having no schema.
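To make the branch above easier to follow, here is a minimal sketch of the decision as I read it. It is a simplified model, not the broker code: the class, enum, and parameter names are hypothetical, and the exact set of conditions (existing schema, active consumers, stored data) is an assumption that may differ between Pulsar versions.

```java
public class SchemaCheckSketch {

    // Hypothetical outcome of the decision; not a Pulsar type.
    enum Action { ADD_SCHEMA, CHECK_COMPATIBILITY }

    // Simplified model: the incoming schema is only added directly when the
    // topic looks idle. Any existing schema, connected producer or consumer,
    // or stored data sends the request down the compatibility-check branch.
    static Action decide(boolean hasSchema, int producers, int consumers, long storedBytes) {
        boolean topicInUse = hasSchema || producers > 0 || consumers > 0 || storedBytes > 0;
        return topicInUse ? Action.CHECK_COMPATIBILITY : Action.ADD_SCHEMA;
    }

    public static void main(String[] args) {
        // Idle topic: the schema can simply be added.
        System.out.println(decide(false, 0, 0, 0L));
        // A connected producer (for example one writing with a bytes schema)
        // is enough to force the compatibility check.
        System.out.println(decide(false, 1, 0, 0L));
    }
}
```

In this model a sink producer that is already connected, or data already written to the topic, is enough to keep a later schema from being added automatically, which matches the behaviour seen here.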
Then the current behaviour makes sense, because autoSchemaUpdate requires that the topic has no data. We need to redesign the PulsarTableSerializationSchema, maybe with something similar to the source evolution.
Either we ask Pulsar to support reading bytes with a designated schema, or we upload the schema info to the Pulsar broker from the Flink side.
As a workaround, we can ask users to upload the schema before they start sending any data.
The major challenge is setting the proper schema info when producing to Pulsar topics from the Flink sink connector. In the SQL sink we don't have a POJO class, and serialization is managed by Flink formats. To set up the Pulsar schema correctly we would need to map a Flink SerializationSchema to a Pulsar schema, which is very inconvenient. For the avro and raw formats it's doable; for the json format we don't know how to generate a Pulsar JSON schema from the Flink RowType.
I'll mark this issue as pending (icebox) and note it in the documentation. We should also retry once the related Pulsar fixes are applied.
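The mapping problem can be sketched as a lookup from the Flink format identifier to a Pulsar schema kind. This is only an illustration: the class, enum, and method are hypothetical, and pairing raw with a bytes schema is my assumption rather than something the connector does today.

```java
import java.util.Locale;
import java.util.Optional;

public class FormatSchemaMappingSketch {

    // Hypothetical stand-in for the Pulsar schema types we could target.
    enum PulsarSchemaKind { BYTES, AVRO }

    // Returns the schema kind we would know how to build for a Flink format,
    // or empty when no mapping is known (the json case discussed above).
    static Optional<PulsarSchemaKind> pulsarSchemaFor(String flinkFormat) {
        switch (flinkFormat.toLowerCase(Locale.ROOT)) {
            case "raw":
                return Optional.of(PulsarSchemaKind.BYTES); // assumed pairing
            case "avro":
                return Optional.of(PulsarSchemaKind.AVRO);
            default:
                return Optional.empty(); // e.g. json: no RowType-to-schema mapping yet
        }
    }

    public static void main(String[] args) {
        System.out.println(pulsarSchemaFor("avro").isPresent());
        System.out.println(pulsarSchemaFor("json").isPresent());
    }
}
```

Returning an empty Optional for unmapped formats would let the sink fall back to its current bytes behaviour instead of failing at startup.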
Same as #128, decrease the story points
in writeToExplicitTableAndReadWithJsonSchemaUsingPulsarConsumer test case, we have a
This is because we use a bytes schema when writing data to the Pulsar topic, and the Pulsar consumer then checks schema compatibility. We need to find out whether this can be changed with a different configuration entry (for example, by enabling the schema evolution feature).