Open morsapaes opened 2 months ago
Currently, when using FORMAT avro, we will publish the generated key and value schemas to the schema registry (if a key is provided).
What should happen in the case of KEY FORMAT json VALUE FORMAT avro? Should we publish the key JSON schema to the registry too? Or should we just publish the value avro schema and leave the key schema unset in the registry?
Similarly, what about KEY FORMAT avro VALUE FORMAT json? And KEY FORMAT text VALUE FORMAT avro? It looks like our schema registry crate allows specifying either avro, proto, or json schemas, but not text/bytes.
> What should happen in the case of KEY FORMAT json VALUE FORMAT avro? Should we publish the key JSON schema to the registry too? Or should we just publish the value avro schema and leave the key schema unset in the registry?
The {KEY|VALUE} FORMAT option is designed to tell you whether or not to use the schema registry. At least it works like this for sources. You may need to bang on it to get to parity for sinks.
With Avro, you have to provide a USING option to specify schema behavior. But, for sources, you're not limited to just the CSR! You can also provide the schema inline:
KEY VALUE FORMAT AVRO USING CONFLUENT SCHEMA REGISTRY
KEY VALUE FORMAT AVRO USING SCHEMA '<inline schema>'
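For sources, mixing formats might look like the sketch below: the key is read as plain text with no registry involved, and only the value goes through the CSR. The object names (events_src, kafka_conn, csr_conn) and the topic are placeholders, the two connections are assumed to exist already, and the exact clause spelling may differ between Materialize versions.

```sql
-- Sketch only: kafka_conn and csr_conn are hypothetical, pre-created connections.
-- The key is decoded as text, so no schema is looked up for it.
-- The value is decoded as Avro, with its schema fetched from the registry.
CREATE SOURCE events_src
  FROM KAFKA CONNECTION kafka_conn (TOPIC 'events')
  KEY FORMAT TEXT
  VALUE FORMAT AVRO USING CONFLUENT SCHEMA REGISTRY CONNECTION csr_conn
  ENVELOPE UPSERT;
```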
So, turning back to JSON, the vision here is this:
# The only thing we support today.
KEY VALUE FORMAT JSON
# Something we'll add support for eventually.
KEY VALUE FORMAT JSON USING CONFLUENT SCHEMA REGISTRY ...
Adding support for JSON + CSR is tracked in https://github.com/MaterializeInc/materialize/issues/7186. Recommend you don't go down that road now! Mapping Materialize relations to JSON schema is nontrivial.
Feature request
As is, we don't support specifying different formats for the key and value of sinked Kafka records. This is inconsistent with the semantics of Kafka sources (#20135), and prevents users from opting out of using complex types for the key (which has known issues in itself). We should introduce the KEY FORMAT/VALUE FORMAT options also for sinks, to allow emitting text and bytea keys in sinked Kafka records.
Original ask (Slack)
Note: might be a good one to bundle up with #23925.
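To make the request concrete, here is a rough sketch of what the sink-side syntax could look like if it mirrored sources. This is the proposed shape, not something the request describes as supported today. The names (events_sink, events_view, kafka_conn, csr_conn), the id key column, and the topic are placeholders.

```sql
-- Hypothetical syntax for the requested feature; not confirmed to be supported.
-- The key would be emitted as text, so there would be no key schema to publish.
-- The value would be emitted as Avro, with its schema published to the registry.
CREATE SINK events_sink
  FROM events_view
  INTO KAFKA CONNECTION kafka_conn (TOPIC 'events-out')
  KEY (id)
  KEY FORMAT TEXT
  VALUE FORMAT AVRO USING CONFLUENT SCHEMA REGISTRY CONNECTION csr_conn
  ENVELOPE UPSERT;
```

If the format option decides registry use for sinks the way it does for sources, this statement would publish only the value schema and leave the key subject unset.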