[Open] klalafaryan opened this issue 4 years ago
Is it possible to use json-schema (https://json-schema.org/) with kafka-connect-jdbc instead of the Kafka Connect schema?
The JDBC connector always uses Connect Schema objects, so if you want to store data in Kafka as json-schema, you will need to write your own Converter implementation that serializes and deserializes that format.
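If you do go that route, a skeleton of such a Converter might look like the sketch below. This is only an illustration of the interface you'd implement; the class name and all method bodies are hypothetical, and the json-schema translation logic is exactly the part left to you:

```java
import java.util.Map;

import org.apache.kafka.connect.data.Schema;
import org.apache.kafka.connect.data.SchemaAndValue;
import org.apache.kafka.connect.storage.Converter;

// Hypothetical skeleton of a Converter for json-schema-described JSON.
public class CustomJsonSchemaConverter implements Converter {

    @Override
    public void configure(Map<String, ?> configs, boolean isKey) {
        // Read any converter-specific settings here.
    }

    @Override
    public byte[] fromConnectData(String topic, Schema schema, Object value) {
        // 1. Translate the Connect Schema into a json-schema document.
        // 2. Serialize `value` as JSON that conforms to it.
        throw new UnsupportedOperationException("left to the implementer");
    }

    @Override
    public SchemaAndValue toConnectData(String topic, byte[] value) {
        // 1. Parse the bytes and resolve the json-schema that describes them.
        // 2. Translate that json-schema into a Connect Schema and build the value.
        throw new UnsupportedOperationException("left to the implementer");
    }
}
```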
Is it possible to create the schema dynamically from a POJO?
Yes, this is possible; the technique is generally called schema inference. Doing it in full generality is somewhat involved, and it's up to you to decide whether investing in dynamic schema generation or updating the schema yourself is more cost-effective.
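To give a rough idea of what inference can look like, here is a minimal reflection-based sketch; the class, the type mapping, and the flat-fields assumption are all mine, and real POJOs with nested objects, collections, or getter-based properties need considerably more work:

```java
import java.lang.reflect.Field;

import org.apache.kafka.connect.data.Schema;
import org.apache.kafka.connect.data.SchemaBuilder;

public class PojoSchemaInference {

    // Derive a flat Connect Schema from a POJO's declared fields.
    public static Schema inferSchema(Class<?> clazz) {
        SchemaBuilder builder = SchemaBuilder.struct().name(clazz.getName());
        for (Field field : clazz.getDeclaredFields()) {
            builder.field(field.getName(), connectSchemaFor(field.getType()));
        }
        return builder.build();
    }

    // Map a handful of common Java types onto Connect's primitive schemas.
    private static Schema connectSchemaFor(Class<?> type) {
        if (type == String.class) return Schema.OPTIONAL_STRING_SCHEMA;
        if (type == int.class || type == Integer.class) return Schema.OPTIONAL_INT32_SCHEMA;
        if (type == long.class || type == Long.class) return Schema.OPTIONAL_INT64_SCHEMA;
        if (type == double.class || type == Double.class) return Schema.OPTIONAL_FLOAT64_SCHEMA;
        if (type == boolean.class || type == Boolean.class) return Schema.OPTIONAL_BOOLEAN_SCHEMA;
        throw new IllegalArgumentException("No Connect mapping for " + type);
    }
}
```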
I found a related guide that statically builds the schema for the streams application: https://kafka-tutorials.confluent.io/changing-serialization-format/kstreams.html
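For comparison, building the schema statically with Connect's SchemaBuilder looks roughly like this (the record shape and field names here are invented):

```java
import org.apache.kafka.connect.data.Schema;
import org.apache.kafka.connect.data.SchemaBuilder;
import org.apache.kafka.connect.data.Struct;

public class StaticSchemaExample {

    // A schema declared once, by hand, for a hypothetical record shape.
    static final Schema MOVIE_SCHEMA = SchemaBuilder.struct()
            .name("Movie")
            .field("id", Schema.INT64_SCHEMA)
            .field("title", Schema.STRING_SCHEMA)
            .field("release_year", Schema.OPTIONAL_INT32_SCHEMA)
            .build();

    // Values are then built as Structs that must conform to that schema.
    static Struct exampleMovie() {
        return new Struct(MOVIE_SCHEMA)
                .put("id", 1L)
                .put("title", "Example Title")
                .put("release_year", 1994);
    }
}
```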
Is it possible to validate the POJO with the schema before producing?
I'd assume that any reasonable schema library gives you the tools to validate objects, but that is up to the implementation. Your Converter/Serde implementation would need to perform that validation itself, since Kafka/Connect cannot recognize a data/schema mismatch on its own.
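As an example of what that validation step could look like before producing, here is a sketch using one third-party validator (the networknt json-schema-validator; the library choice and the schemaJson/pojo parameters are assumptions on my part):

```java
import java.util.Set;

import com.fasterxml.jackson.databind.ObjectMapper;
import com.networknt.schema.JsonSchema;
import com.networknt.schema.JsonSchemaFactory;
import com.networknt.schema.SpecVersion;
import com.networknt.schema.ValidationMessage;

public class SchemaValidation {

    // Validate a POJO against a json-schema document before producing it.
    static void validateBeforeProduce(String schemaJson, Object pojo) throws Exception {
        ObjectMapper mapper = new ObjectMapper();

        // Parse the json-schema document (schemaJson holds the schema as a String).
        JsonSchema schema = JsonSchemaFactory
                .getInstance(SpecVersion.VersionFlag.V7)
                .getSchema(mapper.readTree(schemaJson));

        // Convert the POJO to a JsonNode and collect any violations.
        Set<ValidationMessage> errors = schema.validate(mapper.valueToTree(pojo));
        if (!errors.isEmpty()) {
            throw new IllegalArgumentException("Record does not match schema: " + errors);
        }
    }
}
```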
Overall, I think that unless you have existing infrastructure built around json-schema and are willing to implement and maintain custom Converters/Serdes, you're better off choosing an off-the-shelf serialization format that's already supported, such as JSON with schemas (JsonConverter) or Avro (AvroConverter).
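For reference, the relevant converter settings for those two options look like this (the registry URL is a placeholder):

```properties
# Option 1: JSON with embedded schemas; no Schema Registry required,
# but every message carries its full schema.
value.converter=org.apache.kafka.connect.json.JsonConverter
value.converter.schemas.enable=true

# Option 2: Avro with Confluent Schema Registry.
value.converter=io.confluent.connect.avro.AvroConverter
value.converter.schema.registry.url=http://localhost:8081
```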
JSON Schema converters are now included in Schema Registry + Confluent Platform.
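With Confluent Platform 5.5 and later, that means a connector can be pointed at the registry-backed JSON Schema converter directly, along the lines of:

```properties
value.converter=io.confluent.connect.json.JsonSchemaConverter
value.converter.schema.registry.url=http://localhost:8081
```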
Hello,
Context: We are trying to build the following architecture with Kafka Connect and Kafka Streams:
MYSQL -> KAFKA-CONNECT-JDBC (SOURCE connector) -> KAFKA -> KAFKA-STREAMS (doing some normalizations) -> KAFKA -> KAFKA-CONNECT-JDBC (SINK connector) -> POSTGRES
So I have the following questions:
Is it possible to use json-schema (https://json-schema.org/) with kafka-connect-jdbc instead of the Kafka Connect schema?
To be able to sink the data into POSTGRES, the Kafka Connect JDBC sink requires a schema, so we have to produce the schema along with the payload from Kafka Streams.
So we create a Java POJO (or JsonNode) and a schema separately in the Kafka Streams application.
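For context, the envelope the JsonConverter expects (with schemas.enable=true) pairs the schema and the payload in every message, roughly like this (field names are placeholders):

```json
{
  "schema": {
    "type": "struct",
    "name": "record",
    "optional": false,
    "fields": [
      { "field": "id",   "type": "int64",  "optional": false },
      { "field": "name", "type": "string", "optional": true }
    ]
  },
  "payload": { "id": 1, "name": "example" }
}
```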
Thanks a lot for your input.