Open ozgune opened 8 years ago
We recently received questions about Kafka and the Bottled Water extension from three users, and I wanted to capture some of that context in this issue.
The primary motivation for Kafka integration could be one of the following:
We hear requests for Kafka integration across these use cases, and we also have customers who set up their own Kafka <> Citus pipelines. The second and third use cases relate to change data capture (CDC).
For the third item, we could integrate with Bottled Water or another extension such as Debezium.
Bottled Water is unmaintained. Kafka Connect JDBC Connector cannot be used with Citus because of prepared statements. Does https://debezium.io/ work?
What are the other solutions for sinking Kafka to Citus?
@fi0 not sure if this is the use case you have in mind, but to copy JSON messages from a Kafka topic into a Postgres table you can use the kafka-sink-pg-json tool. There's an example in our docs: http://docs.citusdata.com/en/stable/develop/integrations.html#ingesting-data-from-kafka
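If that tool does not fit, the same idea can be hand-rolled in a small consumer. Below is a minimal sketch, not a tested pipeline: it only shows the two pure steps (turning each JSON message into one line of PostgreSQL `COPY` text format, and grouping lines into batches). The function names, the `events` table, and its `data` column are hypothetical, and the Kafka client and database driver are left out on purpose.

```python
import json
from typing import Iterable, Iterator, List


def to_copy_line(raw: bytes) -> str:
    """Turn one Kafka message value into a single COPY text-format field.

    json.loads rejects malformed messages early. json.dumps (default
    ensure_ascii=True) emits no raw tabs or newlines, so the only
    character COPY text format still needs escaped is the backslash.
    """
    doc = json.loads(raw)
    text = json.dumps(doc, separators=(",", ":"))
    return text.replace("\\", "\\\\")


def batches(lines: Iterable[str], size: int) -> Iterator[List[str]]:
    """Group lines into lists of at most `size` items, one COPY per list."""
    batch: List[str] = []
    for line in lines:
        batch.append(line)
        if len(batch) >= size:
            yield batch
            batch = []
    if batch:  # flush the final partial batch
        yield batch


# Hypothetical wiring (assumption, not part of the sketch above):
# for each batch, join the lines with "\n" and stream the result into
#   COPY events (data) FROM STDIN
# using your driver's COPY support, then commit the Kafka offsets
# only after the COPY has succeeded.
```

Batching matters because one `COPY` per batch is much cheaper than one `INSERT` per message, and committing offsets only after the copy succeeds gives at-least-once delivery.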
Thank you @jonels-msft. https://github.com/justonedb/kafka-sink-pg-json doesn't seem to be well maintained, though.
The CDC feature for distributed tables (Preview) is available in the Citus 11.3 release. Please find the release notes on CDC here: https://www.citusdata.com/updates/v11-3/#cdc_support
Most Citus customers use a Kafka queue before they ingest data into the database. We need to investigate their use and have a better integration story between Kafka and Citus.
Kafka uses the Java runtime. This task may therefore relate to #4.