tabular-io / iceberg-kafka-connect

Apache License 2.0

Is it okay to run multiple tasks for a CDC sink while still ensuring exactly-once? #228

Closed: okayhooni closed this issue 3 months ago

okayhooni commented 3 months ago

Hello, everyone..!

The Debezium MySQL source connector has to run with a single task at any one time.

Given that, is it okay to deploy multiple tasks for this Iceberg sink connector to consume the CDC topic (produced by the Debezium source connector)?

My guess is that multiple sink tasks are fine, because only one of them acts as the coordinator task that communicates with the control topic.

Is it okay to deploy multiple tasks to sink a CDC topic with exactly-once semantics?

fqtab commented 3 months ago

Yes, it is fine to use multiple tasks.

Keep in mind the general Kafka sink connector rule: do not configure more tasks than there are partitions on the CDC topic. Each partition can be handled by at most one task, so any extra tasks will sit idle and just waste resources.
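
To make the partition rule concrete, here is a minimal Python sketch of how partitions spread over sink tasks. It assumes a plain round-robin assignment purely for illustration; the real assignment is done by Kafka's consumer group protocol, and `assign_partitions` and `idle_tasks` are hypothetical helper names, not part of Kafka Connect or this connector.

```python
from __future__ import annotations


def assign_partitions(num_partitions: int, tasks_max: int) -> dict[int, list[int]]:
    """Spread partitions over tasks round-robin (illustrative assumption only).

    Every partition lands on exactly one task, so a partition is never
    consumed by two tasks at once.
    """
    if num_partitions < 0 or tasks_max < 1:
        raise ValueError("need num_partitions >= 0 and tasks_max >= 1")
    assignment: dict[int, list[int]] = {task: [] for task in range(tasks_max)}
    for partition in range(num_partitions):
        assignment[partition % tasks_max].append(partition)
    return assignment


def idle_tasks(assignment: dict[int, list[int]]) -> list[int]:
    """Return the tasks that received no partition at all."""
    return [task for task, partitions in assignment.items() if not partitions]


if __name__ == "__main__":
    # A CDC topic with 3 partitions and 5 configured tasks: two tasks do nothing.
    over_provisioned = assign_partitions(num_partitions=3, tasks_max=5)
    print(over_provisioned, "idle:", idle_tasks(over_provisioned))

    # 6 partitions and 3 tasks: every task has work.
    balanced = assign_partitions(num_partitions=6, tasks_max=3)
    print(balanced, "idle:", idle_tasks(balanced))
```

In other words, setting the task count above the partition count of the CDC topic buys no extra parallelism; to scale further, the topic needs more partitions first.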