Open yeikel opened 3 weeks ago
You're right, there's no rate limiting built into the Couchbase connector.
You're also not the only Kafka user with this problem: https://cwiki.apache.org/confluence/display/KAFKA/KIP-731%3A+Record+Rate+Limiting+for+Kafka+Connect
That KIP was last updated in 2021, so I'm not sure if/when it will be implemented.
If you'd like to see something built into the Couchbase connector before then, you can help with the prioritization effort by taking advantage of your Couchbase Enterprise Subscription License and opening a support ticket.
We'd also welcome a PR that adds rate-limiting.
In the meantime, since you're only interested in the latest version of a document, maybe you can get the results you want by setting the topic's cleanup policy to "compact" and tuning Kafka's log compaction settings.
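For illustration only, here is a small sketch that builds a `kafka-configs.sh` command enabling compaction on a topic and making the log cleaner run more eagerly. The topic name, bootstrap address, and every numeric value are assumptions chosen for the example, not recommendations; the setting names are standard Kafka topic-level configs.

```python
def compaction_command(topic, bootstrap="localhost:9092"):
    """Build a kafka-configs.sh invocation that enables log compaction on `topic`.

    All values below are illustrative assumptions; tune them for your cluster.
    """
    settings = {
        "cleanup.policy": "compact",         # keep only the latest record per key
        "min.cleanable.dirty.ratio": "0.1",  # compact sooner than the 0.5 default
        "max.compaction.lag.ms": "60000",    # upper bound on how long a record stays uncompacted
        "segment.ms": "60000",               # roll segments often; the active segment is never compacted
    }
    add_config = ",".join(f"{key}={value}" for key, value in settings.items())
    return [
        "kafka-configs.sh",
        "--bootstrap-server", bootstrap,
        "--alter",
        "--entity-type", "topics",
        "--entity-name", topic,
        "--add-config", add_config,
    ]


if __name__ == "__main__":
    # "couchbase-docs" is a hypothetical topic name.
    print(" ".join(compaction_command("couchbase-docs")))
```

Compaction only reduces what is retained on the broker. It does not slow the connector down, and consumers can still see several versions of a key before the cleaner runs.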
Thanks for the suggestion. We are already using that, and it helps to an extent.
As far as I know, the only settings that may be able to control the volume are:
couchbase.batch.size.max
couchbase.persistence.polling.interval
couchbase.flow.control.buffer
tasks (somewhat)
But as far as I can tell, there is no easy way to say "only publish up to N updates per time interval" to the topic.
Would it be possible to implement this?
For context, I am only interested in the latest version of the document, and sometimes the connector publishes faster than my Kafka topic can hold. I'd like to control the backlog from both the producer and the consumer ends.
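To make the request concrete, the "up to N updates per time interval" behavior could be sketched as a token bucket. This is a minimal, hypothetical illustration and not part of the Couchbase connector or Kafka Connect; the `TokenBucket` class and its method names are invented for this example.

```python
import time


class TokenBucket:
    """Allow on average `rate` records per second, with bursts of up to `capacity`."""

    def __init__(self, rate, capacity, clock=time.monotonic):
        self.rate = float(rate)          # tokens added per second
        self.capacity = float(capacity)  # maximum burst size
        self.clock = clock               # injectable so the limiter can be tested
        self.tokens = float(capacity)    # start with a full bucket
        self.last = clock()

    def _refill(self):
        # Add tokens for the time elapsed since the last call, capped at capacity.
        now = self.clock()
        self.tokens = min(self.capacity, self.tokens + (now - self.last) * self.rate)
        self.last = now

    def try_acquire(self, n=1):
        """Take `n` tokens if available; return False without blocking otherwise."""
        self._refill()
        if self.tokens >= n:
            self.tokens -= n
            return True
        return False

    def wait_time(self, n=1):
        """Seconds until `n` tokens will be available (0.0 if they already are)."""
        self._refill()
        return max(0.0, (n - self.tokens) / self.rate)


if __name__ == "__main__":
    bucket = TokenBucket(rate=100, capacity=100)  # roughly 100 records per second
    for record in range(250):
        while not bucket.try_acquire():
            time.sleep(bucket.wait_time())
        # A real producer would publish `record` here.
```

A limiter like this on the producer side caps throughput but does not drop superseded versions of a document, so it would complement compaction rather than replace it.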