couchbase / kafka-connect-couchbase

Kafka Connect connector for Couchbase Server
https://issues.couchbase.com/projects/KAFKAC
Apache License 2.0

Feature Request: Add options to control the volume of requests fetched by the source connector #47

Open yeikel opened 3 weeks ago

yeikel commented 3 weeks ago

As far as I know, the only settings that may be able to control the volume are

But as far as I can tell, there is no easy way to tell the connector "publish at most N updates to the topic per time interval".

Would it be possible to implement this?

For context, I am only interested in the latest version of each document, and sometimes the connector publishes faster than my Kafka topic can absorb. I'd like to control the backlog from both the producer and the consumer ends.

dnault commented 3 weeks ago

You're right, there's no rate limiting built into the Couchbase connector.

You're also not the only Kafka user with this problem: https://cwiki.apache.org/confluence/display/KAFKA/KIP-731%3A+Record+Rate+Limiting+for+Kafka+Connect

That KIP was last updated in 2021, so I'm not sure if/when it will be implemented.

If you'd like to see something built into the Couchbase connector before then, you can help with the prioritization effort by taking advantage of your Couchbase Enterprise Subscription License and opening a support ticket.

We'd also welcome a PR that adds rate-limiting.
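
For anyone considering such a PR, here is a minimal sketch of one possible building block: a fixed-window counter that caps how many records may be emitted per time window. Everything in it is hypothetical (the class name `RecordRateLimiter`, the method `acquireUpTo`, and the injected clock); none of it exists in the Couchbase connector today. It is not a token bucket and it does not smooth bursts within a window.

```java
import java.util.function.LongSupplier;

/**
 * Hypothetical fixed-window limiter: allows at most maxRecordsPerWindow
 * records in each window of windowNanos nanoseconds.
 * Not part of the Couchbase connector; an illustration only.
 */
public class RecordRateLimiter {
    private final long maxRecordsPerWindow;
    private final long windowNanos;
    private final LongSupplier nanoClock; // injected so the limiter can be tested
    private long windowStart;
    private long usedInWindow;

    public RecordRateLimiter(long maxRecordsPerWindow, long windowNanos, LongSupplier nanoClock) {
        if (maxRecordsPerWindow <= 0 || windowNanos <= 0) {
            throw new IllegalArgumentException("limit and window must be positive");
        }
        this.maxRecordsPerWindow = maxRecordsPerWindow;
        this.windowNanos = windowNanos;
        this.nanoClock = nanoClock;
        this.windowStart = nanoClock.getAsLong();
    }

    /** Returns how many of the requested records may be emitted right now (possibly 0). */
    public synchronized long acquireUpTo(long requested) {
        long now = nanoClock.getAsLong();
        if (now - windowStart >= windowNanos) {
            // The current window has elapsed: start a new one with a fresh budget.
            windowStart = now;
            usedInWindow = 0;
        }
        long remaining = maxRecordsPerWindow - usedInWindow;
        long granted = Math.max(0, Math.min(requested, remaining));
        usedInWindow += granted;
        return granted;
    }

    public static void main(String[] args) {
        // Example: at most 1000 records per second, using the system clock.
        RecordRateLimiter limiter = new RecordRateLimiter(1000, 1_000_000_000L, System::nanoTime);
        System.out.println("granted: " + limiter.acquireUpTo(250));
    }
}
```

In a Kafka Connect source task, a limiter like this could presumably be consulted inside `SourceTask.poll()` to decide how many buffered records to return per call, leaving the rest for a later poll. Whether that fits the connector's internal buffering is an open question for the PR.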

dnault commented 3 weeks ago

In the meantime, since you're only interested in the latest version of a document, maybe you can get the results you want by tuning Kafka's log compaction settings to use the "compact" policy.
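
As a rough illustration of that suggestion, the sketch below collects topic-level settings that influence log compaction. The property names are real Kafka topic configs, but the values are illustrative assumptions rather than recommendations, and the class name `CompactedTopicSettings` is hypothetical.

```java
import java.util.LinkedHashMap;
import java.util.Map;

/** Hypothetical helper that gathers topic-level settings related to log compaction. */
public class CompactedTopicSettings {

    /** Returns an example topic configuration; all values are assumptions to be tuned. */
    public static Map<String, String> topicConfig() {
        Map<String, String> config = new LinkedHashMap<>();
        // Keep only the latest record per key instead of deleting by age or size.
        config.put("cleanup.policy", "compact");
        // Lower than the 0.5 default so the cleaner runs more eagerly (assumed value).
        config.put("min.cleanable.dirty.ratio", "0.1");
        // Roll segments every 10 minutes; the active segment is never compacted (assumed value).
        config.put("segment.ms", "600000");
        // Upper bound on how long a record may stay ineligible for compaction (assumed value).
        config.put("max.compaction.lag.ms", "3600000");
        return config;
    }

    public static void main(String[] args) {
        // Print the settings as key=value pairs, one per line.
        topicConfig().forEach((key, value) -> System.out.println(key + "=" + value));
    }
}
```

A map like this could be handed to the Kafka admin client when creating the topic (for example through `NewTopic.configs`). Note that compaction only bounds how many stale versions the topic retains; it does not slow the rate at which the connector publishes.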

yeikel commented 3 weeks ago

> In the meantime, since you're only interested in the latest version of a document, maybe you can get the results you want by tuning Kafka's log compaction settings to use the "compact" policy.

Thanks for the suggestion. We are already using that, and it helps to an extent.