tulios / kafkajs

A modern Apache Kafka client for node.js
https://kafka.js.org
MIT License

Topic metadata prefetching #1615

Open mkocsar opened 1 year ago

mkocsar commented 1 year ago

Is your feature request related to a problem? Please describe.

When a message is sent to a topic that the KafkaJS Producer has not seen before, the topic metadata cache is refreshed, which adds latency to that send. We use KafkaJS in a platform where latency is an important concern and producers send messages to thousands of topics, so metadata refreshes are frequent and the affected sends are slower.
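To make the cost described above concrete, here is a deliberately simplified model of a producer-side metadata cache. It is not KafkaJS code: ToyMetadataCache, ensureTopic and countRefreshes are made-up names, and refresh() only stands in for a metadata request to the brokers. It compares how many refreshes happen when topics are discovered one send at a time with how many happen when all topics are loaded up front.

```javascript
// Simplified model of a producer-side topic metadata cache.
// Illustrative only: this is NOT how KafkaJS is implemented internally.
class ToyMetadataCache {
  constructor() {
    this.knownTopics = new Set()
    this.refreshCount = 0
  }

  // Stands in for a metadata request to the brokers (one network round trip).
  refresh(topics) {
    for (const topic of topics) this.knownTopics.add(topic)
    this.refreshCount += 1
  }

  // Called before every send: refresh only when the topic is unknown.
  ensureTopic(topic) {
    if (!this.knownTopics.has(topic)) this.refresh([topic])
  }
}

// Sends one message per topic and reports how many refreshes were needed.
function countRefreshes(topicsToSend, prefetch) {
  const cache = new ToyMetadataCache()
  if (prefetch) cache.refresh(topicsToSend) // one refresh covering every topic
  for (const topic of topicsToSend) cache.ensureTopic(topic)
  return cache.refreshCount
}

const topics = Array.from({ length: 1000 }, (_, i) => `topic-${i}`)
const onDemand = countRefreshes(topics, false)
const prefetched = countRefreshes(topics, true)
console.log({ onDemand, prefetched })
```

In this model every first send to a new topic pays for a refresh, while prefetching moves that cost to a single request made before any message is sent. Real refresh behavior depends on the client and the cluster, so the model only shows the shape of the problem, not actual latencies.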

We created a POC in which the producer prefetched the metadata for all topics in the cluster before sending any messages. The measured performance improvement was significant.



The script used for the measurement can be found at mkocsar/test-prefetch-impact.js (the topic setup and teardown scripts are at mkocsar/test-send-duration.js). Please note that the change "Improve refreshMetadataIfNecessary in BrokerPool" is required to take advantage of metadata prefetching.

Describe the solution you'd like

The POC triggers metadata prefetching through a KafkaJS-internal API (cluster.addMultipleTargetTopics) that is normally not accessible to clients. We propose extending the KafkaJS Producer API so that clients can prefetch, and refresh on demand, the metadata for a given set of topics or for all topics in the cluster. This would improve performance for latency-critical users.
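As a rough sketch of what such a prefetch step could look like, the helper below takes its collaborators as arguments. prefetchTopicMetadata is a hypothetical name, not an existing KafkaJS function. It assumes admin offers listTopics(), as the public KafkaJS admin client does, and that cluster offers the internal addMultipleTargetTopics(topics) method mentioned above. How a client would obtain the producer's cluster object is deliberately left out, since exposing that capability is exactly what this request is about.

```javascript
// Hypothetical helper: prefetch metadata for the given topics, or for every
// topic in the cluster when `topics` is omitted.
//
// Assumptions:
// - `admin.listTopics()` resolves to an array of topic names.
// - `cluster.addMultipleTargetTopics(topics)` is the KafkaJS-internal method
//   used in the POC; it is not part of the public client API.
async function prefetchTopicMetadata({ admin, cluster, topics }) {
  const targetTopics = topics ?? (await admin.listTopics())

  // Nothing to prefetch, so skip the metadata request entirely.
  if (targetTopics.length === 0) return []

  await cluster.addMultipleTargetTopics(targetTopics)
  return targetTopics
}
```

With a real admin client, connect() would have to be called before listTopics(). Passing an explicit topics array covers the "given set of topics" case, and omitting it covers the "all topics of the cluster" case.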