mozilla-services / heka

DEPRECATED: Data collection and processing made easy.
http://hekad.readthedocs.org/

How to specify the batch size in KafkaOutput? #1889

Open frankyaorenjie opened 8 years ago

frankyaorenjie commented 8 years ago

As the title says, I cannot figure out how to set the batch size in KafkaOutput, and this is an important parameter for tuning Kafka producer performance.

wangfeiping commented 8 years ago

https://github.com/mozilla-services/heka/blob/dev/docs/source/config/outputs/kafka.rst

Can this help?

max_buffered_bytes (uint32)

The threshold number of bytes buffered before triggering a flush to the broker. Default is 1.

frankyaorenjie commented 8 years ago

The buffered-bytes parameter takes effect based on message size. Batch size is based on the number of messages, and the two are separate parameters in the Kafka manual as well.

cweaver1321 commented 8 years ago

@baniuyao Have you found a way to set the batch size? I am having a similar issue with the performance of the Kafka producer. It appears to be sending 1 message at a time no matter how high the throughput is. It has become a serious bottleneck. I am hoping you have found a solution.

michaelgibson commented 8 years ago

Looks like this is using the https://github.com/Shopify/sarama library for the Kafka client. I do see an option to specify the number of messages to trigger a Flush here: https://github.com/Shopify/sarama/blob/88a4afb3d18f13212477a63a78e3a57ac87830a9/config.go#L117

It looks like k.saramaConfig.Producer.Flush.Messages would need to be set in the output here:

https://github.com/mozilla-services/heka/blob/be1420d226808485a557ad622bb75785cadfa43a/plugins/kafka/kafka_output.go#L291

and a matching config option would need to be added to expose it.