dpkp / kafka-python

Python client for Apache Kafka
http://kafka-python.readthedocs.io/
Apache License 2.0

In the KafkaProducer class, what does batch_size mean? #2277

Open bruno-brant opened 2 years ago

bruno-brant commented 2 years ago

The documentation doesn't make it clear whether the parameter controls the number of messages or the total number of bytes:

        batch_size (int): Requests sent to brokers will contain multiple
            batches, one for each partition with data available to be sent.
            A small batch size will make batching less common and may reduce
            throughput (a batch size of zero will disable batching entirely).
            Default: 16384

Courouge commented 2 years ago

Hi Bruno, batch_size controls the total number of bytes in a batch, not the number of messages. You can estimate the number of messages per batch from the average size of your messages. See the Confluent or Apache Kafka documentation.
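To make the estimate concrete, here is a minimal sketch. The helper `estimate_messages_per_batch` is hypothetical (it is not part of kafka-python), and the 200-byte average message size is an assumed figure. It also ignores per-record and batch header overhead, so treat the result as a rough upper bound.

```python
def estimate_messages_per_batch(batch_size_bytes, avg_message_bytes):
    """Rough upper bound on messages per partition batch.

    Hypothetical helper, not part of kafka-python. Ignores the
    per-record and batch header overhead that Kafka adds.
    """
    if avg_message_bytes <= 0:
        raise ValueError("avg_message_bytes must be positive")
    return batch_size_bytes // avg_message_bytes


# 16384 is the documented default for batch_size; 200 bytes is an assumed average.
print(estimate_messages_per_batch(16384, 200))  # 81
```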

There are two triggers in Apache Kafka for sending a produce request: the linger_ms and batch_size parameters (both available in this Python client). When either one is reached, the producer sends the accumulated messages.
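As a rough illustration of the two triggers, the sketch below models the decision as a plain function. `should_send` is a hypothetical, simplified model and not the client's actual sender logic. `batch_size` and `linger_ms` are real `KafkaProducer` keyword arguments, while the broker address and the 5 ms linger value are assumptions.

```python
def should_send(batch_bytes, batch_age_ms, batch_size, linger_ms):
    """Simplified model: a batch is ready once it is full OR has waited long enough.

    Hypothetical illustration only; the real producer has more conditions.
    """
    is_full = batch_bytes >= batch_size
    has_lingered = batch_age_ms >= linger_ms
    return is_full or has_lingered


# Settings that could be passed to KafkaProducer(**producer_config).
producer_config = {
    "bootstrap_servers": "localhost:9092",  # assumed broker address
    "batch_size": 16384,  # bytes per partition batch (documented default)
    "linger_ms": 5,       # assumed value: wait up to 5 ms for more records
}

size = producer_config["batch_size"]
linger = producer_config["linger_ms"]

print(should_send(16384, 1, size, linger))  # full batch, sent before linger expires
print(should_send(500, 5, size, linger))    # small batch, sent because linger expired
print(should_send(500, 1, size, linger))    # neither trigger reached yet
```

With a running broker, the same settings would be used as `KafkaProducer(**producer_config)` after `from kafka import KafkaProducer`.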