akka / alpakka-kafka

Alpakka Kafka connector - Alpakka is a Reactive Enterprise Integration library for Java and Scala, based on Reactive Streams and Akka.
https://doc.akka.io/libraries/alpakka-kafka/current/

Producer: max out producing until Kafka backpressures #909

Closed: ennru closed this issue 5 years ago

ennru commented 5 years ago

Short description

To max out producing to Kafka until the Kafka producer API itself blocks, we can set the producer parallelism very high and let the designated akka.kafka.default-dispatcher absorb the blocking send calls.

Details

From Kafka's API docs:

The buffer.memory controls the total amount of memory available to the producer for buffering. If records are sent faster than they can be transmitted to the server then this buffer space will be exhausted. When the buffer space is exhausted additional send calls will block. The threshold for time to block is determined by max.block.ms after which it throws a TimeoutException.

With the producer parallelism raised from its current default of 100 to a much larger value, Alpakka Kafka would normally no longer backpressure on uncompleted sends. Instead, producer.send would start blocking once the producer's buffer is exhausted. So that this blocking is contained, the stage runs on the designated dispatcher akka.kafka.default-dispatcher.
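
For illustration, the settings involved could look roughly like the fragment below. This is a minimal sketch, assuming the akka.kafka.producer section of Alpakka Kafka's reference.conf. The parallelism value is only an example of "quite large", and the buffer.memory and max.block.ms values are meant to mirror Kafka's documented defaults rather than a tuning recommendation.

```
akka.kafka.producer {
  # Illustrative value: high enough that the stage rarely backpressures
  # on uncompleted sends, so producer.send blocks first.
  parallelism = 10000

  # The producer stage runs on this dispatcher, which takes the blocking.
  use-dispatcher = "akka.kafka.default-dispatcher"

  # Properties passed through to the Kafka producer client.
  kafka-clients {
    # Total bytes available for buffering (assumed Kafka default, 32 MiB).
    buffer.memory = 33554432
    # How long send may block before a TimeoutException
    # (assumed Kafka default, 60 s).
    max.block.ms = 60000
  }
}
```

With settings like these, the effective limit on in-flight data shifts from the stage's parallelism to the Kafka client's buffer.memory.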

ennru commented 5 years ago

For Alpakka Kafka 2.0 we should simply change the default parallelism to something significantly higher.

ennru commented 5 years ago

The default parallelism is now 10,000 as of #944. This shows quite an improvement in our benchmarks, even for transactional use.