spring-cloud / spring-cloud-stream-binder-kafka

Spring Cloud Stream binders for Apache Kafka and Kafka Streams
Apache License 2.0
331 stars 301 forks source link

Kafka creating more partitions than hardcoded in the properties file #1190

Closed rshmvarma closed 2 years ago

rshmvarma commented 2 years ago

Services migrated to Spring cloud function Spring boot version: 2.5.5 Dependencies : 2020.0.4

After migrating to Spring cloud function , our service no longer honors the partition count of the producers.

Configuration :

#Binder configuration
spring.cloud.stream.kafka.binder.auto-add-partitions=false
spring.cloud.stream.kafka.binder.requiredAcks=1
spring.cloud.stream.kafka.binder.configuration.max.poll.records=10
spring.cloud.stream.kafka.binder.replication-factor=4

spring.cloud.stream.bindings.priceAbc-out-0.content-type=application/json
spring.cloud.stream.bindings.priceAbc-out-0.destination=pc-abc-price
spring.cloud.stream.bindings.priceAbc-out-0.producer.header-mode=none
spring.cloud.stream.bindings.priceAbc-out-0.producer.partition-count=10
spring.cloud.stream.bindings.priceAbc-out-0.producer.partitionKeyExpression=payload.key
spring.cloud.stream.kafka.bindings.priceAbc-out-0.producer.sync=true

Error: o.s.kafka.support.LoggingProducerListener - Exception thrown when sending a message with key='byte[14]' and payload='byte[385]' to topic pc-abc-price and partition 19: org.apache.kafka.common.errors.TimeoutException: Topic pc-abc-price not present in metadata after 60000 ms.

and similar errors all with partition count greater than 10.

Does it mean the Partition algorithm is somehow failing here?

P.S. In our organization topics and partitions are already created and are not created as and when the services are created.

One more thing I would like to mention here: Earlier I had added the property spring.cloud.stream.kafka.binder.auto-add-partitions=true

But reverted it back later because of the organizational policy.

Could it have added more partitions and kafka is now looking for data in these partitions. If yes, then while describing the topic I should see those partitions , but when I describe the topic I see only partitions from count 0-4.

sobychacko commented 2 years ago

@rshm-rewe What is the binding destination? It seems like you are sending the records to a topic named pc-abc-price, but it complains about another topic called pc-aggregated-price-v1. Is there a chance you can create a minimal sample and share it with us? That way, we can triage the issue further.

rshmvarma commented 2 years ago

HI @sobychacko , thank you for your response , I was trying to replace all occurrence of pc-aggregated-price-v1 with pc-abc-price , missed that one. I will try to reproduce it in a sample project.

rshmvarma commented 2 years ago

The issue was with the following properties: spring.cloud.stream.bindings.priceAbc-out-0.producer.partitionKeyExpression=payload.key spring.cloud.stream.kafka.bindings.priceAbc-out-0.producer.sync=true

Somehow the payload.key was calculating the partitions with a number greater than mentioned partition count. I am not sure if you all would like to investigate further on it, so closing it.