confluentinc / librdkafka

The Apache Kafka C/C++ library
Other
267 stars 3.15k forks source link

"consistent_random" and "murmur2_random" partitioners treat empty keys differently #4510

Open Quuxplusone opened 11 months ago

Quuxplusone commented 11 months ago

(Reporting this as an issue mainly so that there is a canonical place it's documented, for internal reference and for posterity.)

librdkafka provides several partitioners out-of-the-box:

They have the following behaviors:

Notice the inconsistency between consistent_random (the default) and murmur2_random (which various Internet folks recommend switching to: 1, 2). consistent_random will distribute zero-length keys among random partitions; murmur2_hash will send all zero-length keys to a single partition. There are two ways this could bite someone who switches partitioning schemes:

Because changing the behavior in either direction has the potential to break someone, I don't think there's any way for librdkafka to fix this inconsistency between how consistent_random and murmur2_random handle empty keys. I just wish we had known about it before it bit us.

emasab commented 11 months ago

@Quuxplusone thanks for sharing this. Hope will help people being aware of the changes to apply before switching to murmur2_random, that we cannot set as default for the same reason.

benissimo commented 11 months ago

Could you make murmur2_random the default and bump up the version of librdkafka, noting this as a breaking change, warning users to explicitly choose consistent_random if they were relying on it up til now (for the reasons listed here)?

I appreciate not wanting to surprise existing users. But for new users, it would be great if they inherited murmur2_random by default, so they don't run into issues the moment they ever start to share a topic between Java and non-Java producers. See https://www.confluent.io/blog/standardized-hashing-across-java-and-non-java-producers/

emasab commented 11 months ago

@benissimo we have a list of breaking changes that we want to do in a planned major release, and this is already included. We should also deprecate them before removing in next version and give time to developers to know that. In case we do a major release we should also backport fixes to previous major version until it goes out of support.

2.x was needed more than planned because of the upgrade of OSSL to 3.x but for the rest it's compatible.

This is to say that at some point it will be needed but we have to plan it and we're currently focused on these features: KIP-848, KIP-714 and KIP-951