wbarnha / kafka-python-ng

Fork for Python client for Apache Kafka
https://wbarnha.github.io/kafka-python-ng/
Apache License 2.0

Feat: FNV1a 32-Bit Partitioner #141

Open wbarnha opened 6 months ago

wbarnha commented 6 months ago

Summary

This PR introduces an FNV-1a 32-bit hasher as a new partitioner. Tests confirm that this implementation yields the same hash and partition selection as Goka/Sarama.

Why

Kafka leaves partition selection up to the producer. This can cause discrepancies when different client libraries are used: libraries in different programming languages may use different hashers, or configure the same hasher differently, so a message with a given key may arrive at a partition other than the one consumers expect.

This was the case for us at Beat. Messages produced by Python services using kafka-python arrived at partitions other than the ones our consumer services expected. The cause is that kafka-python uses murmur2 in its default partitioner, while the Goka library relies on Sarama, which uses FNV-1a 32-bit.

What

This PR introduces a new partitioner based on FNV-1a 32-bit. It is a separate, isolated implementation, so users can easily choose between the default partitioner and this one.

The partitioner is designed to match the partitioner of Goka/Sarama. It uses the "a" variant of the 32-bit FNV hash, applies two's-complement integer conversions to the hash, and uses the same parameters as those libraries. As a result, with this partitioner, kafka-python calculates the same hash and selects the same partition as Goka/Sarama. We verified this experimentally in our tests.
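To make the idea concrete, here is a minimal sketch of such a partitioner. It is not the code from this PR. The function names are hypothetical, and the partition step (reinterpret the hash as a signed 32-bit integer, take the absolute value, then take the remainder) is an assumption about how Sarama's default hash partitioner behaves. The FNV-1a constants are the standard 32-bit offset basis and prime.

```python
# Standard FNV-1a 32-bit parameters.
FNV32_OFFSET_BASIS = 0x811C9DC5
FNV32_PRIME = 0x01000193


def fnv1a_32(data: bytes) -> int:
    """Return the unsigned 32-bit FNV-1a hash of data."""
    h = FNV32_OFFSET_BASIS
    for byte in data:
        h ^= byte                            # "a" variant: XOR first...
        h = (h * FNV32_PRIME) & 0xFFFFFFFF   # ...then multiply, wrapping at 32 bits
    return h


def to_int32(value: int) -> int:
    """Reinterpret an unsigned 32-bit value as a two's-complement signed int."""
    value &= 0xFFFFFFFF
    return value - 0x100000000 if value >= 0x80000000 else value


def fnv1a_32_partitioner(key, all_partitions, available=None):
    """Hypothetical partitioner callable (key_bytes, all_partitions, available).

    Assumption: the target partition index is abs(int32(hash)) % N, which is
    meant to mirror a signed-remainder-then-negate scheme as used in Go.
    """
    if key is None:
        # Keyless messages need a separate strategy (e.g. random); out of scope here.
        raise ValueError("this sketch only handles keyed messages")
    index = abs(to_int32(fnv1a_32(key))) % len(all_partitions)
    return all_partitions[index]
```

kafka-python's `KafkaProducer` accepts a `partitioner` callable that is invoked with the serialized key, the list of all partitions, and the list of available partitions, so a function shaped like `fnv1a_32_partitioner` could be passed as `KafkaProducer(partitioner=fnv1a_32_partitioner, ...)`. Whether the result matches Goka/Sarama for every key depends on the assumption above and should be checked against the Go side.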

Who

I worked on this change together with Evgenia Martynova at Beat. Beat is a ride-hailing company that has engineering hubs in Athens and Amsterdam. We are always on the lookout for great engineers. We are hiring!

