Open wbarnha opened 6 months ago
Sample code pulled from one of our internal applications:

    # kafka_producer is configured with:
    #   "key_serializer": json.dumps,
    #   "value_serializer": json.dumps,
    key = None  # None produces round-robin
    if Const.FIELD_USER in message:
        key = message[Const.FIELD_USER]
    kafka_producer.send(topic, key=key, value=message)

Unsurprisingly, using `json.dumps` will serialize `key=None` to `'null'`.
Surprisingly, this results in `key=None` behaving as if it were a keyed message and always being sent to a single partition rather than round-robining.
This is because the serialization layer is processed before the partitioning logic. So by the time https://github.com/dpkp/kafka-python/blob/1.4.4/kafka/partitioner/default.py#L24 is hit, the key is already the string `'null'`.
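
To make the ordering concrete, here is a minimal sketch that does not use kafka-python at all. `toy_partition` and `send` are hypothetical stand-ins written for illustration: the toy partitioner hashes with `zlib.crc32` rather than whatever the real default partitioner uses, and only mimics the "no key means spread, key means hash" behaviour described above.

```python
import json
import random
import zlib


def toy_partition(key_bytes, partitions):
    """Hypothetical, simplified stand-in for a default partitioner."""
    if key_bytes is None:
        # Unkeyed message: spread across the available partitions.
        return random.choice(partitions)
    # Keyed message: deterministic hash, so the same key always maps
    # to the same partition.
    return partitions[zlib.crc32(key_bytes) % len(partitions)]


def send(key, partitions, key_serializer=json.dumps):
    """Hypothetical send path: serialize first, partition second."""
    serialized = key_serializer(key)
    if isinstance(serialized, str):
        serialized = serialized.encode("utf-8")
    # The partitioner only ever sees the serialized key.
    return toy_partition(serialized, partitions)


partitions = list(range(6))
chosen = {send(None, partitions) for _ in range(100)}

print(json.dumps(None))  # prints: null
print(len(chosen))       # prints: 1
```

Because `None` has already become the bytes `b'null'` by the time the partitioner runs, all 100 sends land on the same partition in this sketch.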
I found this extremely surprising... at a minimum we need to call this out in the docs.
Alternatively, we could offer default helpers that handle null keys/values (for deleting messages in compacted topics) in a less surprising way.
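
As a rough idea of what such a helper might look like, the sketch below wraps an arbitrary serializer so that `None` passes through untouched. The names `none_preserving` and `json_or_none` are hypothetical and are not part of kafka-python; the UTF-8 encoding step is also an assumption, added so that the result is bytes.

```python
import json


def none_preserving(serializer):
    """Wrap a serializer so that None is returned as None (hypothetical helper)."""
    def wrapper(obj):
        if obj is None:
            # Leave None alone so later stages can still see "no key"
            # or a null value.
            return None
        data = serializer(obj)
        # json.dumps returns str; encode so callers receive bytes.
        return data.encode("utf-8") if isinstance(data, str) else data
    return wrapper


json_or_none = none_preserving(json.dumps)

print(json_or_none(None))           # prints: None
print(json_or_none({"user": "a"}))  # prints: b'{"user": "a"}'
```

A wrapper like this could presumably be supplied as `key_serializer` and `value_serializer` in place of bare `json.dumps`, so that `key=None` reaches the partitioner as `None`.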
Related: https://github.com/dpkp/kafka-python/issues/913.