Closed cweerasooriya closed 1 year ago
Unless I'm mistaken, KIP-360 (mentioned in that ticket) implies that it's fundamentally unsafe to retry on UnknownProducerID. UnknownProducerID was originally intended to let clients recover and keep sending records, but the way to retry when a client sees UnknownProducerID is for the client to reset the sequence numbers of its records. This effectively makes the client act as if it is publishing to the partition anew. KIP-360 explains that this is fundamentally unsafe and can lead to unintended duplicates. The key line: "For the idempotent producer, the user can choose to fail or they can continue (with the possibility of duplication or reordering). If the user continues, the epoch will be bumped locally and the sequence number will be reset."
I think a fix for this in Sarama would be a config knob where users can opt in to unsafe recovery (this is similar to the path I chose in my own client, except that there users need to opt out of automatically continuing on potential data loss).
I agree with @twmb above. It is definitely unsafe to retry on UnknownProducerID.
If we feel that the fix is to provide a config option that lets users acknowledge the consequences and keep going, let's talk more about that. Pinging @d1egoaz @bai
Versions
Please specify real version numbers or git SHAs, not just "Latest" since that changes fairly regularly.
Configuration
What configuration values are you using for Sarama and Kafka?
Logs
When filing an issue please provide logs from Sarama and Kafka if at all possible. You can set sarama.Logger to a log.Logger to capture Sarama debug output.
Logs:
```
kafka server: The broker could not locate the producer metadata associated with the Producer ID.
```
Problem Description
We are publishing to a low-traffic topic. Our producer encountered this error yesterday. According to https://issues.apache.org/jira/browse/KAFKA-7190, this error can occur on low-traffic topics, and the producer should retry on it.
We observed that Sarama does not retry on this error, which causes the producer to fail.