segmentio / kafka-go

Kafka library in Go
MIT License

SASL Authentication Failed Error #1183

Open solpagolaae opened 1 year ago

solpagolaae commented 1 year ago

When we try to produce a message to a Kafka topic, we receive this error: "kafka.(*Client).Produce: [58] SASL Authentication Failed: SASL Authentication failed: [####]: Session too short."

Looking at the code, we can see that "SASL Authentication Failed" is produced when authenticateSASL() gets an EOF error, specifically while sending a request to a Kafka broker. The "kafka.(*Client).Produce" prefix shows that the error passed through the Produce() method in produce.go. However, we couldn't find where "Session too short" is added to the error.

The error is recovered when we retry, but we need to find out what is causing it.

Kafka Version

  • What version(s) of Kafka are you testing against? 3.2.0
  • What version of kafka-go are you using? v0.4.38

We don't have the steps to reproduce the issue.

Resources to reproduce the behavior:

We were using https://github.com/segmentio/kafka-go/tree/f4ca0b48296538eb5639027cd83f40a3828caf7d/sasl/aws_msk_iam, but have since migrated to https://github.com/segmentio/kafka-go/tree/f4ca0b48296538eb5639027cd83f40a3828caf7d/sasl/aws_msk_iam_v2. We see the same error with both versions.

Our definition of saslMechanism for IAM v2, where config is "github.com/aws/aws-sdk-go-v2/config":

cfg, err := config.LoadDefaultConfig(ctx, config.WithRegion(appConfig.AwsRegion))
saslMechanism := aws_msk_iam_v2.NewMechanism(cfg)

For the Dialer and Transport, where kafka is "github.com/segmentio/kafka-go":

dialer = &kafka.Dialer{
    Timeout:       appConfig.DialTimeout,
    DualStack:     true,
    SASLMechanism: saslMechanism,
    TLS:           &tls.Config{},
}
transport = &kafka.Transport{
    SASL:        saslMechanism,
    TLS:         &tls.Config{},
    MetadataTTL: appConfig.MetadataTtl,
}

Expected Behavior

The error should not occur. It would also be useful to add more detail to the error logs of this function, to identify exactly where the error is generated.

Additional Context

Could this be related to https://cwiki.apache.org/confluence/display/KAFKA/KIP-368%3A+Allow+SASL+Connections+to+Periodically+Re-Authenticate#KIP368:AllowSASLConnectionstoPeriodicallyReAuthenticate-DelayingSupportforBrokersKillingConnections ?

petedannemann commented 1 year ago

I think this is likely due to the same underlying issue as https://github.com/segmentio/kafka-go/issues/1093, which is exactly what you mentioned at the bottom of your issue: we would need to support re-authentication. kafka-go does not currently support that.

petedannemann commented 1 year ago

This comment may provide some more insight: https://github.com/twmb/franz-go/issues/249#issuecomment-1357853449