I was considering fixing some of the ClusterAdmin functions that seem to be missing the retry-on-ErrNoController logic. However, I noticed that the retryOnError implementation was a little surprising/confusing so I thought I'd ask about the intended behaviour before started improving the other methods.
Firstly, there is an off-by-one in the "X retries remaining" logs. So it reports " attempts remaining" after making one attempt and "1 attempts remaining" when it is going to make no further attempts.
Secondly, it sleeps after the final failed retry rather than immediately returning the failure which is surprising to me and should probably be documented if it is intentional or corrected.
I'd be happy to submit a PR to fix both of these issues but I wasn't sure what the preferred fix for the second issue would be?
You can see the odd behaviour with the following test.
func Test_retryOnError(t *testing.T) {
config := NewTestConfig()
config.Version = V1_0_0_0
config.Admin.Retry.Max = 3
config.Admin.Retry.Backoff = 100 * time.Millisecond
var buf bytes.Buffer
Logger = log.New(&buf, "[sarama] ", log.LstdFlags)
admin := &clusterAdmin{conf: config}
startTime := time.Now()
admin.retryOnError(func(error) bool { return true }, func() error {
return errors.New("mock error")
})
if strings.Contains(buf.String(), "3 attempts remaining") {
t.Fatalf("incorrect number of attempts remaining reported; should be max of 2 since one attempt has already been made")
}
if time.Since(startTime) >= 300*time.Millisecond {
t.Fatalf("attempt+sleep+attempt+sleep+attempt should take less than 3 * backoff time")
}
}
Versions
Problem Description
I was considering fixing some of the ClusterAdmin functions that seem to be missing the retry-on-ErrNoController logic. However, I noticed that the retryOnError implementation was a little surprising/confusing so I thought I'd ask about the intended behaviour before started improving the other methods.
Firstly, there is an off-by-one in the "X retries remaining" logs. So it reports " attempts remaining" after making one attempt and "1 attempts remaining" when it is going to make no further attempts.
Secondly, it sleeps after the final failed retry rather than immediately returning the failure which is surprising to me and should probably be documented if it is intentional or corrected.
I'd be happy to submit a PR to fix both of these issues but I wasn't sure what the preferred fix for the second issue would be?
You can see the odd behaviour with the following test.