Add an exponential backoff to the retry function #1993

Closed RicePatrick closed 2 weeks ago

This PR adds a 200% exponential backoff to the default retry function. This backoff only applies in situations where the Rate Limit header doesn't exist, which should be vanishingly few situations, but unfortunately does exist within the SaaS environment of GitLab: https://gitlab.com/gitlab-org/gitlab/-/issues/365728

Without the backoff, all 5 default retries would be exhausted in < 1-2 seconds, which would likely result in 429 errors being passed upstream since the API limits on SaaS are measured in requests per minute. With the backoff, the 100ms backoff translates to 3.2 seconds on the final wait (which isn't horrible). If a user increased the retry up to 10 requests, the final retry would wait just shy of 2 minutes.

This feels like a reasonable low impact change that would result in a good quality of life improvement for most users of the library :)

I also debated making the exponent configurable, but I'm not sure that's necessary, since you can already control the min and the max.

@svanharmelen - Want to take a look at this PR and let me know what you think?

Hmm... Why not use this instead: https://github.com/xanzy/go-gitlab/blob/main/client_options.go#L37

@svanharmelen - I debated it honestly, but this seemed like something that would make sense globally instead of something that was unique to the TF Provider. If you don't think it makes sense here, that's what I'd use on the provider :)

OK... But if you only want this to be applied in case the Rate Limit header doesn't exists, shouldn't the function be more like this?

// rateLimitBackoff provides a callback for Client.Backoff which will use the
// RateLimit-Reset header to determine the time to wait. We add some jitter
// to prevent a thundering herd.
//
// min and max are mainly used for bounding the jitter that will be added to
// the reset time retrieved from the headers. But if the final wait time is
// less then min, min will be used instead.
func rateLimitBackoff(min, max time.Duration, attemptNum int, resp *http.Response) time.Duration {
    // rnd is used to generate pseudo-random numbers.
    rnd := rand.New(rand.NewSource(time.Now().UnixNano()))

    // First create some jitter bounded by the min and max durations.
    jitter := time.Duration(rnd.Float64() * float64(max-min))

    if resp != nil {
        if v := resp.Header.Get(headerRateReset); v != "" {
            if reset, _ := strconv.ParseInt(v, 10, 64); reset > 0 {
                // Only update min if the given time to wait is longer.
                if wait := time.Until(time.Unix(reset, 0)); wait > min {
                    min = wait
                }
            }
        } else {
            // In case the RateLimit-Reset header is not set, back off an additional
            // 100% exponentially. With the default milliseconds being set to 100 for
            // `min`, this makes the 5th retry wait 3.2 seconds (3,200 ms) by default.
            min = time.Duration(float64(min) * math.Pow(2, float64(attemptNum)))
        }
    }

    return min + jitter
}

@svanharmelen - Sure! That'd probably make a bit more sense; I'll update it to that :)

Appreciate the lift!

xanzy / go-gitlab

Add an exponential backoff to the retry function #1993