hashicorp / go-retryablehttp

Retryable HTTP client in Go
Mozilla Public License 2.0
1.99k stars 251 forks source link

DefaultBackoff does not incorporate timeout which may lead to hanging requests #157

Open ThomasJanUta opened 2 years ago

ThomasJanUta commented 2 years ago

If a server sets the waiting time for the status "429 TooManyRequests" to 24 hours or even more, the client will hang for this amount of time. This scenario is not uncommon for servers that implement drastic rate limits.

This line in the backoff function

if resp.StatusCode == http.StatusTooManyRequests || resp.StatusCode == http.StatusServiceUnavailable

allows to wait if the server currently does not accept requests.

You can set a timeout for the http.client: client.HTTPClient.Timeout = time.Duration(timeout) * time.Second but it does not apply to the DefaultBackoff function.

Is this the desired default behavior?


This is the fix in my code I use. It currently still retries but at least it's not hanging:

func NewBackoff(min, max time.Duration, attemptNum int, resp *http.Response) time.Duration {
    timeout := int64(30)                                           // define timeout again
    if resp != nil {
        if resp.StatusCode == http.StatusTooManyRequests || resp.StatusCode == http.StatusServiceUnavailable {
            if s, ok := resp.Header["Retry-After"]; ok {
                if sleep, err := strconv.ParseInt(s[0], 10, 64); err == nil {
                    if sleep > timeout {             // waiting time supplied by server must not exceed timeout
                        return time.Duration(0)
                    }
                    return time.Second * time.Duration(sleep)
                }
            }
        }
    }

    mult := math.Pow(2, float64(attemptNum)) * float64(min)
    sleep := time.Duration(mult)
    if float64(sleep) != mult || sleep > max {
        sleep = max
    }
    return sleep
}

Is there a better fix? I can offer a pull requests if wanted.

arlandism commented 2 years ago

Agreed this seems undesirable.

Since max is passed in, it feels like the sleep time should be: min(serverRetryAfter, max)

SimonRichardson commented 2 years ago

Wouldn't a better fix to pass in a context, so that it can be controlled by the requester via context.WithTimeout or similar. That way we can just cancel it at any point anyway. This gives more flexibility, the only downside is that it's a breaking change.