Closed StephenBrown2 closed 5 years ago
I think our tack on this should probably be to match the requests API wherever possible (so no built-in retry functionality). What we should definitely do, though, is a good job of documenting how to implement this or other functionality with a custom dispatcher. That way we can make sure we're allowing for a range of differing behaviors, without having to extend our API surface area.
I'm not absolutely set on this one, so perhaps we could reassess it at a later date, but for the time being I'd like to treat it as out of scope.
I'm actually of the opposite opinion here. Every requests usage I've seen in the wild uses urllib3's Retry; let's not repeat that lapse of API. :)
Oh right, I was under the impression that it disables it, but I must be misremembering?
Okay, looks like it's disabled by default, but can be enabled in the HTTPAdapter API... https://github.com/kennethreitz/requests/blob/master/requests/adapters.py#L113
It's not documented in the QuickStart or Advanced guides there, but it is part of the API reference: https://2.python-requests.org/en/master/api/#requests.adapters.HTTPAdapter
So, sure, let's treat that as in-scope.
I have used backoff atm, but it would be nice to have some native solution 👍
Should the urllib3 team be pinged on implementation details? I could potentially work on porting it straight over, but I'm unsure about where it would lie in this library, and if someone else knows better on how it would interact with Sync vs Async they would probably be better to implement.
I'd love to be put on the review if you're willing to take a stab at implementation. :) No worries on getting it right the first time.
Also want to give attribution where it's due, so would probably start with a copy and reference comment, then work on async-ifying it... Will put some effort into it over the next couple weeks. I don't have much free time, so I'm not opposed to duplicate work on it.
It might be worth tackling it API-first, before jumping in on the implementation.
How many controls and dials do we really want the retry API to provide? What critical bits of control over that are actually needed/used in the wild? Whatever is the most minimal possible sounds good to me - we can always extend it out further in time if needed.
We may also want to think about allowing for more complex cases through customization rather than configuration. Eg. only provide for specifying "allow N retries", but also provide an implementation hook that folks can build more complex cases off of.
That way you keep the API surface area nice and low, while still allowing flexibility, or third party packages.
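One way to read the "customization over configuration" idea is a tiny abstract policy with a single overridable decision method. The names here (RetryPolicy, should_retry, SimpleRetries) are purely illustrative, not part of any proposed API:

```python
from abc import ABC, abstractmethod


class RetryPolicy(ABC):
    """Hypothetical hook-style interface: configuration stays minimal,
    and complex behaviour comes from overriding one method."""

    @abstractmethod
    def should_retry(self, attempt: int, status_code: int) -> bool:
        """Return True if the failed attempt should be retried."""


class SimpleRetries(RetryPolicy):
    """The basic "allow N retries" case, expressed through the hook."""

    def __init__(self, limit: int = 3) -> None:
        self.limit = limit

    def should_retry(self, attempt: int, status_code: int) -> bool:
        # Retry server-side throttling/outage codes until the limit is hit.
        return attempt < self.limit and status_code in {429, 503}
```

Third-party packages could then ship their own RetryPolicy subclasses without the core library growing new knobs.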
I'll start by enumerating urllib3's existing functionality for Retry:

- read: Number of times that a read operation can fail.
- write: Number of times that a write operation can fail.
- connect: Number of times that an operation related to creating a connection can fail. (Doesn't apply to read timeouts on TLS; those are under read.)
- redirect: Number of times that a redirect can be followed.
- status: Number of times that we can retry a request based on the received response's status code being in the status_forcelist and the request method being in method_whitelist.
- raise_on_redirect and raise_on_status: whether we should raise an error or just return the response.
- respect_retry_after: whether a retry based on a response should respect the Retry-After header by sleeping.
- remove_headers_on_redirect: a set of headers that should be removed from subsequent requests when a redirect is issued.

IMO the raise_on_redirect and raise_on_status are things that don't need to be attached to the Retry class?
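For reference, the fields enumerated above are configured on urllib3's Retry and mounted on a requests HTTPAdapter roughly like this (note that method_whitelist has since been renamed allowed_methods in urllib3 1.26+; values here are arbitrary examples):

```python
import requests
from requests.adapters import HTTPAdapter
from urllib3.util.retry import Retry

# Retry connection failures and 429/503 responses, for GET/HEAD only.
retry = Retry(
    total=3,                       # overall cap across all categories
    connect=2,
    read=2,
    status=3,
    status_forcelist=[429, 503],
    allowed_methods=["GET", "HEAD"],  # formerly `method_whitelist`
    backoff_factor=0.5,
    respect_retry_after_header=True,
)

session = requests.Session()
session.mount("https://", HTTPAdapter(max_retries=retry))
```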
Here's a typed skeleton of the current Retry object's public interface: https://gist.github.com/StephenBrown2/2fc6bab18b30037488deb0f4db92e001
So I definitely want the new Retry object to have one critical feature that actually makes sub-classing useful: a subclassable method that receives the Request and the Response, decides by some interface whether a retry occurs, and in addition allows modifying the Request that will be emitted for that retry.
This exact functionality is tough to implement in urllib3 because we don't allow you to modify headers and we don't give the user everything they might want to know (such as the whole request that was sent).
Doing this allows so many things to be implemented simply by sub-classing the Retry. It's actually pretty critical that we get this interface right, because there's a lot of functionality related to HTTP that involves multiple requests (authentication, stripping headers on redirect, Retry-After, resumable uploads, caching) that users have been asking for but that is tough to implement without the right interface.
I'd rather Retry was an ABC and could be implemented without subclassing
Could you explain the benefit of not needing to implement with sub-classing in this situation? I'm not seeing one right away.
There’s some great stuff to dig into here, though I don’t think it’s on the critical path.
The API will be retries=int|RetryConfig, available either per-client or per-request. Finer-grained control and/or a method override will then exist on the RetryConfig.
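That retries=int|RetryConfig shape would presumably normalize at the call boundary, along these lines (RetryConfig here is an illustrative stand-in, not the real class):

```python
from dataclasses import dataclass
from typing import Union


@dataclass
class RetryConfig:
    """Illustrative stand-in for the proposed fine-grained config."""
    total: int = 3


def normalize_retries(retries: Union[int, RetryConfig]) -> RetryConfig:
    # Accept `retries=3` as shorthand for RetryConfig(total=3).
    if isinstance(retries, int):
        return RetryConfig(total=retries)
    return retries
```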
Personally, I don't see a need to support the int option, as all the options could have good defaults and one can simply use retries=Retries() to get those.
What I would like to see, as the minimum for my needs right now, is:

- total: int (default 3). Needs to be an overall limit; read/write specificity is not needed, but I can see it would be useful since reads would be retryable much more often than writes, though that can be handled with...
- status_codes: set (default frozenset({429, 503})). A set of status codes (or HTTPStatus values?) to retry on, as the API I'm working with has no real concept of the difference between GET and POST, so also good would be...
- http_methods: set (default frozenset({"HEAD", "GET", "PUT", "DELETE", "OPTIONS", "TRACE"})). A set of HTTP methods that can be retried; this would take care of the read/write specificity as mentioned above.
- sleep/backoff: int (default 0). I would assume respect_retry_after to be True by default; specifying the sleep option would be a constant time to wait between retries if there is no Retry-After header, while backoff would be an exponential backoff factor.
https://pypi.org/project/backoff/ as mentioned by @Hellowlol might be something to look at for inspiration as well.
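As a sketch of how the proposed sleep/backoff pair could interact (the function name and semantics here are just one possible reading of the proposal, not an agreed design):

```python
def retry_delay(attempt: int, sleep: float = 0.0, backoff: float = 0.0,
                max_wait: float = 60.0) -> float:
    """Constant delay when only `sleep` is given; exponential growth when a
    `backoff` factor is set. Either way, the result is capped at `max_wait`."""
    delay = backoff * (2 ** (attempt - 1)) if backoff > 0 else sleep
    return min(delay, max_wait)
```

With backoff=0.5 the delays grow 0.5, 1.0, 2.0, ... per attempt until the cap; with only sleep set, every wait is the same constant.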
Noting here that having respect_retry_after on by default can cause a DoS attack on the HTTP client by setting excessive wait times. If it's on by default we need a reasonably short max_retry_after value.
And I'm of the opinion that having great defaults for the RetryConfig makes accepting an int as shorthand for total ever more desirable, rather than less.
100%. No reason folks should have to think about this.
How about just a max_retry_after which can be set to 0 to not respect Retry-After?
I'm okay with having max_retry_after being zero mean we don't respect Retry-After, and I guess None being no limit? We just have to be careful designing APIs where specific values have special meanings, because it makes extending those APIs tougher, more confusing for the user, and makes code using our library harder to read. In this case I think that Retry-After is very well defined and unlikely to be extended, so it can be treated this way.
Since 0 is Falsey, that would work, though I might prefer an explicit False being documented; the check would remain the same with if respect_retry_after and max_retry_after: or similar.
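The special-value semantics being discussed could look like this (a sketch; the function name is made up for illustration):

```python
from typing import Optional


def effective_retry_after(header_value: float,
                          max_retry_after: Optional[float]) -> float:
    """Sketch of the discussed semantics: max_retry_after=0 disables
    Retry-After entirely, None means no cap, and any other value clamps
    the server-supplied delay."""
    if max_retry_after == 0:
        return 0.0           # header ignored entirely
    if max_retry_after is None:
        return header_value  # no limit on the server's request
    return min(header_value, max_retry_after)
```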
Per #134 we're going to hold off on implementing this feature.
I was looking into this retry feature being available in httpx (the use case is to retry based on certain HTTP status codes), but it seems like this feature is still not supported yet (or I am trying to do something wrong). Just wanted to confirm whether this feature is supported already, and if not, when we can expect it to be available.
I implemented retries in an HTTPTransport subclass recently. Whilst I can by no means claim this is good code, nor even that it's the right approach, nor that it would work for anything other than my specific use case, I thought sharing it might at least give others a possible starting point:
https://gitlab.com/openid/conformance-suite/-/blob/master/scripts/conformance.py#L19
Above is for synchronous requests. There's an async version here:
https://gitlab.com/openid/conformance-suite/-/blob/httpx-async/scripts/conformance.py#L19
But I've not been able to get async requests to work reliably for me (for reasons currently unknown), so I'm currently only using the sync version. (I'm still trying to get to the bottom of various weird things going on, e.g. https://github.com/encode/httpx/discussions/2056 )
(Critiques of the code are very welcome. The suggestion of using 'backoff' above might've been a better approach, sadly I didn't notice that suggestion before I went this way.)
I would love attention to be brought back to this feature implementation as time permits.
Is there any recommended way to implement a retry mechanism with httpx?
Would love to see this functionality in httpx. Started switching to the library and was disappointed by the lack of this feature or a suggested workaround, seems like a pretty common feature you would expect from a modern HTTP lib.
So... "retries" in the context of HTTP could describe several different types of use-case.
The valuable thing to do here would be to describe very specifically what behaviour you're trying to deal with.
Talking through an actual "here's my specific problem" use-case, will help move the conversation forward.
httpx does have connection retry functionality built-in, although it's not really highlighted in the documentation. That might or might not be what you're looking for.
Use case: as a developer I would like to be able to quickly and consistently implement retry/backoff strategies with my HTTP client, without having to re-write this each time... much like this strategy: https://honeyryderchuck.gitlab.io/httpx/wiki/Retries.html
here's a retry HTTP transport wrapper, inspired partially by urllib3.util.Retry:

import random
from datetime import datetime
from time import sleep
from typing import Iterable, Mapping, Optional, Union

import httpx
from dateutil.parser import isoparse


class RetryTransport(httpx.AsyncBaseTransport, httpx.BaseTransport):
    RETRYABLE_METHODS = frozenset(
        ["HEAD", "GET", "PUT", "DELETE", "OPTIONS", "TRACE"]
    )
    RETRYABLE_STATUS_CODES = frozenset([413, 429, 503, 504])
    MAX_BACKOFF_WAIT = 60

    def __init__(
        self,
        wrapped_transport: Union[httpx.BaseTransport, httpx.AsyncBaseTransport],
        max_attempts: int = 10,
        max_backoff_wait: float = MAX_BACKOFF_WAIT,
        backoff_factor: float = 0.1,
        jitter_ratio: float = 0.1,
        respect_retry_after_header: bool = True,
        retryable_methods: Optional[Iterable[str]] = None,
        retry_status_codes: Optional[Iterable[int]] = None,
    ) -> None:
        self.wrapped_transport = wrapped_transport
        if jitter_ratio < 0 or jitter_ratio > 0.5:
            raise ValueError(f"jitter ratio should be between 0 and 0.5, actual {jitter_ratio}")
        self.max_attempts = max_attempts
        self.backoff_factor = backoff_factor
        self.respect_retry_after_header = respect_retry_after_header
        self.retryable_methods = frozenset(retryable_methods) if retryable_methods else self.RETRYABLE_METHODS
        self.retry_status_codes = frozenset(retry_status_codes) if retry_status_codes else self.RETRYABLE_STATUS_CODES
        self.jitter_ratio = jitter_ratio
        self.max_backoff_wait = max_backoff_wait

    def _calculate_sleep(self, attempts_made: int, headers: Union[httpx.Headers, Mapping[str, str]]) -> float:
        retry_after_header = (headers.get("Retry-After") or "").strip()
        if self.respect_retry_after_header and retry_after_header:
            if retry_after_header.isdigit():
                return float(retry_after_header)
            try:
                parsed_date = isoparse(retry_after_header).astimezone()  # converts to local time
                diff = (parsed_date - datetime.now().astimezone()).total_seconds()
                if diff > 0:
                    return min(diff, self.max_backoff_wait)
            except ValueError:
                pass
        backoff = self.backoff_factor * (2 ** (attempts_made - 1))
        jitter = (backoff * self.jitter_ratio) * random.choice([1, -1])
        total_backoff = backoff + jitter
        return min(total_backoff, self.max_backoff_wait)

    def handle_request(self, request: httpx.Request) -> httpx.Response:
        response = self.wrapped_transport.handle_request(request)
        if request.method not in self.retryable_methods:
            return response
        remaining_attempts = self.max_attempts - 1
        attempts_made = 1
        while True:
            if remaining_attempts < 1 or response.status_code not in self.retry_status_codes:
                return response
            response.close()
            sleep_for = self._calculate_sleep(attempts_made, response.headers)
            sleep(sleep_for)
            response = self.wrapped_transport.handle_request(request)
            attempts_made += 1
            remaining_attempts -= 1

    async def handle_async_request(self, request: httpx.Request) -> httpx.Response:
        response = await self.wrapped_transport.handle_async_request(request)
        if request.method not in self.retryable_methods:
            return response
        remaining_attempts = self.max_attempts - 1
        attempts_made = 1
        while True:
            if remaining_attempts < 1 or response.status_code not in self.retry_status_codes:
                return response
            response.close()
            sleep_for = self._calculate_sleep(attempts_made, response.headers)
            sleep(sleep_for)
            response = await self.wrapped_transport.handle_async_request(request)
            attempts_made += 1
            remaining_attempts -= 1
Like @matt-mercer, I think this feature, even in a limited form, would be incredibly valuable.
This would be a great feature to have (at least optionally), thanks @matt-mercer!
I tried to implement it and came across the following two issues:

- In handle_async_request(), response.close() needs to be replaced with await response.aclose().
- If a Retry-After longer than max_backoff_wait is provided, what should happen? Just wait the given time, or return the 429 response? The API I am dealing with returns, for example, Retry-After: 245, which feels a bit too long to wait...

An improved version of what was proposed by @matt-mercer:
import asyncio
import random
import time
from datetime import datetime
from functools import partial
from http import HTTPStatus
from typing import Any, Callable, Coroutine, Iterable, Mapping, Optional, Union

import httpx
from dateutil.parser import isoparse


class RetryTransport(httpx.AsyncBaseTransport, httpx.BaseTransport):
    """
    A custom HTTP transport that automatically retries requests using an exponential backoff strategy
    for specific HTTP status codes and request methods.

    Args:
        wrapped_transport (Union[httpx.BaseTransport, httpx.AsyncBaseTransport]): The underlying HTTP transport
            to wrap and use for making requests.
        max_attempts (int, optional): The maximum number of times to retry a request before giving up. Defaults to 10.
        max_backoff_wait (float, optional): The maximum time to wait between retries in seconds. Defaults to 60.
        backoff_factor (float, optional): The factor by which the wait time increases with each retry attempt.
            Defaults to 0.1.
        jitter_ratio (float, optional): The amount of jitter to add to the backoff time. Jitter is a random
            value added to the backoff time to avoid a "thundering herd" effect. The value should be between
            0 and 0.5. Defaults to 0.1.
        respect_retry_after_header (bool, optional): Whether to respect the Retry-After header in HTTP responses
            when deciding how long to wait before retrying. Defaults to True.
        retryable_methods (Iterable[str], optional): The HTTP methods that can be retried. Defaults to
            ["HEAD", "GET", "PUT", "DELETE", "OPTIONS", "TRACE"].
        retry_status_codes (Iterable[int], optional): The HTTP status codes that can be retried. Defaults to
            [429, 502, 503, 504].

    Attributes:
        _wrapped_transport (Union[httpx.BaseTransport, httpx.AsyncBaseTransport]): The underlying HTTP transport
            being wrapped.
        _max_attempts (int): The maximum number of times to retry a request.
        _backoff_factor (float): The factor by which the wait time increases with each retry attempt.
        _respect_retry_after_header (bool): Whether to respect the Retry-After header in HTTP responses.
        _retryable_methods (frozenset): The HTTP methods that can be retried.
        _retry_status_codes (frozenset): The HTTP status codes that can be retried.
        _jitter_ratio (float): The amount of jitter to add to the backoff time.
        _max_backoff_wait (float): The maximum time to wait between retries in seconds.
    """

    RETRYABLE_METHODS = frozenset(["HEAD", "GET", "PUT", "DELETE", "OPTIONS", "TRACE"])
    RETRYABLE_STATUS_CODES = frozenset(
        [
            HTTPStatus.TOO_MANY_REQUESTS,
            HTTPStatus.BAD_GATEWAY,
            HTTPStatus.SERVICE_UNAVAILABLE,
            HTTPStatus.GATEWAY_TIMEOUT,
        ]
    )
    MAX_BACKOFF_WAIT = 60

    def __init__(
        self,
        wrapped_transport: Union[httpx.BaseTransport, httpx.AsyncBaseTransport],
        max_attempts: int = 10,
        max_backoff_wait: float = MAX_BACKOFF_WAIT,
        backoff_factor: float = 0.1,
        jitter_ratio: float = 0.1,
        respect_retry_after_header: bool = True,
        retryable_methods: Optional[Iterable[str]] = None,
        retry_status_codes: Optional[Iterable[int]] = None,
    ) -> None:
        self._wrapped_transport = wrapped_transport
        if jitter_ratio < 0 or jitter_ratio > 0.5:
            raise ValueError(
                f"Jitter ratio should be between 0 and 0.5, actual {jitter_ratio}"
            )
        self._max_attempts = max_attempts
        self._backoff_factor = backoff_factor
        self._respect_retry_after_header = respect_retry_after_header
        self._retryable_methods = (
            frozenset(retryable_methods)
            if retryable_methods
            else self.RETRYABLE_METHODS
        )
        self._retry_status_codes = (
            frozenset(retry_status_codes)
            if retry_status_codes
            else self.RETRYABLE_STATUS_CODES
        )
        self._jitter_ratio = jitter_ratio
        self._max_backoff_wait = max_backoff_wait

    def handle_request(self, request: httpx.Request) -> httpx.Response:
        """
        Sends an HTTP request, possibly with retries.

        Args:
            request (httpx.Request): The request to send.

        Returns:
            httpx.Response: The response received.
        """
        transport: httpx.BaseTransport = self._wrapped_transport
        if request.method in self._retryable_methods:
            send_method = partial(transport.handle_request)
            response = self._retry_operation(request, send_method)
        else:
            response = transport.handle_request(request)
        return response

    async def handle_async_request(self, request: httpx.Request) -> httpx.Response:
        """Sends an HTTP request, possibly with retries.

        Args:
            request: The request to perform.

        Returns:
            The response.
        """
        transport: httpx.AsyncBaseTransport = self._wrapped_transport
        if request.method in self._retryable_methods:
            send_method = partial(transport.handle_async_request)
            response = await self._retry_operation_async(request, send_method)
        else:
            response = await transport.handle_async_request(request)
        return response

    async def aclose(self) -> None:
        """
        Closes the underlying HTTP transport, terminating all outstanding connections and rejecting any further
        requests. This should be called before the object is dereferenced, to ensure that connections are
        properly cleaned up.
        """
        transport: httpx.AsyncBaseTransport = self._wrapped_transport
        await transport.aclose()

    def close(self) -> None:
        """
        Closes the underlying HTTP transport, terminating all outstanding connections and rejecting any further
        requests. This should be called before the object is dereferenced, to ensure that connections are
        properly cleaned up.
        """
        transport: httpx.BaseTransport = self._wrapped_transport
        transport.close()

    def _calculate_sleep(
        self, attempts_made: int, headers: Union[httpx.Headers, Mapping[str, str]]
    ) -> float:
        # Retry-After
        # The Retry-After response HTTP header indicates how long the user agent should wait before
        # making a follow-up request. There are three main cases this header is used:
        # - When sent with a 503 (Service Unavailable) response, this indicates how long the service
        #   is expected to be unavailable.
        # - When sent with a 429 (Too Many Requests) response, this indicates how long to wait before
        #   making a new request.
        # - When sent with a redirect response, such as 301 (Moved Permanently), this indicates the
        #   minimum time that the user agent is asked to wait before issuing the redirected request.
        retry_after_header = (headers.get("Retry-After") or "").strip()
        if self._respect_retry_after_header and retry_after_header:
            if retry_after_header.isdigit():
                return float(retry_after_header)
            try:
                parsed_date = isoparse(
                    retry_after_header
                ).astimezone()  # converts to local time
                diff = (parsed_date - datetime.now().astimezone()).total_seconds()
                if diff > 0:
                    return min(diff, self._max_backoff_wait)
            except ValueError:
                pass
        backoff = self._backoff_factor * (2 ** (attempts_made - 1))
        jitter = (backoff * self._jitter_ratio) * random.choice([1, -1])
        total_backoff = backoff + jitter
        return min(total_backoff, self._max_backoff_wait)

    async def _retry_operation_async(
        self,
        request: httpx.Request,
        send_method: Callable[..., Coroutine[Any, Any, httpx.Response]],
    ) -> httpx.Response:
        remaining_attempts = self._max_attempts
        attempts_made = 0
        response: Optional[httpx.Response] = None
        while True:
            if attempts_made > 0:
                # Pass the previous response's headers so Retry-After is honoured.
                headers = response.headers if response is not None else {}
                await asyncio.sleep(self._calculate_sleep(attempts_made, headers))
            response = await send_method(request)
            if (
                remaining_attempts < 1
                or response.status_code not in self._retry_status_codes
            ):
                return response
            await response.aclose()
            attempts_made += 1
            remaining_attempts -= 1

    def _retry_operation(
        self,
        request: httpx.Request,
        send_method: Callable[..., httpx.Response],
    ) -> httpx.Response:
        remaining_attempts = self._max_attempts
        attempts_made = 0
        response: Optional[httpx.Response] = None
        while True:
            if attempts_made > 0:
                # Pass the previous response's headers so Retry-After is honoured.
                headers = response.headers if response is not None else {}
                time.sleep(self._calculate_sleep(attempts_made, headers))
            response = send_method(request)
            if (
                remaining_attempts < 1
                or response.status_code not in self._retry_status_codes
            ):
                return response
            response.close()
            attempts_made += 1
            remaining_attempts -= 1
There is actually the AsyncHTTPTransport class, which has the 'retries' parameter. It's not documented explicitly, but you can see its usage at https://www.python-httpx.org/async/ under the subsection "Explicit transport instances".
AsyncHTTPTransport is implemented in the httpcore package in the connection class: https://github.com/encode/httpcore/blob/master/httpcore/_async/connection.py
I am using this to retry on failed connections, not on bad HTTP status. The only thing it's missing is the ability to set the retry delays explicitly; it has an exponential backoff time implemented. It might be nice to be able to control the delay time in the future.
That is for retrying failed connections only. It does not retry on bad HTTP status codes as far as I can tell.
Thanks for pointing that out!
Is there a plan to add the RetryTransport behaviour to the project? I can send a PR for it.
If someone is motivated to provide this functionality, then I'd suggest a third party package... https://www.python-httpx.org/third_party_packages/
urllib3 has a very convenient Retry utility (docs) that I have found to be quite useful when dealing with flaky APIs. http3's Clients don't support this sort of thing yet, but I would love it if they did!
In the meantime, I can probably work out my own with a while loop checking the response code.