Closed StephenBrown2 closed 5 years ago
I think our tack on this should probably be to match the requests API wherever possible (so no built-in retry functionality). What we should definitely do, though, is a good job of documenting how to implement this or other functionality with a custom dispatcher. That way we can make sure we're allowing for a range of differing behaviors, without having to extend our API surface area.
I'm not absolutely set on this one, so perhaps we could reassess it at a later date, but for the time being I'd like to treat it as out of scope.
I'm actually of the opposite opinion here. Every requests usage I've seen in the wild uses urllib3's Retry; let's not repeat that lapse of API. :)
Oh right, I was under the impression that it disables it, but I must be misremembering?
Okay, looks like it's disabled by default, but can be enabled in the HTTPAdapter API... https://github.com/kennethreitz/requests/blob/master/requests/adapters.py#L113
It's not documented in the QuickStart or Advanced guides there, but it is part of the API reference: https://2.python-requests.org/en/master/api/#requests.adapters.HTTPAdapter
So, sure, let's treat that as in-scope.
I have used backoff atm, but it would be nice to have some native solution 👍
Should the urllib3 team be pinged on implementation details? I could potentially work on porting it straight over, but I'm unsure about where it would lie in this library, and if someone else knows better on how it would interact with Sync vs Async they would probably be better to implement.
I'd love to be put on the review if you're willing to take a stab at implementation. :) No worries on getting it right the first time.
Also want to give attribution where it's due, so would probably start with a copy and reference comment, then work on async-ifying it... Will put some effort into it over the next couple weeks. I don't have much free time, so I'm not opposed to duplicate work on it.
It might be worth tackling it API-first, before jumping in on the implementation.
How many controls and dials do we really want the retry API to provide? What critical bits of control over that are actually needed/used in the wild? Whatever is the most minimal possible sounds good to me - we can always extend it out further in time if needed.
We may also want to think about allowing for more complex cases through customization rather than configuration. Eg. only provide for specifying "allow N retries", but also provide an implementation hook that folks can build more complex cases off of.
That way you keep the API surface area nice and low, while still allowing flexibility, or third party packages.
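One way to read the "customization over configuration" idea is a tiny abstract policy with a single overridable decision method. The names here (RetryPolicy, should_retry, SimpleRetries) are purely illustrative, not part of any proposed API:

```python
from abc import ABC, abstractmethod


class RetryPolicy(ABC):
    """Hypothetical hook-style interface: configuration stays minimal,
    and complex behaviour comes from overriding one method."""

    @abstractmethod
    def should_retry(self, attempt: int, status_code: int) -> bool:
        """Return True if the failed attempt should be retried."""


class SimpleRetries(RetryPolicy):
    """The basic "allow N retries" case, expressed through the hook."""

    def __init__(self, limit: int = 3) -> None:
        self.limit = limit

    def should_retry(self, attempt: int, status_code: int) -> bool:
        # Retry server-side throttling/outage codes until the limit is hit.
        return attempt < self.limit and status_code in {429, 503}
```

Third-party packages could then ship their own RetryPolicy subclasses without the core library growing new knobs.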
I'll start by enumerating urllib3's existing functionality for Retry:

- read: Number of times that a read operation can fail.
- write: Number of times that a write operation can fail.
- connect: Number of times that an operation related to creating a connection can fail. (Doesn't apply to read timeouts on TLS; those are under read.)
- redirect: Number of times that a redirect can be followed.
- status: Number of times that we can retry a request based on the received response's status code being in the status_forcelist and the request method being in method_whitelist.
- raise_on_redirect and raise_on_status: whether we should raise an error or just return the response.
- respect_retry_after: whether a retry based on a response should respect the Retry-After header by sleeping.
- remove_headers_on_redirect: a set of headers that should be removed from subsequent requests when a redirect is issued.

IMO the raise_on_redirect and raise_on_status are things that don't need to be attached to the Retry class?
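For reference, the fields enumerated above are configured on urllib3's Retry and mounted on a requests HTTPAdapter roughly like this (note that method_whitelist has since been renamed allowed_methods in urllib3 1.26+; values here are arbitrary examples):

```python
import requests
from requests.adapters import HTTPAdapter
from urllib3.util.retry import Retry

# Retry connection failures and 429/503 responses, for GET/HEAD only.
retry = Retry(
    total=3,                       # overall cap across all categories
    connect=2,
    read=2,
    status=3,
    status_forcelist=[429, 503],
    allowed_methods=["GET", "HEAD"],  # formerly `method_whitelist`
    backoff_factor=0.5,
    respect_retry_after_header=True,
)

session = requests.Session()
session.mount("https://", HTTPAdapter(max_retries=retry))
```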
Here's a typed skeleton of the current Retry object's public interface: https://gist.github.com/StephenBrown2/2fc6bab18b30037488deb0f4db92e001
So I definitely want the new Retry object to have one critical feature that actually makes sub-classing useful: a subclassable method that receives the Request and the Response, decides by some interface whether a retry occurs, and in addition allows modifying the Request that will be emitted for that retry.
This exact functionality is tough to implement in urllib3 because we don't allow you to modify headers and we don't give the user everything they might want to know (such as the whole request that was sent).
Doing this allows so many things to be implemented simply by sub-classing the Retry. It's actually pretty critical that we get this interface right, because there's a lot of functionality related to HTTP that involves multiple requests (authentication, stripping headers on redirect, Retry-After, resumable uploads, caching) that users have been asking for but that is tough to implement without the right interface.
I'd rather Retry was an ABC and could be implemented without subclassing
Could you explain the benefit of not needing to implement with sub-classing in this situation? I'm not seeing one right away.
There’s some great stuff to dig into here, though I don’t think it’s on the critical path.
The API will be retries=int|RetryConfig, available either per-client or per-request. Finer-grained control and/or a method override will then exist on the RetryConfig.
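That retries=int|RetryConfig shape would presumably normalize at the call boundary, along these lines (RetryConfig here is an illustrative stand-in, not the real class):

```python
from dataclasses import dataclass
from typing import Union


@dataclass
class RetryConfig:
    """Illustrative stand-in for the proposed fine-grained config."""
    total: int = 3


def normalize_retries(retries: Union[int, RetryConfig]) -> RetryConfig:
    # Accept `retries=3` as shorthand for RetryConfig(total=3).
    if isinstance(retries, int):
        return RetryConfig(total=retries)
    return retries
```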
Personally, I don't see a need to support the int option, as all the options could have good defaults and one can simply use retries=Retries() to get those.
What I would like to see, as the minimum for my needs right now, is:

- total: int (default 3). Needs to be an overall limit; read/write specificity is not needed, but I can see it would be useful since reads would be retryable much more often than writes, though that can be handled with...
- status_codes: set (default frozenset({429, 503})). A set of status codes (or HTTPStatus values?) to retry on, as the API I'm working with has no real concept of the difference between GET and POST, so also good would be...
- http_methods: set (default frozenset({"HEAD", "GET", "PUT", "DELETE", "OPTIONS", "TRACE"})). A set of HTTP methods that can be retried; this would take care of the read/write specificity as mentioned above.
- sleep/backoff: int (default 0). I would assume respect_retry_after to be True by default; specifying the sleep option would be a constant time to wait between retries if there is no Retry-After header, while backoff would be an exponential backoff factor.
https://pypi.org/project/backoff/ as mentioned by @Hellowlol might be something to look at for inspiration as well.
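As a sketch of how the proposed sleep/backoff pair could interact (the function name and semantics here are just one possible reading of the proposal, not an agreed design):

```python
def retry_delay(attempt: int, sleep: float = 0.0, backoff: float = 0.0,
                max_wait: float = 60.0) -> float:
    """Constant delay when only `sleep` is given; exponential growth when a
    `backoff` factor is set. Either way, the result is capped at `max_wait`."""
    delay = backoff * (2 ** (attempt - 1)) if backoff > 0 else sleep
    return min(delay, max_wait)
```

With backoff=0.5 the delays grow 0.5, 1.0, 2.0, ... per attempt until the cap; with only sleep set, every wait is the same constant.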
Noting here that having respect_retry_after on by default can cause a DoS attack on the HTTP client by setting excessive wait times. If it's on by default we need a reasonably short max_retry_after value.
And I'm of the opinion that having great defaults for the RetryConfig makes accepting an int as shorthand for total ever more desirable, rather than less.
100%. No reason folks should have to think about this.
How about just a max_retry_after which can be set to 0 to not respect Retry-After?
I'm okay with having max_retry_after being zero mean we don't respect Retry-After, and I guess None being no limit? We just have to be careful designing APIs where specific values have special meanings, because it makes extending those APIs tougher, more confusing for the user, and makes code using our library harder to read. In this case I think that Retry-After is very well defined and unlikely to be extended, so it can be treated this way.
Since 0 is Falsey, that would work, though I might prefer an explicit False being documented; the check would remain the same with if respect_retry_after and max_retry_after: or similar.
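The special-value semantics being discussed could look like this (a sketch; the function name is made up for illustration):

```python
from typing import Optional


def effective_retry_after(header_value: float,
                          max_retry_after: Optional[float]) -> float:
    """Sketch of the discussed semantics: max_retry_after=0 disables
    Retry-After entirely, None means no cap, and any other value clamps
    the server-supplied delay."""
    if max_retry_after == 0:
        return 0.0           # header ignored entirely
    if max_retry_after is None:
        return header_value  # no limit on the server's request
    return min(header_value, max_retry_after)
```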
Per #134 we're going to hold off on implementing this feature.
I was looking into this retry feature being available in httpx (the use case is to retry based on certain HTTP status codes), but it seems like this feature is still not supported yet (or I am trying to do something wrong). Just wanted to confirm whether this feature is supported already, and if not, when we can expect it to be available.
I implemented retries in an HTTPTransport subclass recently. Whilst I can by no means claim this is good code, nor even that it's the right approach, nor that it would work for anything other than my specific use case, I thought sharing it might at least give others a possible starting point:
https://gitlab.com/openid/conformance-suite/-/blob/master/scripts/conformance.py#L19
Above is for synchronous requests. There's an async version here:
https://gitlab.com/openid/conformance-suite/-/blob/httpx-async/scripts/conformance.py#L19
But I've not been able to get async requests to work reliably for me (for reasons currently unknown), so I'm currently only using the sync version. (I'm still trying to get to the bottom of various weird things going on, e.g. https://github.com/encode/httpx/discussions/2056 )
(Critiques of the code are very welcome. The suggestion of using 'backoff' above might've been a better approach, sadly I didn't notice that suggestion before I went this way.)
I would love attention to be brought back to this feature implementation as time permits.
Is there any recommended way to implement a retry mechanism with httpx?
Would love to see this functionality in httpx. Started switching to the library and was disappointed by the lack of this feature or a suggested workaround, seems like a pretty common feature you would expect from a modern HTTP lib.
So... "retries" in the context of HTTP could describe several different types of use-case.
The valuable thing to do here would be to describe very specifically what behaviour you're trying to deal with.
Talking through an actual "here's my specific problem" use-case, will help move the conversation forward.
httpx does have connection retry functionality built-in, although it's not really highlighted in the documentation. That might or might not be what you're looking for.
Use case: as a developer I would like to be able to quickly and consistently implement retry/backoff strategies with my HTTP client, without having to re-write this each time... much like this strategy: https://honeyryderchuck.gitlab.io/httpx/wiki/Retries.html
here's a retry HTTP transport wrapper, inspired partially by urllib3.util.Retry:

import random
from datetime import datetime
from time import sleep
from typing import Iterable, Mapping, Optional, Union

import httpx
from dateutil.parser import isoparse


class RetryTransport(httpx.AsyncBaseTransport, httpx.BaseTransport):
    RETRYABLE_METHODS = frozenset(
        ["HEAD", "GET", "PUT", "DELETE", "OPTIONS", "TRACE"]
    )
    RETRYABLE_STATUS_CODES = frozenset([413, 429, 503, 504])
    MAX_BACKOFF_WAIT = 60

    def __init__(
        self,
        wrapped_transport: Union[httpx.BaseTransport, httpx.AsyncBaseTransport],
        max_attempts: int = 10,
        max_backoff_wait: float = MAX_BACKOFF_WAIT,
        backoff_factor: float = 0.1,
        jitter_ratio: float = 0.1,
        respect_retry_after_header: bool = True,
        retryable_methods: Optional[Iterable[str]] = None,
        retry_status_codes: Optional[Iterable[int]] = None,
    ) -> None:
        self.wrapped_transport = wrapped_transport
        if jitter_ratio < 0 or jitter_ratio > 0.5:
            raise ValueError(f"jitter ratio should be between 0 and 0.5, actual {jitter_ratio}")
        self.max_attempts = max_attempts
        self.backoff_factor = backoff_factor
        self.respect_retry_after_header = respect_retry_after_header
        self.retryable_methods = frozenset(retryable_methods) if retryable_methods else self.RETRYABLE_METHODS
        self.retry_status_codes = frozenset(retry_status_codes) if retry_status_codes else self.RETRYABLE_STATUS_CODES
        self.jitter_ratio = jitter_ratio
        self.max_backoff_wait = max_backoff_wait

    def _calculate_sleep(self, attempts_made: int, headers: Union[httpx.Headers, Mapping[str, str]]) -> float:
        retry_after_header = (headers.get("Retry-After") or "").strip()
        if self.respect_retry_after_header and retry_after_header:
            if retry_after_header.isdigit():
                return float(retry_after_header)
            try:
                parsed_date = isoparse(retry_after_header).astimezone()  # converts to local time
                diff = (parsed_date - datetime.now().astimezone()).total_seconds()
                if diff > 0:
                    return min(diff, self.max_backoff_wait)
            except ValueError:
                pass
        backoff = self.backoff_factor * (2 ** (attempts_made - 1))
        jitter = (backoff * self.jitter_ratio) * random.choice([1, -1])
        total_backoff = backoff + jitter
        return min(total_backoff, self.max_backoff_wait)

    def handle_request(self, request: httpx.Request) -> httpx.Response:
        response = self.wrapped_transport.handle_request(request)
        if request.method not in self.retryable_methods:
            return response
        remaining_attempts = self.max_attempts - 1
        attempts_made = 1
        while True:
            if remaining_attempts < 1 or response.status_code not in self.retry_status_codes:
                return response
            response.close()
            sleep_for = self._calculate_sleep(attempts_made, response.headers)
            sleep(sleep_for)
            response = self.wrapped_transport.handle_request(request)
            attempts_made += 1
            remaining_attempts -= 1

    async def handle_async_request(self, request: httpx.Request) -> httpx.Response:
        response = await self.wrapped_transport.handle_async_request(request)
        if request.method not in self.retryable_methods:
            return response
        remaining_attempts = self.max_attempts - 1
        attempts_made = 1
        while True:
            if remaining_attempts < 1 or response.status_code not in self.retry_status_codes:
                return response
            response.close()
            sleep_for = self._calculate_sleep(attempts_made, response.headers)
            sleep(sleep_for)
            response = await self.wrapped_transport.handle_async_request(request)
            attempts_made += 1
            remaining_attempts -= 1
Like @matt-mercer, I think this feature, even in a limited form, would be incredibly valuable.
This would be a great feature to have (at least optionally), thanks @matt-mercer!
I tried to implement it and came across the following two issues:

- In handle_async_request(), response.close() needs to be replaced with await response.aclose().
- If a Retry-After longer than max_backoff_wait is provided, what should happen? Just wait the given time, or return the 429 response? The API I am dealing with returns, for example, Retry-After: 245, which feels a bit too long to wait...

An improved version of what was proposed by @matt-mercer:
import asyncio
import random
import time
from datetime import datetime
from functools import partial
from http import HTTPStatus
from typing import Any, Callable, Coroutine, Iterable, Mapping, Optional, Union

import httpx
from dateutil.parser import isoparse


class RetryTransport(httpx.AsyncBaseTransport, httpx.BaseTransport):
    """
    A custom HTTP transport that automatically retries requests using an exponential backoff strategy
    for specific HTTP status codes and request methods.

    Args:
        wrapped_transport (Union[httpx.BaseTransport, httpx.AsyncBaseTransport]): The underlying HTTP transport
            to wrap and use for making requests.
        max_attempts (int, optional): The maximum number of times to retry a request before giving up. Defaults to 10.
        max_backoff_wait (float, optional): The maximum time to wait between retries in seconds. Defaults to 60.
        backoff_factor (float, optional): The factor by which the wait time increases with each retry attempt.
            Defaults to 0.1.
        jitter_ratio (float, optional): The amount of jitter to add to the backoff time. Jitter is a random
            value added to the backoff time to avoid a "thundering herd" effect. The value should be between
            0 and 0.5. Defaults to 0.1.
        respect_retry_after_header (bool, optional): Whether to respect the Retry-After header in HTTP responses
            when deciding how long to wait before retrying. Defaults to True.
        retryable_methods (Iterable[str], optional): The HTTP methods that can be retried. Defaults to
            ["HEAD", "GET", "PUT", "DELETE", "OPTIONS", "TRACE"].
        retry_status_codes (Iterable[int], optional): The HTTP status codes that can be retried. Defaults to
            [429, 502, 503, 504].

    Attributes:
        _wrapped_transport (Union[httpx.BaseTransport, httpx.AsyncBaseTransport]): The underlying HTTP transport
            being wrapped.
        _max_attempts (int): The maximum number of times to retry a request.
        _backoff_factor (float): The factor by which the wait time increases with each retry attempt.
        _respect_retry_after_header (bool): Whether to respect the Retry-After header in HTTP responses.
        _retryable_methods (frozenset): The HTTP methods that can be retried.
        _retry_status_codes (frozenset): The HTTP status codes that can be retried.
        _jitter_ratio (float): The amount of jitter to add to the backoff time.
        _max_backoff_wait (float): The maximum time to wait between retries in seconds.
    """

    RETRYABLE_METHODS = frozenset(["HEAD", "GET", "PUT", "DELETE", "OPTIONS", "TRACE"])
    RETRYABLE_STATUS_CODES = frozenset(
        [
            HTTPStatus.TOO_MANY_REQUESTS,
            HTTPStatus.BAD_GATEWAY,
            HTTPStatus.SERVICE_UNAVAILABLE,
            HTTPStatus.GATEWAY_TIMEOUT,
        ]
    )
    MAX_BACKOFF_WAIT = 60

    def __init__(
        self,
        wrapped_transport: Union[httpx.BaseTransport, httpx.AsyncBaseTransport],
        max_attempts: int = 10,
        max_backoff_wait: float = MAX_BACKOFF_WAIT,
        backoff_factor: float = 0.1,
        jitter_ratio: float = 0.1,
        respect_retry_after_header: bool = True,
        retryable_methods: Optional[Iterable[str]] = None,
        retry_status_codes: Optional[Iterable[int]] = None,
    ) -> None:
        self._wrapped_transport = wrapped_transport
        if jitter_ratio < 0 or jitter_ratio > 0.5:
            raise ValueError(
                f"Jitter ratio should be between 0 and 0.5, actual {jitter_ratio}"
            )
        self._max_attempts = max_attempts
        self._backoff_factor = backoff_factor
        self._respect_retry_after_header = respect_retry_after_header
        self._retryable_methods = (
            frozenset(retryable_methods)
            if retryable_methods
            else self.RETRYABLE_METHODS
        )
        self._retry_status_codes = (
            frozenset(retry_status_codes)
            if retry_status_codes
            else self.RETRYABLE_STATUS_CODES
        )
        self._jitter_ratio = jitter_ratio
        self._max_backoff_wait = max_backoff_wait

    def handle_request(self, request: httpx.Request) -> httpx.Response:
        """
        Sends an HTTP request, possibly with retries.

        Args:
            request (httpx.Request): The request to send.

        Returns:
            httpx.Response: The response received.
        """
        transport: httpx.BaseTransport = self._wrapped_transport
        if request.method in self._retryable_methods:
            send_method = partial(transport.handle_request)
            response = self._retry_operation(request, send_method)
        else:
            response = transport.handle_request(request)
        return response

    async def handle_async_request(self, request: httpx.Request) -> httpx.Response:
        """Sends an HTTP request, possibly with retries.

        Args:
            request: The request to perform.

        Returns:
            The response.
        """
        transport: httpx.AsyncBaseTransport = self._wrapped_transport
        if request.method in self._retryable_methods:
            send_method = partial(transport.handle_async_request)
            response = await self._retry_operation_async(request, send_method)
        else:
            response = await transport.handle_async_request(request)
        return response

    async def aclose(self) -> None:
        """
        Closes the underlying HTTP transport, terminating all outstanding connections and rejecting any further
        requests. This should be called before the object is dereferenced, to ensure that connections are
        properly cleaned up.
        """
        transport: httpx.AsyncBaseTransport = self._wrapped_transport
        await transport.aclose()

    def close(self) -> None:
        """
        Closes the underlying HTTP transport, terminating all outstanding connections and rejecting any further
        requests. This should be called before the object is dereferenced, to ensure that connections are
        properly cleaned up.
        """
        transport: httpx.BaseTransport = self._wrapped_transport
        transport.close()

    def _calculate_sleep(
        self, attempts_made: int, headers: Union[httpx.Headers, Mapping[str, str]]
    ) -> float:
        # Retry-After
        # The Retry-After response HTTP header indicates how long the user agent should wait before
        # making a follow-up request. There are three main cases this header is used:
        # - When sent with a 503 (Service Unavailable) response, this indicates how long the service
        #   is expected to be unavailable.
        # - When sent with a 429 (Too Many Requests) response, this indicates how long to wait before
        #   making a new request.
        # - When sent with a redirect response, such as 301 (Moved Permanently), this indicates the
        #   minimum time that the user agent is asked to wait before issuing the redirected request.
        retry_after_header = (headers.get("Retry-After") or "").strip()
        if self._respect_retry_after_header and retry_after_header:
            if retry_after_header.isdigit():
                return float(retry_after_header)
            try:
                parsed_date = isoparse(
                    retry_after_header
                ).astimezone()  # converts to local time
                diff = (parsed_date - datetime.now().astimezone()).total_seconds()
                if diff > 0:
                    return min(diff, self._max_backoff_wait)
            except ValueError:
                pass
        backoff = self._backoff_factor * (2 ** (attempts_made - 1))
        jitter = (backoff * self._jitter_ratio) * random.choice([1, -1])
        total_backoff = backoff + jitter
        return min(total_backoff, self._max_backoff_wait)

    async def _retry_operation_async(
        self,
        request: httpx.Request,
        send_method: Callable[..., Coroutine[Any, Any, httpx.Response]],
    ) -> httpx.Response:
        remaining_attempts = self._max_attempts
        attempts_made = 0
        response: Optional[httpx.Response] = None
        while True:
            if attempts_made > 0:
                # Pass the previous response's headers so Retry-After is honoured.
                headers = response.headers if response is not None else {}
                await asyncio.sleep(self._calculate_sleep(attempts_made, headers))
            response = await send_method(request)
            if (
                remaining_attempts < 1
                or response.status_code not in self._retry_status_codes
            ):
                return response
            await response.aclose()
            attempts_made += 1
            remaining_attempts -= 1

    def _retry_operation(
        self,
        request: httpx.Request,
        send_method: Callable[..., httpx.Response],
    ) -> httpx.Response:
        remaining_attempts = self._max_attempts
        attempts_made = 0
        response: Optional[httpx.Response] = None
        while True:
            if attempts_made > 0:
                # Pass the previous response's headers so Retry-After is honoured.
                headers = response.headers if response is not None else {}
                time.sleep(self._calculate_sleep(attempts_made, headers))
            response = send_method(request)
            if (
                remaining_attempts < 1
                or response.status_code not in self._retry_status_codes
            ):
                return response
            response.close()
            attempts_made += 1
            remaining_attempts -= 1
There is actually the AsyncHTTPTransport class, which has the 'retries' parameter. It's not documented explicitly, but you can see its usage at https://www.python-httpx.org/async/ under the subsection "Explicit transport instances".
AsyncHTTPTransport is implemented in the httpcore package in the connection class: https://github.com/encode/httpcore/blob/master/httpcore/_async/connection.py
I am using this to retry on failed connections, not on bad HTTP status. The only thing it's missing is the ability to set the retry delays explicitly; it has an exponential backoff time implemented. It might be nice to be able to control the delay time in the future.
That is for retrying failed connections only. It does not retry on bad HTTP status codes as far as I can tell.
Thanks for pointing that out!
Is there a plan to add the RetryTransport behaviour to the project? I can send a PR for it.
If someone is motivated to provide this functionality, then I'd suggest a third party package... https://www.python-httpx.org/third_party_packages/
urllib3 has a very convenient Retry utility (docs) that I have found to be quite useful when dealing with flaky APIs. http3's Clients don't support this sort of thing yet, but I would love it if they did!
In the meantime, I can probably work out my own with a while loop checking the response code.