Feature Proposal: Add Retry Policies

arkoc commented 6 years ago

Here is what I propose:

public interface IApiClient
{
    [Retry(typeof(ExponentialRetry))]
    Task UpdateAsync(int num);

    [Retry(typeof(LinearRetry))]
    Task AddAsync(int num);

    [Retry(typeof(NoRetry))] // <- can be skipped
    Task<int> GetAll();
}

What you mind about this ?

P.S. Thank you for amazing library. P.S. 2 If you are ok with this idea, I would like to contribute it.

canton7 commented 6 years ago

[Retry(typeof(LinearRetry(retryCount: 3)))]

That's a syntax error :wink:

Retries are hard, which is why I've avoided them so far. You're welcome to contribute something, but I'd like to flesh out the details first.

Points that spring to mind:

We're going to need good unit testing
It needs to be user-extensible
How do you determine what errors to retry on, and what errors to give up on?

arkoc commented 6 years ago

Good catch at syntax error. :) It was just to introduce the basic idea.

I think, there is no need to invent bicycle again. So guys at Microsoft have implemented retry policies in Azure Storage Client Library. We can adopt them here.

Here is blog post about that.

Yeah. I implemented retry policies for my project, and it was fun to unit tests.
All retry policies are going to implement IRetryPolicy which whould have function like ShouldRetry.
From above blog-post:

“Non-retryable” exceptions are all 400 ( >=400 and <500) class exceptions (Bad gateway, Not Found, etc.) as well as 501 and 505.
All other exceptions are “retryable”. This includes client side timeouts.

canton7 commented 6 years ago

Looks good.

Do you think we need a retry policy per method, or could we put one on the RestClient? That would mean the user could just instantiate and configure and instance of the right policy, rather than having to provide a type in an attribute.

Another other question is whether the user should be supplying a retry policy, or an object which actually performs the retries. The former case would have methods to control whether a retry should be attempted for a given error, and what the delay time would be. The latter would take a Func<Task>, or the info to send to HttpClient.SendAsync, and would return the result / throw an error.

arkoc commented 6 years ago

Initially I was thinking that user should decide in its own on what functions to set retry policy, but now I am thinking that it will be overhead. So, in my opinion:

Retry policies should be configured in RestClient once and they will work for all calls. (I can't think a really good argument, why user should decide for what function to retry and what functions not.)
I think sticking with Interface -> Implementation will be more clean way. So user can provide his own implementation of IRetryPolicy. Methods can be more faster way for users, but the point is, that in most cases users gonna just use Linear or Exponential policies. And if they decide to implement something special, more likely it will be complex thing and zipping all that logic in one method will be headache.

arkoc commented 6 years ago

One question more: Should we retry if AllowAnyStatusCode is set ?

In my opinion, we shouldn't, because by setting AllowAnyStatusCode to true, user indicates that any response status is valid status for him.

But from other point of view retrying is for transient errors, so AllowAnyStatusCode this has nothing to do with them.

canton7 commented 6 years ago

Retry policies should be configured in RestClient once and they will work for all calls. (I can't think a really good argument, why user should decide for what function to retry and what functions not.)

Agreed. Some of the other user-configurable serializers, etc, accept information on the method which was called, so if a user really cares they can implement their own retryer which looks at the method which was called.

I think sticking with Interface -> Implementation will be more clean way. So user can provide his own implementation of IRetryPolicy. Methods can be more faster way for users, but the point is, that in most cases users gonna just use Linear or Exponential policies. And if they decide to implement something special, more likely it will be complex thing and zipping all that logic in one method will be headache.

(Just as an aside, use an abstract base class rather than an interface -- that way you can to make changes in a backwards-compatible way).

I'm still a little unclear on whether you prefer the user specifying a retry policy (a class which describes when to retry and how long to wait before retrying again), or a class which actually does the retrying. In the former case we keep calling the user's class saying "OK, we got this response, do you want to retry?", in the latter case we call the user's class saying "OK here's the request. Let me know when you've finished retrying".

It's quite possible we could get the best of both worlds - the user-supplied class is just given a request and told "here, perform this request until it succeeds", but we provide a base implementation which has abstract methods to figure out when to retry and for how long, and actually does the retry looping itself.

Whichever we do, we should also make sure that we let the user decide whether to retry based on the content of the headers and the response, not just its status code.

Should we retry if AllowAnyStatusCode is set ?

Tricky, yeah. AllowAnyStatusCode is a bit ham-fisted, in that it doesn't distinguish between different sorts of failure. The intention is to allow things like treating a 404 as a valid response, but it goes beyond that. I'm tempted to say that we should still retry by default (if nothing else, it's one less rule to have to remember when using retries), but that this needs to be configurable on the retrier (e.g. by telling the retrier whether the AllowAnyStatusCode attribute was declared on the method responsible for the current request).

arkoc commented 6 years ago

I'm still a little unclear on whether you prefer the user specifying a retry policy (a class which describes when to retry and how long to wait before retrying again), or a class which actually does the retrying. In the former case we keep calling the user's class saying "OK, we got this response, do you want to retry?", in the latter case we call the user's class saying "OK here's the request. Let me know when you've finished retrying".

Ah, now I get the question. So, I prefer for users to specify retry policy rather than retry logic itself, because basically retry looping is same, sending request is same.... As far as I understand in retrying workflow we have only 2 configurations:

When to retry ?
How long to retry ?

This both questions are answerable by policy, if we had configuration like How to retry? then it would be reasonable to let integration point to retrier. Or we can have both configurable policies and configurable retrier. But in my opinion its just making users life harder. I think it should be maximally easy for user to setup retry, even custom one.

So, here what I imagine for now: bool ShouldRetry(RetryContext retryContext, out TimeSpan retryInterval);

RetryContext should contain, HttpResponseMessage, method for which is request called and so on.

canton7 commented 6 years ago

One other thing to bear in mind is that requests can time out, in which case HttpClient will throw a TaskCanceledException. We need to distinguish this from a user-cancelled request. This also means that we can't just give a `HttpResponseMessage to the user and say "does this need retrying?" - we also need to indicate that a request timed out.

That got me thinking, and I think there are actually two concepts here: whether a request was a success / failure / needs retrying, and the retry algorithm (linear, exponential, etc). The former is currently handled by a combination of HttpResponseMessage.EnsureSuccessStatusCode and AllowAnyStatusCode, but that's a bit coarse-grained (it doesn't distinguish between different classes of status code, and doesn't allow for "failures" caused by the body of a 200 response).

Maybe what we want is an object which decides whether a response is a success, a failure, or needs retrying. If this indicates that a response was a success, we return it to the user. If it indicates that it was a failure, we throw an ApiException. If it indicates that it needs retrying, we pass it to the retry policy, which then decides whether to retry it or fail it (if there's no retry policy set, we automatically fail it). We'll also have to ensure that a timeout can never be considered a success, since there's no sensible response we can return to the client. This object is told whether [AllowAnyStatusCode] was set, and uses this to inform its decision.

The retry policy, then, just implements the retry algorithm. It gets told how many retries have occurred so far, and returns whether to retry again, and how long to wait. Again timeouts add complexity: if we have a linear timeout of 1 minute, and a request times out after 30 seconds, do we wait 30 seconds or 1 minute?

So to summarise (I haven't thought up any good names for these concepts and objects):

enum RequestResult { Success, Fail, Retry }
abstract class FailurePolicy // Or something, I can't think of a good name
{
    // Gets information about the request which timed out (including whether AllowAnyStatusCode was set), decides whether to retry on timeout
    abstract bool RetryOnTimeout(SomeContext context);

    // Gets information about a request and a response
    abstract RequestResult ResultForResponse(SomeOtherContext context);
}

abstract class RetryPolicy
{
    // timeSinceLastRequest ties into the question on how long to wait on a timeout
    abstract TimeSpan? ShouldRetry(int retryCount, TimeSpan timeSinceLastRequest);
}

arkoc commented 6 years ago

I am a little unclear, why we need to introduce new entity like RequestResult. So If user want to retry if there is a custom header aet, he should define 2 separate classes, one for RetryPolicy and one for FailurePolicy. In my opinion it is getting harder than it should.

For user who want to determine retry the call or not, we give him all contextual information, like ResponseMessage, thrown exception (if any)***, already retried count and so on. After that we expect from him to decide should we retry again or no. (Notice: We should call ShouldRetry for any response from server, even if there was exception. )

*** This also handles situation with TaskCancelledException.

arkoc commented 6 years ago

HttpResponseMessage lastResponse = null;
Exception lastException = null;
var shouldRetry = false;
var retryCount = 0;
do
{
    try
    {
        lastResponse = await this.httpClient.SendAsync(message, completionOption, requestInfo.CancellationToken).ConfigureAwait(false);
    }
    catch (Exception exc)
    {
        lastException = exc;
    } finally
    {
        var retryContext = new RetryContext(requestInfo, retryCount++, lastResponse, lastException);
        TimeSpan delay = TimeSpan.Zero;
        shouldRetry = RetryPolicy?.ShouldRetry(retryContext, out delay) ?? false;
        if (shouldRetry)
        {
            await Task.Delay(delay, requestInfo.CancellationToken).ConfigureAwait(false);
        } else
        {
            if (lastException != null)
            {
                throw lastException;
            }

            if (!requestInfo.AllowAnyStatusCode)
            {
                throw await ApiException.CreateAsync(message, lastResponse).ConfigureAwait(false);
            }
        }
    }
}
while (shouldRetry);

Please have a look at this. It is summarized version of my previous comment.

canton7 commented 6 years ago

I don't think it does get any harder for the user - it might even get easier.

By default, we have a policy which decides whether to success / fail / retry. The rules implemented by this policy are: on timeout, always retry; on 4xx, 501, 505, succeed if AllowAnyStatusCode was set, otherwise fail; for other non-success status codes, succeed if AllowAnyStatusCode was set, otherwise retry; otherwise succeed.

By default, we don't have a retry policy. Therefore any 'retry' responses from the failure policy get turned into 'fail'.

Therefore by default we keep the current behaviour.

If the user wants to add retrying, they just assign a retry policy (linear or exponential). We start retrying on timeouts, 4xx, 501 and 505; we still fail on other non-success status codes.

If the user wants a strange retry policy (maybe with some randomness?), they implement their own retry policy. They don't need to worry about what sort of responses need retrying and what don't -- all they have to implement is the logic which says how long to wait before retrying the request. This is where we simplify things for the user.

Completely separate to this is the logic which determines whether a response was successful or not. This augments the AllowAnyStatusCode attribute. For example, if the user decides that 404 should be considered as successful in all cases they can do that with their own failure policy. If they want to define their own attribute which they can apply to methods indicating that 404 is considered successful for that method (i.e. much more fine-grained control than AllowAnyStatusCode), they can do that.

If the user wants to use a linear retry policy, but they want to say that timeouts shouldn't be retried, then we've also simplified things for them here: they can use the out-of-the-box linear retry policy, and just customise the failure policy. If they want to keep their custom don't-retry-on-timeout behaviour but switch to an exponential policy that's 1 line of code (changing the retry policy).

arkoc commented 6 years ago

I agreed. Your approach will give more flexibility and in same time makes custom implementations more easier.

canton7 commented 6 years ago

So really, RetryPolicy becomes more of a RetryAlgorithm. It really only needs to have a method which returns an IEnumerable<TimeSpan>.

drmcclelland commented 6 years ago

Polly (https://github.com/App-vNext/Polly) is a really great library with patterns like Retry & Circuit Breaker that would be a good combination to use with RestEase.

canton7 commented 6 years ago

That looks really awesome, can't believe I've never come across it! I'm strongly inclined to use something pre-existing over rolling our own, especially something like Polly that's so feature-rich.

The challenge comes with figuring out how we integrate with Polly (or some abstracting which lets people use Polly or some similarly library) in a way which:

Feels natural
Lets people do things like calling a method which returns a strongly-typed object, while letting Polly get the HttpResponseMessage (simply handling ApiException may be enough for this, but it means deviating from the examples in the Polly docs)
Doesn't add a dependency on Polly for everyone, just people who want that functionality

canton7 commented 6 years ago

I've spent a bit of time reading the Polly docs.

Polly wants to work by taking over responsibility for executing your request. So the easiest way of "adding" retry support is to tell users "configure a Polly policy, then use it to execute your interface methods". However, that has the following issues:

Users have to wrap every call to an interface method. Fine, but a bit ugly.
People reading the Polly docs will expect to handle a HttpResponseMessage, HttpStatusCode, HttpRequestException, etc, but we will give them an ApiException or a strongly-typed result.

So perhaps a better system is to introduce a layer between the HttpResponseMessage and the deserialization which we let users hook into, which works in the world of HttpResponseMessages and lets users hook into their policy's Execute method. That means we'll have one policy for the whole API, which is good because the user doesn't need to repeat the policy for every request, but had because the user can't have per-request policies (which are needed for things like fallbacks).

We could let users pass a policy into an interface method, but I don't want to explicitly support Polly's types. There's also debatable difference between using a policy to execute an interface method, and passing a policy into an interface method to be executed.

Talking about fallbacks, we would want them to work with the deserialized object types, so we want to be working in concrete types and not HttpResponseMessages at this point.

At the end of the day, RestEase has always been about making common operations easier, while complex operations stay about the same (returning a HttpResponseMessage, doing the deserialization manually, etc). So we need to figure out what the common Polly use-cases are, then focus on making those easier.

canton7 commented 6 years ago

Also need to think about allowing Policy to cancel the request - should be easy.

Also PolicyResult -- I don't think there's a neat way we can handle this? Unless we somehow allow a Task<PolicyResult>?

Note that fallbacks only make sense for per-method policies - every method on the interface isn't going to return the same thing. We could potentially take a fallback, or delegate which returns a fallback, as a method parameter, and pass it though into the policy in some semi-sensible way? However we'd have to run a runtime check that the type of fallback was compatible with the type of result...

We also can't forget about [AllowAnyStatusCode]

arkoc commented 6 years ago

Another way of integrating Polly is quite easy with custom DelegatingHandler.

/// <summary>
///  A <see cref="DelegatingHandler"/> implementation that executes request processing surrounded by a <see cref="Policy"/>.
/// </summary>
public class PolicyHttpMessageHandler : DelegatingHandler
{
    private readonly Policy<HttpResponseMessage> _policy;

    /// <summary>
    /// Initializes a new instance of <see cref="PolicyHttpMessageHandler"/>.
    /// </summary>
    /// <param name="policy">Desired policy for a handler.</param>
    public PolicyHttpMessageHandler(Policy<HttpResponseMessage> policy)
    {
        _policy = policy ?? throw new ArgumentNullException(nameof(policy));
    }

    /// <inheritdoc />
    protected override Task<HttpResponseMessage> SendAsync(HttpRequestMessage request, CancellationToken cancellationToken)
    {
        return _policy.ExecuteAsync(() => base.SendAsync(request, cancellationToken));
    }
}

Then add custom HttpClinet to Restier as described here https://github.com/canton7/RestEase#custom-httpclient .

Here is more comprehensive DelegatingHandler : https://github.com/aspnet/HttpClientFactory/blob/0677ac9f1e3efeb7557291ed42c726c60bde9c93/src/Microsoft.Extensions.Http.Polly/PolicyHttpMessageHandler.cs

cd21h commented 6 years ago

+1 for using Polly with HttpClientFactory.

clydevassallo commented 4 years ago

Is it currently possible to register RestEase clients in the ServiceCollection using the HttpClientFactory and IHttpClientBuilder?

canton7 commented 4 years ago

It's not something I've played with, but I've heard of someone doing that quite easily. I've got plans to add a package to explicitly support that at some point: I imagine it will look a bit like Refit's version

clydevassallo commented 4 years ago

That would be great. To my understanding, that would allow us to easily register RestEase clients with retries policies through Microsoft's PollyHttpClientBuilderExtensions and custom Delegating Handlers through Microsoft's HttpClientBuilderExtensions.

At the end the registration would look something like:

services
    .AddRestEaseClient<IHelloWorldClient>()
    .AddPolicyHandler(retryPolicy)
    .AddHttpMessageHandler<MyCustomDelegatingHandler>()

canton7 commented 4 years ago

Since you have a direct need right now, would you be able to put together an extension method which works for you, which I can use as a starting point? It's been a while since I used asp.net.

clydevassallo commented 4 years ago

I could definitely try. Let me give it a shot and open a PR.

canton7 commented 4 years ago

Or even just post it in here. I'm working on a large-scale refactoring to support source generators at the moment, and I won't be able to take any other changes until that's complete.

clydevassallo commented 4 years ago

Will do. Whatever works best for you :)

clydevassallo commented 4 years ago

public static class ServiceCollectionExtensions
{
    private static readonly MethodInfo addRestEaseClientGenericMethodInfo = typeof(ServiceCollectionExtensions).GetTypeInfo().GetMethods().First(x => x.Name == "AddRestEaseClient" && x.IsStatic && x.GetParameters().Length == 1 && x.IsGenericMethod);

    public static IHttpClientBuilder AddRestEaseClient<T>(this IServiceCollection services) where T : class
    {
        return services
            .AddHttpClient(typeof(T).ToString())
            .AddTypedClient((client, serviceProvider) => RestClient.For<T>(client));
    }

    public static IHttpClientBuilder AddRestEaseClient(this IServiceCollection services, Type type)
    {
        if (type == null)
            throw new ArgumentNullException(nameof(type));

        var method = addRestEaseClientGenericMethodInfo.MakeGenericMethod(type);
        return (IHttpClientBuilder) method.Invoke(null, new object[]{ services });
    }
}

I created a simple initial version and it is working. I wrote some unit tests for basic cases and so far so good.

As you can see, this implementation hides access to the configuration of your RestClient. The next step would be to expose this configuration by adding a parameter to these extension methods with a type called RestEaseSettings.

This new type would contain the following properties:

JsonSerializerSettings
IResponseDeserializer
IRequestBodySerializer
RequestPathParamSerializer
IRequestQueryParamSerializer
QueryStringBuilder
IFormatProvider If set by the caller, these properties will override the defaults in the RestClient.

I could continue developing this if you're interested in adding the functionality.

canton7 commented 4 years ago

Thanks! I'll look at this when I've got some bandwidth.

canton7 commented 4 years ago

@clydevassallo Do you think this looks sensible? https://github.com/canton7/RestEase/compare/develop...feature/httpclientfactory

clydevassallo commented 4 years ago

@canton7 Looks good :) I like the idea of having the configuration through an Action.

canton7 commented 3 years ago

Closing with the next release: Polly is the preferred approach, and I've added docs to the README explaining how to use it with RestEase.

canton7 / RestEase

Feature Proposal: Add Retry Policies #89