App-vNext / Polly

Polly is a .NET resilience and transient-fault-handling library that allows developers to express policies such as Retry, Circuit Breaker, Timeout, Bulkhead Isolation, and Fallback in a fluent and thread-safe manner. From version 6.0.1, Polly targets .NET Standard 1.1 and 2.0+.
https://www.thepollyproject.org
BSD 3-Clause "New" or "Revised" License
13.32k stars 1.22k forks source link

Supplying an async `onRetry` delegate compiles but compiler assignment to async void lambda causes delegate completion not to be awaited #107

Closed PhilipAnthonyMurray closed 8 years ago

PhilipAnthonyMurray commented 8 years ago

I have an issue where I need to call an async method from the Retry block that changes authorisation criteria within an object before executing the query again.

          query = bookingQuery;

            try
            {
                bookings = await Policy
                    .Handle<ApiException>()
                    .RetryAsync(2, async (ex, retry) =>
                        {
                            await Task.Run(async () => await RefreshApiAuthentication(retry));
                            System.Diagnostics.Debugger.Break();
                        })
                    .ExecuteAsync(async () => await GetBookingsApiCall());

            }
            catch (Exception ex)
            {
                System.Diagnostics.Debugger.Break();
            }
        private async Task RefreshApiAuthentication(int retryCount)
        {
            switch (retryCount)
            {
                case 1:
                    query.Authorisation = await GetRefreshAuthorisation(query.Authorisation, query.ClubId);
                    System.Diagnostics.Debugger.Break();
                    break;
                case 2: 
                    query.Authorisation = await GetRefreshAuthorisation(query.Authorisation, query.ClubId);
                    System.Diagnostics.Debugger.Break();
                    break;
                default:
                    break;
            }
        }

This problem I have is that in the RefreshApiAuthentication method once the async method is called the ExecuteAsync method is called by Polly. Is there a method of calling async methods from the Policy RetryAsync method that would await before executing the retry?

Great library by the way.

reisenberger commented 8 years ago

Hi @PhilipAnthonyMurray Thanks for your interest in Polly!

What I think happening here is that the Polly retry policies declare the onRetry delegate as synchronous: onRetry is of type Action<...> rather than Func<..., Task>. It looks like you (we :smile:) are then falling foul of some 'helpful' (in this case unhelpful?) compiler magic which allows you to assign the async lambda to Action<...>.

As the actual signature of the Polly onRetry delegate is Action<...>, Polly cannot await it, and calls it as onRetry(); rather than await onRetry();. The delegate is probably therefore ~completing as soon as it has kicked off the background Task you .Run()~ (CORRECTION: returning as soon as it hits the first await) - without waiting for it to complete, as you observe - leaving Polly continuing to the next try of .ExecuteAsync(...) straight after. ... And all due to that compiler magic assignment of an async lambda to Action<...>! (which effectively seems to silently drop all the async/await behaviour without warning you)

--- Background ---

This kind of gotcha with async void lambdas is described by the noted async expert Stephen Cleary in this article

Async void methods are tricky because you can assign a lambda like async () => { await Task.Yield(); } to a variable of type Action, even though the natural type of that lambda is Func.

and this one

One subtle trap is passing an async lambda to a method taking an Action parameter; in this case, the async lambda returns void and inherits all the problems of async void methods. As a general rule, async lambdas should only be used if they’re converted to a delegate type that returns Task (for example, Func).

and from other angles, by Stephen Toub of the relevant Microsoft team here

(one of the subtlest areas of async imo!)

EDIT: Another good discussion of this (number 4) and similar async gotchas: http://tomasp.net/blog/csharp-async-gotchas.aspx/

EDIT: And Stephen Toub nails it again as number 4 here: http://blogs.msdn.com/b/pfxteam/archive/2013/01/28/psychic-debugging-of-async-methods.aspx

--- Background ends ---

What to do?

It looks like an opportunity was missed to allow Polly policies to take fully async onRetry delegates when async was first introduced to Polly (prior App-vNext stewardship; and we have evidently not picked up and dealt with since).

@PhilipAnthonyMurray If I work on a fix for this over the coming days, would you be happy to trial it (see if it resolves your issue?), alongside obviously the full unit tests I put in place.

I do (did?) also have a concern over whether we will be able to introduce this (two different forms of the onRetry delegate) without the compiler complaining about ambiguous method resolution, or without taking a breaking change for existing async users - which would clearly lead to a difficult decision. However, the end of the Stephen Cleary article suggests we should be fine here:

As a closing note, the C# compiler has been updated in VS2012 to correctly perform overload resolution in the presence of async lambdas.

Clearly also, it would be possible for you to write a blocking onRetry delegate instead, but this is not ideal: let's see if we can do better in Polly!

Thanks

PhilipAnthonyMurray commented 8 years ago

@reisenberger Thank you for the thorough explanation which makes sense based on the behaviour I see. I took some time this afternoon checking the source code and came to a similar conclusion.

I am very happy to test any changes in this, the code is not due for production for some time yet so presents little actual risk. I am out of the country on business for most of next week but will happily test from the hotel. If there is anything else I can help with please let me know.

reisenberger commented 8 years ago

@PhilipAnthonyMurray Great, thanks for your interest and support.

@joelhulen @PhilipAnthonyMurray A couple of potential issues with this which we should work through:

1: Visual Studio 2010 apparently will not handle the ambiguous method resolution of an async lambda between Action<...> and Func<..., Task>. Since @joelhulen @lumirris we are currently exploring adding async support for .NET4.0 via Microsoft.Bcl.Async under #103 - and .NET4.0 is I believe available on Visual Studio 2010 - there is a potential issue for the small group of users who may be on both VS2010 and .NET4.0 who'd use the new package from #103.

For this group of users, ambiguous async lambdas would I believe fail to compile. There are viable workrounds however, as discussed here. One can explicitly construct new Action(async () => { ...}) or new Func<Task>(async () => { ...}). And we could name the parameters onRetry in one overload, and onRetryAsync in another, allowing users to select an overload using named parameter syntax.

@joelhulen Since these workrounds are necessary only for 2010 technology (assuming a small/diminshing group of users), I don't believe we should let this block moving onRetry to async for the wider user constituency - but views welcome.

2: circuit-breaker state-change delegates: The case for/against allowing async/await for the onBreak / onReset / onHalfOpen delegates of the circuit-breaker is more complex.

The circuit-breaker state-change delegates intentionally run within the breaker's state management locks for the reason set out foot of here: that without this, in a high-throughput, multi-threaded scenario, the state implied by raising the state change delegate could fail to hold (could be superseded by events elsewhere) during the lifetime of the delegate execution.

Now, it's well documented that mixing await and lock was originally debarred, as explained by luminaries Eric Lippert and Jon Skeet here . On the other hand, there are now workrounds for this: use SemaphoreSlim.WaitAsync() with a SemaphoreSlim of capacity 1, as described here; or the techniques here from Scott Hanselman or Stephen Toub again.

At this point, the jury is still out for me on the wisdom of permitting async/await for the breaker's state-change delegates - further reflection needed. In general, a blocking / long-running circuit-breaker state-change delegate, since it necessarily runs within the breaker's locks, is a bad idea [will hit performance], since it would block all other operations through the circuit-breaker while the await was awaiting. But it may be technically possible, caveat emptor.


Apologies for the long post, but some complex issues here which need a clear airing.

reisenberger commented 8 years ago

Complete final aside: I wondered why the compiler should do such a thing: silently allow an async lambda to be assigned to an Action when Func<Task> is the correct type - and couldn't immediately find any online explanation. Hunching that it may have been to do with wanting to introduce interoperability between new async methods in the newer parts of the BCL, and some older parts of .NET (perhaps delegates on events fixed as of type Actions). If anyone knows an explanation, shout!

EDIT: It is to support async void event handlers eg for UI elements.

Kind-a shame the compiler drops the await functionality silently. Maybe there is a compiler warning that could be elevated to a build fail. @PhilipAnthonyMurray Did you get any compiler warning on this?

PhilipAnthonyMurray commented 8 years ago

@reisenberger If I do something as below no compiler warnings are given.

            bookings = await Policy
                .Handle<Refit.ApiException>()
                .RetryAsync(2, async (ex, retry) =>
                    {
                        await RefreshApiAuthentication(retry);
                    })
                .ExecuteAsync(async () => await GetBookingsApiCall());
reisenberger commented 8 years ago

@PhilipAnthonyMurray I was able to reproduce the issue in a small unit test. And the expected fix indeed makes the test pass.

I just pushed this fix for testing to https://github.com/reisenberger/Polly/tree/AsyncOnRetryDelegate . It would be great (if you have availability) if you were able to pull that branch down to your local machine, compile, and test the fix also against your RefreshApiAuthentication() / GetBookingsApiCall() scenario. Thanks!

The fix has been applied at this stage only to the RetryAsync(...) flavour of retry. Before we are ready to nuget package this, some repetitive work needed to apply the fix to other retry flavours (on the assumption it tests good in the wild :smile: ).

reisenberger commented 8 years ago

@PhilipAnthonyMurray Thanks for the feedback re compiler warning! On this, I've concluded (sadly) that the compiler gives no warning for assigning Func<Task> silently to Action. My unit test exemplifying the issue gives no compiler warnings. And your reduced sample gives no compiler warning, but still contains the async (ex, retry) => delegate, which must (with Polly 4.2) be undergoing such assignment.

If ...

.RetryAsync(2, async (ex, retry) =>
                        {
                            await Task.Run(async () => await RefreshApiAuthentication(retry));
                            System.Diagnostics.Debugger.Break();
                        })

... gave a compiler warning, I'd be interested in what it was (but not critical).

PhilipAnthonyMurray commented 8 years ago

@reisenberger Nope the anonymous async lambda wrapped in a task does not give a warning in either VS2015 or Xamarin Studio.

PhilipAnthonyMurray commented 8 years ago

@reisenberger I will try to test your update tomorrow. Thanks for the quick turn around./

PhilipAnthonyMurray commented 8 years ago

@reisenberger Sorry for the delay in getting back to you but my trip over to the states has wiped out my time. I will get back to you as soon as possible.

reisenberger commented 8 years ago

@PhilipAnthonyMurray No problem! (no urgency). And: thanks for your collaboration!

reisenberger commented 8 years ago

Similar fix for all other async retry variants now pushed to https://github.com/reisenberger/Polly/tree/AsyncOnRetryDelegate

reisenberger commented 8 years ago

Hi @PhilipAnthonyMurray . Many thanks once again for raising this issue! We've a fix ready to release: let us know if you envisage having time to / are interested to test this against your original scenario in the next few days. No worries if not: the unit tests clearly demonstrated, and demonstrate fixed, the issue. Thanks!

reisenberger commented 8 years ago

A final note to record more accurately what the compiler is doing and why. The nub is that an async method that has no return value can take either of the signatures:

public async void AsyncFooNoReturnValue1(...) { ... } // [1] : maps to Action<...>
public async Task AsyncFooNoReturnValue2(...) { ... } // [2] : maps to Func<..., Task>

The compiler can map an async lambda anonymous method async (...) => { /* no return value */ } to either of these (the async lambda itself does not and cannot indicate which is preferred).

Therefore if, as library creators, we only supply Action<...> parameter types, the compiler, with no other option, will map an async delegate to form [1]. (We may, as library creators, have meant that we only wish to accept sync delegates. However, as Action is essentially ambiguous - it could accept either a sync or async delegate - we do not unfortunately have the option to specify explicitly which we accept.)

If, as library creators, we supply overloads taking both forms (Action and Func<Task>), compilers from VS2012 onwards will prefer to resolve async lambdas to form [2]. (VS2010 will raise an ambiguous method invocation compile error.)

(Libraries generally could also navigate this dilemma by only offering form [2]; for Polly however this would represent a breaking change, as [1] had already been in the wild for some time.) EDIT: And many cases may only need a synchronous onRetry delegate anyway.

The problematic form [1] owes its existence (as I rightly surmised earlier) to a language requirement to support fire-and-forget void-returning event handlers wanting to use async functionality.

PhilipAnthonyMurray commented 8 years ago

@reisenberger I'm definitely still interested in this and hope to progress this issue this week.

reisenberger commented 8 years ago

@PhilipAnthonyMurray Great, thanks. Very many thanks for your collaboration!

reisenberger commented 8 years ago

Hey @PhilipAnthonyMurray Thanks for your interest in Polly and for alerting this issue originally!

We pushed this fix out to keep Polly moving, and because the unit tests gave us clear visibility of the issue and fix. However, very happy still to hear your real world feedback, and here still for any further assistance you may need!

Thanks

PhilipAnthonyMurray commented 8 years ago

@reisenberger Sorry for not being able to respond on this, things have been very busy. I plan on getting this tested early next week and will respond when complete.

reisenberger commented 8 years ago

@PhilipAnthonyMurray Just reporting that new onRetryAsync delegates test good for us in the wild (no surprise but :smile:). Just tested by adapting the most complex example in Polly-Samples.

Nevertheless happy for any feedback when you get this in place in your solution.

PhilipAnthonyMurray commented 8 years ago

@reisenberger I have tested the fix and confirmed that it works. Really sorry for the delay in getting this tested and confirmed as working.

reisenberger commented 8 years ago

@PhilipAnthonyMurray Great. We were confident now this was watertight from our own tests, but great to have confirmation it's fixed your issue too!