Open DamianEdwards opened 9 months ago
cc: @martintmk @martincostello @geeknoid
The easy enhancement is to improve the HttpClientResiliencePredicates to also detect gRPC calls and handle retriable status codes:
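A minimal sketch of what such a predicate could look like, assuming a gRPC response can be recognized by its `application/grpc` content type and that the failure is visible as a `grpc-status` response header. The class `GrpcTransientFailure`, the chosen set of retriable codes, and the simplified HTTP fallback rule are all illustrative assumptions, not the actual HttpClientResiliencePredicates implementation.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Net;
using System.Net.Http;

// Hypothetical helper for illustration; not part of Microsoft.Extensions.Http.Resilience.
public static class GrpcTransientFailure
{
    // Assumption: which gRPC status codes are retriable is a policy choice.
    // 4 = DEADLINE_EXCEEDED, 8 = RESOURCE_EXHAUSTED, 14 = UNAVAILABLE.
    private static readonly HashSet<int> RetriableStatuses = new() { 4, 8, 14 };

    // A response is treated as gRPC when its content type starts with "application/grpc".
    public static bool IsGrpcResponse(HttpResponseMessage response)
    {
        string? mediaType = response.Content?.Headers.ContentType?.MediaType;
        return mediaType is not null
            && mediaType.StartsWith("application/grpc", StringComparison.OrdinalIgnoreCase);
    }

    public static bool IsTransient(HttpResponseMessage response)
    {
        if (!IsGrpcResponse(response))
        {
            // Simplified HTTP-level rule (5xx or 408); an assumption, not the library's exact rule.
            return (int)response.StatusCode >= 500
                || response.StatusCode == HttpStatusCode.RequestTimeout;
        }

        // For gRPC the HTTP status is 200, so the decision rests on the grpc-status header.
        return response.Headers.TryGetValues("grpc-status", out var values)
            && int.TryParse(values.FirstOrDefault(), out int status)
            && RetriableStatuses.Contains(status);
    }
}
```

A check like this only sees failures whose `grpc-status` is present in the response headers; a status that arrives later as a trailer is not covered.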
This should make both the retry and circuit breaker strategies work for gRPC. The other issue is the handling of streamed calls, which I am not sure how to address.
gRPC always returns a 200 status code. Failure is communicated in the grpc-status trailer.
I haven't looked at how resilience works, but I'm guessing the retry happens inside an HTTP handler's SendAsync. gRPC supports streaming, so an error can occur long after the response status is returned and SendAsync has run.
I think a known limitation will be that streaming gRPC calls won't be retried. However, failing unary calls should be detectable. Look for a 200 status code and also check the response headers for grpc-status. They will both be available in SendAsync.
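The check described above could sit in a delegating handler along these lines. This is a sketch under stated assumptions: `UnaryGrpcRetryHandler` is a hypothetical name, only UNAVAILABLE (14) is treated as retriable, there is no backoff, and resending the same HttpRequestMessage assumes its content can be replayed.

```csharp
using System.Linq;
using System.Net;
using System.Net.Http;
using System.Threading;
using System.Threading.Tasks;

// Hypothetical handler for illustration; not part of any library.
public sealed class UnaryGrpcRetryHandler : DelegatingHandler
{
    private readonly int _maxAttempts;

    public UnaryGrpcRetryHandler(HttpMessageHandler innerHandler, int maxAttempts = 3)
        : base(innerHandler) => _maxAttempts = maxAttempts;

    // True when the call failed before any content was sent:
    // HTTP 200 with grpc-status already present in the response headers.
    // Assumption: only UNAVAILABLE (14) is worth retrying.
    public static bool IsRetriableUnaryFailure(HttpResponseMessage response) =>
        response.StatusCode == HttpStatusCode.OK
        && response.Headers.TryGetValues("grpc-status", out var values)
        && values.FirstOrDefault() == "14";

    protected override async Task<HttpResponseMessage> SendAsync(
        HttpRequestMessage request, CancellationToken cancellationToken)
    {
        for (int attempt = 1; ; attempt++)
        {
            HttpResponseMessage response =
                await base.SendAsync(request, cancellationToken).ConfigureAwait(false);

            // Return the last response once attempts run out, or any response
            // that is not a retriable failure visible in the headers.
            if (attempt >= _maxAttempts || !IsRetriableUnaryFailure(response))
            {
                return response;
            }

            response.Dispose(); // discard the failed attempt before retrying
        }
    }
}
```

Errors that surface later in a streamed call never reach this check, which matches the known limitation that streaming calls won't be retried.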
> Failure is communicated in grpc-status trailer.
The trailer is available only after the response body has been fully read, is that correct? I am wondering how we can ensure that the trailer is available for gRPC calls. Otherwise, the retries won't work.
Will buffering the content work?
> Will buffering the content work?
No.
If an error happens before any content is returned by the server, then grpc-status is in the headers. That is the scenario that will work. It's confusingly named Trailers-Only in the spec - https://github.com/grpc/grpc/blob/master/doc/PROTOCOL-HTTP2.md#responses
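To make the distinction concrete, here is a small sketch that reports where grpc-status was found on a response. It assumes .NET Core 3.0 or later, where `HttpResponseMessage.TrailingHeaders` exists; `GrpcStatusLocation` and `GrpcStatusLocator` are hypothetical names introduced for illustration.

```csharp
using System.Net.Http;

// Hypothetical types for illustration only.
public enum GrpcStatusLocation
{
    None,      // no grpc-status seen (yet)
    Headers,   // Trailers-Only response: visible as soon as SendAsync returns
    Trailers   // normal response: visible only after the body has been read
}

public static class GrpcStatusLocator
{
    public static GrpcStatusLocation Locate(HttpResponseMessage response)
    {
        if (response.Headers.Contains("grpc-status"))
        {
            return GrpcStatusLocation.Headers;
        }

        // TrailingHeaders is expected to stay empty until the response
        // content has been read to the end.
        if (response.TrailingHeaders.Contains("grpc-status"))
        {
            return GrpcStatusLocation.Trailers;
        }

        return GrpcStatusLocation.None;
    }
}
```

Only the `Headers` case can be acted on inside SendAsync; the `Trailers` case is out of reach for a handler-level retry.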
ITNOA
Any plan to implement specific extensions to support gRPC in Microsoft.Extensions.Resilience?
thanks
> Any plan to implement specific extensions for support gRPC in Microsoft.Extensions.Resilience?
That is what this issue is tracking. No committed timelines yet, so for now we just want to continue this discussion.
The HTTP resiliency features, including those added by the IHttpClientBuilder.AddStandardResilienceHandler method, don't apply to gRPC calls despite them going through configured HttpClient instances. This is due to the gRPC stack not exposing error details at the HTTP request level in the way that the resiliency features expect (e.g. using HTTP status codes).
The following code example, typical of setting up a gRPC client in a .NET server application, will not actually result in the standard resiliency features being applied to gRPC calls:
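A sketch of the kind of registration being described, assuming the Grpc.Net.ClientFactory and Microsoft.Extensions.Http.Resilience packages are referenced. `Greeter.GreeterClient` stands in for any client generated from a .proto file, and the address is a placeholder; both are hypothetical.

```csharp
using System;
using Microsoft.Extensions.DependencyInjection;

var services = new ServiceCollection();

// Greeter.GreeterClient is a hypothetical generated gRPC client type.
services
    .AddGrpcClient<Greeter.GreeterClient>(options =>
    {
        // Placeholder address, for illustration only.
        options.Address = new Uri("https://localhost:5001");
    })
    // The handler is added to the client's HTTP pipeline, but a failed gRPC call
    // still comes back as HTTP 200, so the handler does not see it as a failure.
    .AddStandardResilienceHandler();
```

The registration compiles and the handler is part of the pipeline; the gap is only in what the handler can recognize as a failure.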
Consider adding support for the standard resiliency patterns to the .NET gRPC client stack in a similar fashion to those added to the HttpClient stack so that resiliency features like Circuit Breaker can be easily added by default.
/Cc @JamesNK @davidfowl