googleapis / google-api-dotnet-client

Google APIs Client Library for .NET
https://developers.google.com/api-client-library/dotnet
Apache License 2.0

HttpRequestException: No route to host (oauth2.googleapis.com:443) #2174

Closed by lukepuplett 2 years ago

lukepuplett commented 2 years ago

Environment details

Steps to reproduce

  1. Use blobs in your app
  2. Use your app for a month :)
  3. Get (un)lucky and observe this exception:
fail: Microsoft.AspNetCore.Diagnostics.DeveloperExceptionPageMiddleware[1]
      An unhandled exception has occurred while executing the request.
      System.Net.Http.HttpRequestException: No route to host (oauth2.googleapis.com:443)
       ---> System.Net.Sockets.SocketException (65): No route to host
         at System.Net.Sockets.Socket.AwaitableSocketAsyncEventArgs.ThrowException(SocketError error, CancellationToken cancellationToken)
         at System.Net.Sockets.Socket.AwaitableSocketAsyncEventArgs.System.Threading.Tasks.Sources.IValueTaskSource.GetResult(Int16 token)
         at System.Net.Sockets.Socket.<ConnectAsync>g__WaitForConnectWithCancellation|277_0(AwaitableSocketAsyncEventArgs saea, ValueTask connectTask, CancellationToken cancellationToken)
         at System.Net.Http.HttpConnectionPool.ConnectToTcpHostAsync(String host, Int32 port, HttpRequestMessage initialRequest, Boolean async, CancellationToken cancellationToken)
         --- End of inner exception stack trace ---
         at Google.Apis.Http.ConfigurableMessageHandler.SendAsync(HttpRequestMessage request, CancellationToken cancellationToken)
         at System.Net.Http.HttpClient.<SendAsync>g__Core|83_0(HttpRequestMessage request, HttpCompletionOption completionOption, CancellationTokenSource cts, Boolean disposeCts, CancellationTokenSource pendingRequestsCts, CancellationToken originalCancellationToken)
         at Google.Apis.Auth.OAuth2.Requests.TokenRequestExtenstions.ExecuteAsync(TokenRequest request, HttpClient httpClient, String tokenServerUrl, CancellationToken taskCancellationToken, IClock clock, ILogger logger)
         at Google.Apis.Auth.OAuth2.ServiceAccountCredential.RequestAccessTokenAsync(CancellationToken taskCancellationToken)
         at Google.Apis.Auth.OAuth2.TokenRefreshManager.RefreshTokenAsync()
         at Google.Apis.Auth.OAuth2.TokenRefreshManager.ResultWithUnwrappedExceptions[T](Task`1 task)
         at Google.Apis.Auth.OAuth2.TokenRefreshManager.<>c.<GetAccessTokenForRequestAsync>b__10_0(Task`1 task)
         at System.Threading.Tasks.ContinuationResultTaskFromResultTask`2.InnerInvoke()
         at System.Threading.Tasks.Task.<>c.<.cctor>b__272_0(Object obj)
         at System.Threading.ExecutionContext.RunFromThreadPoolDispatchLoop(Thread threadPoolThread, ExecutionContext executionContext, ContextCallback callback, Object state)
      --- End of stack trace from previous location ---
         at System.Threading.ExecutionContext.RunFromThreadPoolDispatchLoop(Thread threadPoolThread, ExecutionContext executionContext, ContextCallback callback, Object state)
         at System.Threading.Tasks.Task.ExecuteWithThreadLocal(Task& currentTaskSlot, Thread threadPoolThread)
      --- End of stack trace from previous location ---
         at Google.Apis.Auth.OAuth2.TokenRefreshManager.GetAccessTokenForRequestAsync(CancellationToken cancellationToken)
         at Google.Apis.Auth.OAuth2.ServiceAccountCredential.GetAccessTokenForRequestAsync(String authUri, CancellationToken cancellationToken)
         at Google.Apis.Auth.OAuth2.ServiceCredential.GetAccessTokenWithHeadersForRequestAsync(String authUri, CancellationToken cancellationToken)
         at Google.Apis.Auth.OAuth2.ServiceCredential.InterceptAsync(HttpRequestMessage request, CancellationToken cancellationToken)
         at Google.Apis.Http.ConfigurableMessageHandler.CredentialInterceptAsync(HttpRequestMessage request, CancellationToken cancellationToken)
         at Google.Apis.Http.ConfigurableMessageHandler.SendAsync(HttpRequestMessage request, CancellationToken cancellationToken)
         at System.Net.Http.HttpClient.<SendAsync>g__Core|83_0(HttpRequestMessage request, HttpCompletionOption completionOption, CancellationTokenSource cts, Boolean disposeCts, CancellationTokenSource pendingRequestsCts, CancellationToken originalCancellationToken)
         at Google.Apis.Requests.ClientServiceRequest`1.ExecuteUnparsedAsync(CancellationToken cancellationToken)
         at Google.Apis.Requests.ClientServiceRequest`1.ExecuteAsync(CancellationToken cancellationToken)
         at Google.Api.Gax.Rest.ResponseAsyncEnumerable`3.ResponseAsyncEnumerator.MoveNextAsync()
         at Google.Api.Gax.Rest.ResourceEnumerator`3.MoveNextAsync()
         at Customer's stack in their app redacted.
         at Microsoft.AspNetCore.Mvc.Infrastructure.ActionMethodExecutor.TaskOfIActionResultExecutor.Execute(IActionResultTypeMapper mapper, ObjectMethodExecutor executor, Object controller, Object[] arguments)
         at Microsoft.AspNetCore.Mvc.Infrastructure.ControllerActionInvoker.<InvokeActionMethodAsync>g__Awaited|12_0(ControllerActionInvoker invoker, ValueTask`1 actionResultValueTask)
         at Microsoft.AspNetCore.Mvc.Infrastructure.ControllerActionInvoker.<InvokeNextActionFilterAsync>g__Awaited|10_0(ControllerActionInvoker invoker, Task lastTask, State next, Scope scope, Object state, Boolean isCompleted)
         at Microsoft.AspNetCore.Mvc.Infrastructure.ControllerActionInvoker.Rethrow(ActionExecutedContextSealed context)
         at Microsoft.AspNetCore.Mvc.Infrastructure.ControllerActionInvoker.Next(State& next, Scope& scope, Object& state, Boolean& isCompleted)
         at Microsoft.AspNetCore.Mvc.Infrastructure.ControllerActionInvoker.<InvokeInnerFilterAsync>g__Awaited|13_0(ControllerActionInvoker invoker, Task lastTask, State next, Scope scope, Object state, Boolean isCompleted)
         at Microsoft.AspNetCore.Mvc.Infrastructure.ResourceInvoker.<InvokeNextResourceFilter>g__Awaited|25_0(ResourceInvoker invoker, Task lastTask, State next, Scope scope, Object state, Boolean isCompleted)
         at Microsoft.AspNetCore.Mvc.Infrastructure.ResourceInvoker.Rethrow(ResourceExecutedContextSealed context)
         at Microsoft.AspNetCore.Mvc.Infrastructure.ResourceInvoker.Next(State& next, Scope& scope, Object& state, Boolean& isCompleted)
         at Microsoft.AspNetCore.Mvc.Infrastructure.ResourceInvoker.<InvokeFilterPipelineAsync>g__Awaited|20_0(ResourceInvoker invoker, Task lastTask, State next, Scope scope, Object state, Boolean isCompleted)
         at Microsoft.AspNetCore.Mvc.Infrastructure.ResourceInvoker.<InvokeAsync>g__Awaited|17_0(ResourceInvoker invoker, Task task, IDisposable scope)
         at Microsoft.AspNetCore.Mvc.Infrastructure.ResourceInvoker.<InvokeAsync>g__Awaited|17_0(ResourceInvoker invoker, Task task, IDisposable scope)
         at Microsoft.AspNetCore.Routing.EndpointMiddleware.<Invoke>g__AwaitRequestTask|6_0(Endpoint endpoint, Task requestTask, ILogger logger)
         at Microsoft.AspNetCore.Authorization.Policy.AuthorizationMiddlewareResultHandler.HandleAsync(RequestDelegate next, HttpContext context, AuthorizationPolicy policy, PolicyAuthorizationResult authorizeResult)
         at Microsoft.AspNetCore.Authorization.AuthorizationMiddleware.Invoke(HttpContext context)
         at Microsoft.AspNetCore.Authentication.AuthenticationMiddleware.Invoke(HttpContext context)
         at Microsoft.AspNetCore.ResponseCompression.ResponseCompressionMiddleware.InvokeCore(HttpContext context)
         at Microsoft.AspNetCore.Diagnostics.DeveloperExceptionPageMiddleware.Invoke(HttpContext context)

It looks like this is probably a "blip" in GCP that isn't being detected as a transient error and retried. Or maybe I need to configure something, or I'm missing a step when setting up my storage client instance.

amanda-tarafa commented 2 years ago

By default, we only retry HTTP status 503 when fetching an access token, which means we wouldn't attempt a retry on a "No route to host" error.

I can think of a couple of things that might be causing this error: a network error or, since you mentioned that you've been using the app for a while, a long-lived HttpClient with outdated DNS entries (that is, if by "use the app for a month" you meant "the app has been running for a month").
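For the stale-DNS case, one common general-purpose .NET mitigation is to cap how long pooled connections live, so that new connections are opened periodically and DNS is resolved again. The following is a minimal sketch for a plain `HttpClient` that you construct yourself; it assumes .NET Core 2.1 or later, the two-minute lifetime is an arbitrary illustrative value, and it does not show how (or whether) such a handler would be wired into the Google client libraries.

```csharp
using System;
using System.Net.Http;

// Assumption: a standalone HttpClient, not the one created internally by
// the Google client libraries.
var handler = new SocketsHttpHandler
{
    // Pooled connections older than this are closed and replaced, which
    // forces a fresh DNS lookup when the replacement connection is opened.
    PooledConnectionLifetime = TimeSpan.FromMinutes(2),
};

// The client itself can then be kept for the lifetime of the process.
var httpClient = new HttpClient(handler);
```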

lukepuplett commented 2 years ago

Thank you for taking the time to write such a comprehensive reply, Amanda.

When I said, use the app for a month, I meant just testing it and working on features, not to keep the process running for a month. Sorry for the confusion.

My code creates and holds on to a StorageClient instance for around 1 minute. I wrote it this way to strike a balance between recreating an instance on every call and keeping one alive for the lifetime of the process. That might seem odd, and I've not hit problems either way; experience just makes me habitually treat API client objects this way.

The StorageClient is instantiated by your StorageClientBuilder, which is vanilla. The only exception is that on my local MacBook I create the builder with supplied JSON credentials, like so:

return new StorageClientBuilder()
{
    JsonCredentials = creds
};

I leave all this vanilla since I assume you folks will make the best default choices for me. So, I assume these days you'd pick a good retry policy that suits your cloud.

That's it. If you have a nice way to configure the builder/client to expect and retry when these "No route to host" errors occur, then that'd be interesting to see.

I have now implemented a workaround-style retry for this error in my higher-level blob CRUD class. I've rarely seen this error, having only encountered it a couple of times in the last week, just when testing my app, running it and working on features.
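A retry of the kind described above could be sketched roughly as follows. This is an illustration only, not the actual workaround from the app: `RetryOnSocketErrorAsync` is a hypothetical helper, the attempt count and delays are arbitrary, and the simulated failure stands in for a real storage call.

```csharp
using System;
using System.Net.Http;
using System.Net.Sockets;
using System.Threading;
using System.Threading.Tasks;

// Usage: simulate two "No route to host" failures followed by a success.
int calls = 0;
string result = await RetryOnSocketErrorAsync<string>(_ =>
{
    calls++;
    if (calls < 3)
        throw new HttpRequestException("No route to host", new SocketException(65));
    return Task.FromResult("ok");
}, maxAttempts: 3, initialDelay: TimeSpan.FromMilliseconds(1));
Console.WriteLine($"{result} after {calls} attempts"); // prints "ok after 3 attempts"

// Hypothetical helper (not part of any Google library): retries an operation
// that fails with an HttpRequestException whose inner exception is a
// SocketException, doubling the delay between attempts.
static async Task<T> RetryOnSocketErrorAsync<T>(
    Func<CancellationToken, Task<T>> operation,
    int maxAttempts = 3,
    TimeSpan? initialDelay = null,
    CancellationToken cancellationToken = default)
{
    if (maxAttempts < 1) throw new ArgumentOutOfRangeException(nameof(maxAttempts));
    TimeSpan delay = initialDelay ?? TimeSpan.FromMilliseconds(200);

    for (int attempt = 1; ; attempt++)
    {
        try
        {
            return await operation(cancellationToken).ConfigureAwait(false);
        }
        catch (HttpRequestException ex)
            when (ex.InnerException is SocketException && attempt < maxAttempts)
        {
            // Socket-level failure with attempts remaining: back off, then retry.
            // On the final attempt the filter is false and the exception propagates.
            await Task.Delay(delay, cancellationToken).ConfigureAwait(false);
            delay += delay;
        }
    }
}
```

Wrapping calls at the application level like this retries the whole operation, including the token fetch that failed in the stack trace above, without changing how the client or credential is built.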

I guess I should also say that I'm new to Mac/'nix and (not being familiar with these error messages) that I initially assumed this was an error within GCP, but after reading your reply it seems obvious now that it's just as likely a problem with my local network stack/env.

In posting this issue I was only looking to inform you of such errors in case you felt there was a hole in your retry policy. If not then there's nothing more to do.

Thanks!!

amanda-tarafa commented 2 years ago

Thanks for all the context!

So, our default retry policy for auth is not the best, for sure. We'll actually be working on improvements over the next year or so, but we are still in a very early phase of that work, almost at the point of figuring out what to do. Hopefully we'll catch and retry these types of errors in the future and you won't have to worry about them.

Also, I've just noticed that you are still using Google.Cloud.Storage.V1 3.6.0, whereas Google.Cloud.Storage.V1 4.0.0 contains new features; in particular, we now include metadata for downloaded objects, which I believe was a request you made. I mention this because it will make a difference (for the better) in what you have to do to configure the credential to retry in this case.

For now:

First, build the credential with exception retries (and yes, this is not straightforward either, and it's part of what we want to improve):

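using System;
using Google.Apis.Auth.OAuth2;
using Google.Apis.Http;

var credentials = GoogleCredential.FromJson(creds);
// The cast only succeeds when the JSON describes a service account key.
var saCredentials = credentials.UnderlyingCredential as ServiceAccountCredential
    ?? throw new InvalidOperationException("Expected service account credentials.");
// Copy the existing credential's settings into a new initializer.
var saInitializer = new ServiceAccountCredential.Initializer(saCredentials.Id, saCredentials.TokenServerUrl)
{
    ProjectId = saCredentials.ProjectId,
    Key = saCredentials.Key,
    KeyId = saCredentials.KeyId,
    QuotaProject = saCredentials.QuotaProject,
    // Retry token requests on exceptions as well as on HTTP 503 responses.
    DefaultExponentialBackOffPolicy = ExponentialBackOffPolicy.Exception | ExponentialBackOffPolicy.UnsuccessfulResponse503
};
var withRetriesSaCredential = new ServiceAccountCredential(saInitializer);
var withRetriesGoogleCredential = GoogleCredential.FromServiceAccountCredential(withRetriesSaCredential);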

If you continue to use V3.6.0, then you also need to scope the credential and use the builder as follows:

withRetriesGoogleCredential = withRetriesGoogleCredential.CreateScoped(StorageService.Scope.DevstorageFullControl);
var builder = new StorageClientBuilder
{
    Credential = withRetriesGoogleCredential,
};

If you upgrade to V4.0.0, then there's no need to scope the credential and you use the builder as follows:

var builder = new StorageClientBuilder
{
    GoogleCredential = withRetriesGoogleCredential,
};

lukepuplett commented 2 years ago

Thanks, Amanda. I'll integrate these changes. I was aware of the latest changes (including my suggestion, thank you); I've just not got around to refactoring yet and, tbh, I might not for a while, as this is an "indie hacker" side project and I must choose wisely how I deploy my time :)

I'm going to close this issue as there's nothing for you to do and you now have a little additional context to inform future designs around auth. Thanks again, Luke.