Closed (lukepuplett closed 2 years ago)
By default, we only retry HTTP Status 503 when fetching an access token and that means that we wouldn't attempt a retry with a "No route to host" error.
I can think of a couple of things that might be causing this error: a network error or, since you mentioned that you've been using the app for a while, a long-lived HttpClient with outdated DNS tables (that is, if by "use the app for a month" you meant "the app has been running for a month").
For a long-lived HttpClient, the fix would be to use System.Net.Http.IHttpClientFactory, which, btw, would also be the way to go if you create many short-lived HttpClients (many custom credentials, or many storage clients, for instance). You can read more about that in "Use IHttpClientFactory to implement resilient HTTP requests".
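To illustrate the stale-DNS concern in isolation, here is a minimal sketch of a related technique: capping how long a plain, long-lived HttpClient may keep pooled connections. This is not the IHttpClientFactory integration mentioned above, and it does not show how to wire the client into StorageClient; the two-minute value and the variable names are assumptions chosen for the example.

```csharp
using System;
using System.Net.Http;

// Pooled connections are retired after at most two minutes, so new
// connections are opened (with a fresh DNS lookup) on a regular basis.
// The lifetime value is an arbitrary choice for this sketch.
var handler = new SocketsHttpHandler
{
    PooledConnectionLifetime = TimeSpan.FromMinutes(2)
};

// A single client built on that handler can then be kept for the
// lifetime of the process.
var sharedClient = new HttpClient(handler);
```

SocketsHttpHandler is available from .NET Core 2.1 onwards, so this only applies on those runtimes.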
If you'd like to integrate System.Net.Http.IHttpClientFactory, again, share the code in which you are configuring/building your storage client and I can build an example on top of that. Disclaimer: it's not pretty, as the underlying libraries used by Google.Cloud.Storage.V1 and Google.Apis.Auth predate .NET Core 2.1, where System.Net.Http.IHttpClientFactory was introduced, and we had to add support for it without introducing breaking changes. You can take a peek at these integration tests to get an idea of what you'd need to configure and how.
Thank you for taking the time to write such a comprehensive reply, Amanda.
When I said, use the app for a month, I meant just testing it and working on features, not to keep the process running for a month. Sorry for the confusion.
My code creates and holds on to a StorageClient instance for around 1 minute. This is code I've written so as to strike a balance between recreating an instance on every call vs. keeping one alive for the lifetime of the process. That might seem odd, and I've not hit problems either way; experience just makes me habitually treat API client objects in this way.
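For readers wondering what "hold on to an instance for around 1 minute" could look like, here is a minimal sketch. It is not the actual code from this project; ReuseFor and its parameters are hypothetical names, and the helper is deliberately not thread-safe and does not dispose the instances it replaces.

```csharp
using System;

// Hypothetical helper: wraps a factory so that the instance it creates is
// reused until `lifetime` has elapsed, then replaced on the next call.
// `clock` is injected so the expiry logic can be exercised without waiting.
static Func<T> ReuseFor<T>(Func<T> factory, TimeSpan lifetime, Func<DateTime> clock)
{
    T current = default!;
    DateTime expiresAt = DateTime.MinValue;

    return () =>
    {
        DateTime now = clock();
        if (now >= expiresAt)
        {
            // First call, or the previous instance is too old: build a new one.
            current = factory();
            expiresAt = now + lifetime;
        }
        return current;
    };
}
```

With the storage library referenced, usage might look like `var getClient = ReuseFor(() => StorageClient.Create(), TimeSpan.FromMinutes(1), () => DateTime.UtcNow);`, with callers invoking `getClient()` whenever they need a client.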
The StorageClient is instantiated by your StorageClientBuilder, which is vanilla. The only exception is that on my local MacBook I create the builder with supplied JSON credentials, like so:
return new StorageClientBuilder()
{
    JsonCredentials = creds
};
I leave all this vanilla since I assume you folks will make the best default choices for me. So, I assume these days you'd pick a good retry policy that suits your cloud.
That's it. If you have a nice way to configure the builder/client to expect and retry when these "No route to host" errors occur, then that'd be interesting to see.
I have implemented a workaround kind of retry for this error now in my higher blob CRUD class. I've rarely seen this error, having only encountered it a couple of times in the last week, just when testing my app, running it and working on features.
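A workaround of this kind could be sketched roughly as follows. This is not the code from the blob CRUD class mentioned above; IsNoRouteToHost and RetryOnNoRouteAsync are hypothetical names, and the sketch assumes the failure surfaces as a SocketException with SocketError.HostUnreachable somewhere in the exception chain, which is how "No route to host" is typically reported on macOS and Linux.

```csharp
using System;
using System.Net.Http;
using System.Net.Sockets;
using System.Threading.Tasks;

// Hypothetical helper, not part of Google.Cloud.Storage.V1.
// Walks the exception chain looking for a SocketException whose code is
// HostUnreachable (assumed here to correspond to "No route to host").
static bool IsNoRouteToHost(Exception ex)
{
    for (var e = ex; e != null; e = e.InnerException)
    {
        if (e is SocketException se && se.SocketErrorCode == SocketError.HostUnreachable)
        {
            return true;
        }
    }
    return false;
}

// Runs the operation and retries with exponential backoff, but only for the
// "No route to host" case. Any other exception, or the final failed attempt,
// propagates to the caller unchanged.
static async Task<T> RetryOnNoRouteAsync<T>(Func<Task<T>> operation, int maxAttempts = 3, int baseDelayMs = 200)
{
    for (int attempt = 1; ; attempt++)
    {
        try
        {
            return await operation();
        }
        catch (Exception ex) when (attempt < maxAttempts && IsNoRouteToHost(ex))
        {
            // Waits baseDelayMs, then 2x, then 4x, and so on between attempts.
            await Task.Delay(baseDelayMs * (1 << (attempt - 1)));
        }
    }
}
```

A storage call would then be wrapped as `await RetryOnNoRouteAsync(() => client.GetObjectAsync(bucket, name))`, where `client`, `bucket` and `name` are placeholders for whatever the calling class already has.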
I guess I should also say that I'm new to Mac/'nix and (not being familiar with these error messages) that I initially assumed this was an error within GCP, but after reading your reply it seems obvious now that it's just as likely a problem with my local network stack/env.
In posting this issue I was only looking to inform you of such errors in case you felt there was a hole in your retry policy. If not then there's nothing more to do.
Thanks!!
Thanks for all the context!
So, our default retry policy for auth is not the best, for sure. We'll actually be working on making improvements over the next year or so, but we are still in a very early phase of all that work, almost figuring out what to do. Hopefully we'll catch and retry these types of errors in the future and you won't have to worry about them.
Also, I've just noticed that you are still using Google.Cloud.Storage.V1 3.6.0, while Google.Cloud.Storage.V1 4.0.0 contains new features; in particular, we now include metadata for downloaded objects, which I believe was a request you made. I mention this because it will make a difference (for the better) in what you have to do to configure the credential to retry in this case.
For now:
First, build the credential with exception retries (and yes, this is not straightforward either, and it's part of what we want to improve):
// Requires: using Google.Apis.Auth.OAuth2; using Google.Apis.Http;
var credentials = GoogleCredential.FromJson(creds);
// Note: the cast yields null if the JSON is not a service account credential.
var saCredentials = credentials.UnderlyingCredential as ServiceAccountCredential;
// Copy the original credential's settings into a new initializer.
var saInitializer = new ServiceAccountCredential.Initializer(saCredentials.Id, saCredentials.TokenServerUrl)
{
    ProjectId = saCredentials.ProjectId,
    Key = saCredentials.Key,
    KeyId = saCredentials.KeyId,
    QuotaProject = saCredentials.QuotaProject,
    // Retry token requests on exceptions as well as on HTTP 503 responses.
    DefaultExponentialBackOffPolicy = ExponentialBackOffPolicy.Exception | ExponentialBackOffPolicy.UnsuccessfulResponse503
};
var withRetriesSaCredential = new ServiceAccountCredential(saInitializer);
var withRetriesGoogleCredential = GoogleCredential.FromServiceAccountCredential(withRetriesSaCredential);
If you continue to use V3.6.0, then you also need to scope the credential and use the builder as follows:
withRetriesGoogleCredential = withRetriesGoogleCredential.CreateScoped(StorageService.Scope.DevstorageFullControl);
var builder = new StorageClientBuilder
{
    Credential = withRetriesGoogleCredential,
};
If you upgrade to V4.0.0, then there's no need to scope the credential and you use the builder as follows:
var builder = new StorageClientBuilder
{
    GoogleCredential = withRetriesGoogleCredential,
};
Thanks, Amanda. I'll integrate these changes. I was aware of the latest changes (with my suggestion, thank you); I've just not got around to refactoring yet, and tbh, I might not for a while, as this is an "indie hacker" side project and I must choose wisely how I deploy my time :)
I'm going to close this issue as there's nothing for you to do and you now have a little additional context to inform future designs around auth. Thanks again, Luke.
Environment details
Steps to reproduce
It looks like this is probably a "blip" in GCP that isn't being detected as a transient error and retried, maybe. Or maybe I need to configure something, or I'm missing something in my storage client instance.