Closed chlowell closed 2 years ago
I think it should be enough to return errors if there are errors at the end of NewDefaultAzureCredential: https://github.com/Azure/azure-sdk-for-go/blob/main/sdk/azidentity/default_azure_credential.go#L76-L85
How does that sound? I could be missing something š¤
I worry that would imply the credential isn't useable. I expect applications to do this:
cred, err := NewDefaultAzureCredential(nil)
if err != nil {
// assume cred is invalid
What I'm looking for here is more detailed log output when DefaultAzureCredential.GetToken fails. Every component of the default chain should be represented so a user who expected e.g. EnvironmentCredential to work gets some hint about why it didn't.
Iāll explore some ideas next week before I make a PR! Iāll continue to comment here if I find anything interesting.
I do believe that it is natural in Go to have good results besides errors. This is a fundamental part of the language that should be used as an advantage.
AWS has custom error types with partial results: https://github.com/aws/aws-sdk-go/blob/main/service/s3/s3manager/upload.go#L44-L52
I donāt see a clear guide stating that valid results should not be sent in case of partial errors.
I will focus my efforts on logging for now in any case.
The error message returned from GetToken is also important, probably more so than logging.
I still think we shouldn't return a non-nil error when the credential might work. Doing so would burden developers with figuring out whether the error matters. I imagine that would involve logic like "if the application is deployed to a VM, ignore the error unless it's about managed identity", which erodes the value of DefaultAzureCredential.
Letās say DefaultAzureCredential
is deployed in a VM and it fails, checking the right error should be as easy as if errors.Is(err, ManagedIdentityCredentialError) {
.
Moreover, without a concrete error for success cases, users wonāt be able to deploy programs reacting to specific authentication methods if they deploy to environments where more than one authentication is available.
Letās say they deploy a DefaultAzureCredential
to a VM where they also configured the environment variables used for EnvironmentCredential
. They might not have intended to do this, but they will be left with a credential potentially authenticated with the wrong account. In some cases, I believe they might want to keep this setup; then their only recourse is to switch to the individual credentials.
If thatās where we want to go (forcing developers to use individual credentials if DefaultAzureCredential is not enough for them), then thatās ok, of course.
Letās say
DefaultAzureCredential
is deployed in a VM and it fails, checking the right error should be as easy asif errors.Is(err, ManagedIdentityCredentialError) {
.Well, there's no ManagedIdentityCredentialError. But if there were, I think expecting customers to handle spurious constructor errors makes DefaultAzureCredential too painful to use. This:
cred, err := NewDefaultAzureCredential(nil) if err != nil { if deployedToVM { if errors.Is(err, ManagedIdentityCredentialError) { // do something (what should an application do?) } if !errors.Is(err, EnvironmentCredentialError) { log.Println("someone mistakenly set environment variables") cred, err = NewManagedIdentityCredential(nil) ... } else { // do something else (what?) } } }
is tantamount to this, which I would recommend instead:
if deployedToVM { cred, err := NewManagedIdentityCredential(nil) if err != nil { // do something } } else { cred, err := NewEnvironmentCredential(nil) if err != nil { // do something else } }
And we're begging the question of whether recovery is possible, though it's moot at construction anyway because constructors rarely return errors. Usually you learn your credential has a problem when one of your clients asks for a token.
If thatās where we want to go (forcing developers to use individual credentials if DefaultAzureCredential is not enough for them), then thatās ok, of course.
I want to go there. DefaultAzureCredential doesn't serve all purposes. If it doesn't serve yours, that's okay, please use something else š
Iām always torn by DefaultAzureCredential š .. Iāll focus my PR on logging š (I thought the failure error message was ok but Iāll also review it again)
Environment variable configuration illustrates my complaint. When DefaultAzureCredential.GetToken fails, I want the error message to say something about each inner credential, to help developers debug the ones they care about. However, when environment configuration is incomplete, NewEnvironmentCredential()
returns a nil pointer which doesn't go on the chain, so ChainedTokenCredential's error message says nothing about environment variables.
DefaultAzureCredential (DAC) exists for users to more easily get started with our SDKs. Customers should not use it in production environments for the reason you mention above: in a deployment, the app should know exactly what credential it is using and whether it is successful at creating/using it.
During development, the "contract" of DAC is that it is successfully created if it can latch on to any credential so an error should only be surfaced if no valid credential can be found. If a credential is found and then a problem arises, then an error can/should be surfaced but there is practically nothing a customer can do with it at this point. They'd probably have to reconfigure something and then re-run the app. For debugging purposes, it would be nice to indicate what's wrong to help the customer but there is no runtime altered code path that makes sense.
Customers should not use it in production environments
While I agree with that, I am not sure if thatās the intention across languages. Itās showcased as if it should be used on production environments, I believe.
Thank you both for your help! I understand I should focus on logging. Iāll try to have a PR out this week!
So, what about something like this? https://github.com/sadasant/azure-sdk-for-go/commit/68a1d0bd47a87e435899c299c5e749685a94438c
If the idea looks good, I would then polish it, add some tests and post a draft PR.
Please let me know if Iām missing something.
That gets "DefaultAzureCredential" into the message, which helps but doesn't address "missing" credentials. This is the error message I get today from DefaultAzureCredential.GetToken:
Chained Token Credential: Managed Identity Credential: IMDS token request timed outAzure CLI Credential: ERROR: Please run 'az login' to setup account.
If I believed I had set environment configuration, I'd wonder how managed identity and the CLI enter into it. With this issue I'm after something similar to the Python SDK's error, which looks like this:
DefaultAzureCredential failed to retrieve a token from the included credentials.
Attempted credentials:
EnvironmentCredential: Environment variables are not fully configured.
ManagedIdentityCredential: No managed identity endpoint found.
AzureCliCredential: Please run 'az login' to set up an account
Can the error for EnvironmentCredential indicate exactly what env var name it is looking for? Anything to help customers self-diagnose any issues is a significant improvement in customer satisfaction.
It could, but such a message is unwieldy when no variables are set because the credential requires one of 3 sets of overlapping variables. When we last reviewed errors across languages we decided to link to the troubleshooting guide instead (my Python example is pre-troubleshooting guide).
The link is fine with me. Will you add it to the error message?
(Thank you so much for the feedback! Iām still catching up after resuming from the PTO. Iāll give a more detailed answer soon)
@JeffreyRichter yes, the idea is to use aka links to point people to the troubleshooting guide! I can make the troubleshooting guide for go as part of this effort (copying other languages) š
So, I think linking to the troubleshooting guidelines is something I can do right after resolving this specific issue. In that regard, I have a question:
Besides that, I have improved my previous attempt taking in consideration the example that @chlowell shared earlier in this thread: https://github.com/Azure/azure-sdk-for-go/compare/main...sadasant:fix15923?expand=1 How does this look? Iāll ping Charles over teams as well.
Is there an issue regrading whatās missing to align Go with other languages on the usage of the troubleshooting guideline for Identity?
16796 tracks any content revisions to help support agents onboard. There's no issue tracking linking to the guide in error messages.
Besides that, I have improved my previous attempt taking in consideration the example that @chlowell shared earlier in this thread: https://github.com/Azure/azure-sdk-for-go/compare/main...sadasant:fix15923?expand=1 How does this look?
Looks reasonable but let's not add any public API, e.g. ChainedTokenCredentialOptions.ChainedCredentialName
There's no issue tracking linking to the guide in error messages.
I have made a new issue! https://github.com/Azure/azure-sdk-for-go/issues/16843 If that helps š
@chlowell I have made a PR š https://github.com/Azure/azure-sdk-for-go/pull/16844
DefaultAzureCredential only adds to its chain the credentials it can instantiate, and surfaces instantiation failures only when it can't instantiate any credential. That never happens because AzureCLICredential is at the end of the chain. This means that DefaultAzureCredential never surfaces problems with environment configuration or managed identity. Its constructor can always instantiate AzureCLICredential, so it never returns an error, and errors from
DefaultAzureCredential.GetToken()
never mention problems with environment configuration or managed identity because the corresponding credentials aren't part of the chain when such problems exist.