Azure / azure-sdk-for-net

This repository is for active development of the Azure SDK for .NET. For consumers of the SDK we recommend visiting our public developer docs at https://learn.microsoft.com/dotnet/azure/ or our versioned developer docs at https://azure.github.io/azure-sdk-for-net.
MIT License

[BUG] Job in AzureML Workspace can't authenticate with Azure.Identity 1.13.* #46932

Open lenniehwtw-new opened 1 day ago

lenniehwtw-new commented 1 day ago

Library name and version

Azure.Identity 1.13.*

Describe the bug

We have a .NET pipeline job in an AzureML workspace that works just fine with Azure.Identity 1.12.1, but if we upgrade to 1.13.* the job fails to authenticate. We were using DefaultAzureCredential() and, having read the breaking changes for 1.13.0, I changed this to use a chained credential like this:

var credentials = new ChainedTokenCredential(new EnvironmentCredential(), new ManagedIdentityCredential(azureClientId), new AzureCliCredential());

This works fine with 1.12.1. In this particular context we would expect it to pick up the ManagedIdentityCredential. The method is called in other contexts too, which is why the EnvironmentCredential is in the chain.

The error we see in the logs in AzureML looks like this:

Unhandled exception. Azure.Identity.AuthenticationFailedException: ManagedIdentityCredential authentication failed: [Managed Identity] The error response was either empty or could not be parsed.. Error response received from the server: Invalid secret token header: . See the troubleshooting guide for more information. https://aka.ms/azsdk/net/identity/managedidentitycredential/troubleshoot
 ---> MSAL.NetCore.4.66.1.0.MsalServiceException: ErrorCode: managed_identity_request_failed
Microsoft.Identity.Client.MsalServiceException: [Managed Identity] The error response was either empty or could not be parsed.. Error response received from the server: Invalid secret token header: .
   at Microsoft.Identity.Client.ManagedIdentity.AbstractManagedIdentity.HandleResponseAsync(AcquireTokenForManagedIdentityParameters parameters, HttpResponse response, CancellationToken cancellationToken)
   at Microsoft.Identity.Client.ManagedIdentity.AbstractManagedIdentity.AuthenticateAsync(AcquireTokenForManagedIdentityParameters parameters, CancellationToken cancellationToken)
   at Microsoft.Identity.Client.Internal.Requests.ManagedIdentityAuthRequest.SendTokenRequestForManagedIdentityAsync(ILoggerAdapter logger, CancellationToken cancellationToken)
   at Microsoft.Identity.Client.Internal.Requests.ManagedIdentityAuthRequest.GetAccessTokenAsync(CancellationToken cancellationToken, ILoggerAdapter logger)
   at Microsoft.Identity.Client.Internal.Requests.ManagedIdentityAuthRequest.ExecuteAsync(CancellationToken cancellationToken)
   at Microsoft.Identity.Client.Internal.Requests.RequestBase.<>cDisplayClass11_1.<b1>d.MoveNext()
--- End of stack trace from previous location ---
   at Microsoft.Identity.Client.Utils.StopwatchService.MeasureCodeBlockAsync(Func1 codeBlock)
   at Microsoft.Identity.Client.Internal.Requests.RequestBase.RunAsync(CancellationToken cancellationToken)
   at Microsoft.Identity.Client.ApiConfig.Executors.ManagedIdentityExecutor.ExecuteAsync(AcquireTokenCommonParameters commonParameters, AcquireTokenForManagedIdentityParameters managedIdentityParameters, CancellationToken cancellationToken)
   at Azure.Identity.MsalManagedIdentityClient.AcquireTokenForManagedIdentityAsyncCore(Boolean async, TokenRequestContext requestContext, CancellationToken cancellationToken)
   at Azure.Core.Pipeline.TaskExtensions.EnsureCompleted[T](ValueTask1 task)
   at Azure.Identity.MsalManagedIdentityClient.AcquireTokenForManagedIdentity(TokenRequestContext requestContext, CancellationToken cancellationToken)
   at Azure.Identity.ManagedIdentityClient.AuthenticateAsync(Boolean async, TokenRequestContext context, CancellationToken cancellationToken)
   at Azure.Identity.ManagedIdentityCredential.GetTokenImplAsync(Boolean async, TokenRequestContext requestContext, CancellationToken cancellationToken)
StatusCode: 403
ResponseBody:
Headers:
--- End of inner exception stack trace ---
   at Azure.Identity.CredentialDiagnosticScope.FailWrapAndThrow(Exception ex, String additionalMessage, Boolean isCredentialUnavailable)
   at Azure.Identity.ManagedIdentityCredential.GetTokenImplAsync(Boolean async, TokenRequestContext requestContext, CancellationToken cancellationToken)
   at Azure.Core.Pipeline.TaskExtensions.EnsureCompleted[T](ValueTask1 task)
   at Azure.Identity.ManagedIdentityCredential.GetToken(TokenRequestContext requestContext, CancellationToken cancellationToken)
   at Azure.Identity.DefaultAzureCredential.GetTokenFromSourcesAsync(TokenCredential[] sources, TokenRequestContext requestContext, Boolean async, CancellationToken cancellationToken)
   at Azure.Identity.DefaultAzureCredential.GetTokenImplAsync(Boolean async, TokenRequestContext requestContext, CancellationToken cancellationToken)
   at Azure.Identity.CredentialDiagnosticScope.FailWrapAndThrow(Exception ex, String additionalMessage, Boolean isCredentialUnavailable)
   at Azure.Identity.DefaultAzureCredential.GetTokenImplAsync(Boolean async, TokenRequestContext requestContext, CancellationToken cancellationToken)
   at Azure.Core.Pipeline.TaskExtensions.EnsureCompleted[T](ValueTask1 task)
   at Azure.Identity.DefaultAzureCredential.GetToken(TokenRequestContext requestContext, CancellationToken cancellationToken)
   at Azure.Core.Pipeline.BearerTokenAuthenticationPolicy.AccessTokenCache.SetResultOnTcsFromCredentialAsync(TokenRequestContext context, TaskCompletionSource1 targetTcs, Boolean async, CancellationToken cancellationToken)
   at Azure.Core.Pipeline.BearerTokenAuthenticationPolicy.AccessTokenCache.GetAuthHeaderValueAsync(HttpMessage message, TokenRequestContext context, Boolean async)
   at Azure.Core.Pipeline.TaskExtensions.EnsureCompleted[T](Task1 task)
   at Azure.Core.Pipeline.BearerTokenAuthenticationPolicy.AccessTokenCache.TokenRequestState.GetCurrentHeaderValue(Boolean async, Boolean checkForCompletion, CancellationToken cancellationToken)
   at Azure.Core.Pipeline.BearerTokenAuthenticationPolicy.AccessTokenCache.GetAuthHeaderValueAsync(HttpMessage message, TokenRequestContext context, Boolean async)
   at Azure.Core.Pipeline.TaskExtensions.EnsureCompleted[T](ValueTask1 task)
   at Azure.Core.Pipeline.BearerTokenAuthenticationPolicy.AuthenticateAndAuthorizeRequest(HttpMessage message, TokenRequestContext context)
   at Azure.Security.KeyVault.ChallengeBasedAuthenticationPolicy.AuthorizeRequestOnChallengeAsyncInternal(HttpMessage message, Boolean async)
   at Azure.Core.Pipeline.TaskExtensions.EnsureCompleted[T](ValueTask1 task)
   at Azure.Security.KeyVault.ChallengeBasedAuthenticationPolicy.AuthorizeRequestOnChallenge(HttpMessage message)
   at Azure.Core.Pipeline.BearerTokenAuthenticationPolicy.ProcessAsync(HttpMessage message, ReadOnlyMemory1 pipeline, Boolean async)
   at Azure.Core.Pipeline.TaskExtensions.EnsureCompleted(ValueTask task)
   at Azure.Core.Pipeline.BearerTokenAuthenticationPolicy.Process(HttpMessage message, ReadOnlyMemory1 pipeline)
   at Azure.Core.Pipeline.HttpPipelinePolicy.ProcessNext(HttpMessage message, ReadOnlyMemory1 pipeline)
   at Azure.Core.Pipeline.RedirectPolicy.ProcessAsync(HttpMessage message, ReadOnlyMemory1 pipeline, Boolean async)
   at Azure.Core.Pipeline.TaskExtensions.EnsureCompleted(ValueTask task)
   at Azure.Core.Pipeline.RedirectPolicy.Process(HttpMessage message, ReadOnlyMemory1 pipeline)
   at Azure.Core.Pipeline.HttpPipelinePolicy.ProcessNext(HttpMessage message, ReadOnlyMemory1 pipeline)
   at Azure.Core.Pipeline.RetryPolicy.ProcessAsync(HttpMessage message, ReadOnlyMemory1 pipeline, Boolean async)
   at Azure.Core.Pipeline.RetryPolicy.ProcessAsync(HttpMessage message, ReadOnlyMemory1 pipeline, Boolean async)
   at Azure.Core.Pipeline.TaskExtensions.EnsureCompleted(ValueTask task)
   at Azure.Core.Pipeline.RetryPolicy.Process(HttpMessage message, ReadOnlyMemory1 pipeline)
   at Azure.Core.Pipeline.HttpPipelinePolicy.ProcessNext(HttpMessage message, ReadOnlyMemory1 pipeline)
   at Azure.Core.Pipeline.HttpPipelineSynchronousPolicy.Process(HttpMessage message, ReadOnlyMemory1 pipeline)
   at Azure.Core.Pipeline.HttpPipelinePolicy.ProcessNext(HttpMessage message, ReadOnlyMemory1 pipeline)
   at Azure.Core.Pipeline.HttpPipelineSynchronousPolicy.Process(HttpMessage message, ReadOnlyMemory1 pipeline)
   at Azure.Core.Pipeline.HttpPipelinePolicy.ProcessNext(HttpMessage message, ReadOnlyMemory1 pipeline)
   at Azure.Core.Pipeline.HttpPipelineSynchronousPolicy.Process(HttpMessage message, ReadOnlyMemory1 pipeline)
   at Azure.Core.Pipeline.HttpPipeline.Send(HttpMessage message, CancellationToken cancellationToken)
   at Azure.Core.Pipeline.HttpPipeline.SendRequest(Request request, CancellationToken cancellationToken)
   at Azure.Security.KeyVault.KeyVaultPipeline.SendRequest(Request request, CancellationToken cancellationToken)
   at Azure.Security.KeyVault.KeyVaultPipeline.GetPage[T](Uri firstPageUri, String nextLink, Func1 itemFactory, String operationName, CancellationToken cancellationToken)
   at Azure.Security.KeyVault.Secrets.SecretClient.<>c__DisplayClass15_0.b__0(String nextLink)
   at Azure.Core.PageResponseEnumerator.<>c__DisplayClass0_01.<CreateEnumerable>b__0(String continuationToken, Nullable1 pageSizeHint)
   at Azure.Core.PageResponseEnumerator.FuncPageable1.AsPages(String continuationToken, Nullable1 pageSizeHint)+MoveNext()
   at Azure.Pageable1.GetEnumerator()+MoveNext()
   at Azure.Extensions.AspNetCore.Configuration.Secrets.AzureKeyVaultConfigurationProvider.Load()
   at Microsoft.Extensions.Configuration.ConfigurationRoot..ctor(IList1 providers)
   at Microsoft.Extensions.Configuration.ConfigurationBuilder.Build()
   at Microsoft.Extensions.Hosting.HostBuilder.InitializeAppConfiguration()
   at Microsoft.Extensions.Hosting.HostBuilder.Build()
   at Program.<Main>$(String[] args) in /build/src/Wtw.Model.Deployment.Worker/Program.cs:line 4
   at Program.<Main>(String[] args)

Expected behavior

We would expect the code that works with Azure.Identity 1.12.1 to still work after upgrading to 1.13.*.

Actual behavior

See the error described above under "Describe the bug".

Reproduction Steps

Run a .NET executable as a pipeline job in an AzureML workspace.

Environment

The job is a pipeline job that runs in an AzureML workspace and is responsible for deploying endpoints.

github-actions[bot] commented 1 day ago

Thank you for your feedback. Tagging and routing to the team member best able to assist.

christothes commented 1 day ago

Hi @lenniehwtw-new, would you mind providing the logging output (with any secrets redacted) after reproducing this with logging enabled?

github-actions[bot] commented 1 day ago

Hi @lenniehwtw-new. Thank you for opening this issue and giving us the opportunity to assist. To help our team better understand your issue and the details of your scenario please provide a response to the question asked above or the information requested above. This will help us more accurately address your issue.

lenniehwtw-new commented 1 day ago

My pipeline job is running now. I will return shortly with the requested logs.

lenniehwtw-new commented 1 day ago

Here is the stdout log captured with EventLevel.Verbose:

std_log.txt

christothes commented 1 day ago

Hi @lenniehwtw-new - From the logs, it appears that because the MSI_ENDPOINT environment variable is defined, the credential is attempting to authenticate as if it were running in a CloudShell environment.

[Informational] Azure-Identity: False MSAL 4.66.1.0 MSAL.NetCore .NET 8.0.10 Linux [2024-10-31 19:55:02Z - 4201235e-4763-44a5-b901-df304eab9aad] [Managed Identity] Cloud shell managed identity is available.
[Verbose] Azure-Identity: False MSAL 4.66.1.0 MSAL.NetCore .NET 8.0.10 Linux [2024-10-31 19:55:02Z - 4201235e-4763-44a5-b901-df304eab9aad] [Managed Identity] Creating cloud shell managed identity. Endpoint URI: http://localhost:46809/MSI/token
<...>
[Verbose] Azure-Identity: False MSAL 4.66.1.0 MSAL.NetCore .NET 8.0.10 Linux [2024-10-31 19:55:02Z - 4201235e-4763-44a5-b901-df304eab9aad] [HttpManager] Sending request. Method: POST. Host: http://localhost:46809. Binding Certificate: False 
[Informational] Azure-Core: Request [e5731ca7-155d-4254-8c0a-d92d038c4494] POST http://localhost:46809/MSI/token
ContentType:REDACTED
Metadata:REDACTED
Content-Type:application/x-www-form-urlencoded
x-ms-client-request-id:e5731ca7-155d-4254-8c0a-d92d038c4494
x-ms-return-client-request-id:true
User-Agent:azsdk-net-Identity/1.13.1 (.NET 8.0.10; Debian GNU/Linux 12 (bookworm))
client assembly: Azure.Identity
[Warning] Azure-Core: Error response [e5731ca7-155d-4254-8c0a-d92d038c4494] 403 Forbidden (00.0s)
Date:Thu, 31 Oct 2024 19:55:02 GMT
Content-Type:text/plain; charset=utf-8
Content-Length:29

Could you gather some logs from the older version that works so that we can compare? I'm curious if it attempted the same http://localhost:46809/MSI/token endpoint.

lenniehwtw-new commented 12 hours ago

std_log.txt

Attached are the logs from exactly the same code running with Azure.Identity 1.12.1.

christothes commented 6 hours ago

Thanks @lenniehwtw-new - This appears to be a regression introduced by our underlying dependency Microsoft.Identity.Client (MSAL). Could you please create an issue over in their repo and link this issue for context?

The problem seems to be that Azure ML Studio uses similar environment variables to other hosting environments such as Azure App Service. In our previous implementation, we attempted to detect Azure App Service's environment variables before CloudShell. Because CloudShell uses a subset of those environment variables, an AzureML environment is wrongly detected as CloudShell when the CloudShell check runs first. The resulting request then fails to include the value of the MSI_SECRET env var in the required header named secret.
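
To make the ordering problem concrete, here is a minimal sketch of the two detection orders described in the comment above. It is not the actual Azure.Identity or MSAL code: the `ManagedIdentitySource` enum, the `SourceDetector` class, and both method names are hypothetical, and the endpoint values are placeholders. The only details taken from this thread are the MSI_ENDPOINT and MSI_SECRET variable names and the fact that CloudShell matches on a subset of them.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical environment snapshots; the values are placeholders.
var azureMlLike = new Dictionary<string, string>
{
    ["MSI_ENDPOINT"] = "http://localhost:46809/MSI/token",
    ["MSI_SECRET"] = "<redacted>",
};
var cloudShellLike = new Dictionary<string, string>
{
    ["MSI_ENDPOINT"] = "http://localhost:1234/token",
};

Console.WriteLine(SourceDetector.SecretFirst(azureMlLike));     // SecretHeaderEndpoint
Console.WriteLine(SourceDetector.CloudShellFirst(azureMlLike)); // CloudShell
Console.WriteLine(SourceDetector.SecretFirst(cloudShellLike));  // CloudShell

// Hypothetical names, used only for this illustration.
enum ManagedIdentitySource { None, SecretHeaderEndpoint, CloudShell }

static class SourceDetector
{
    static bool Has(IReadOnlyDictionary<string, string> env, string name) =>
        env.TryGetValue(name, out var value) && !string.IsNullOrEmpty(value);

    // Checks the more specific match (endpoint plus secret) before the
    // less specific one (endpoint only).
    public static ManagedIdentitySource SecretFirst(IReadOnlyDictionary<string, string> env)
    {
        if (Has(env, "MSI_ENDPOINT") && Has(env, "MSI_SECRET"))
            return ManagedIdentitySource.SecretHeaderEndpoint; // request would carry the secret header
        if (Has(env, "MSI_ENDPOINT"))
            return ManagedIdentitySource.CloudShell;
        return ManagedIdentitySource.None;
    }

    // Checks the endpoint-only match first, so it shadows the more specific one.
    public static ManagedIdentitySource CloudShellFirst(IReadOnlyDictionary<string, string> env)
    {
        if (Has(env, "MSI_ENDPOINT"))
            return ManagedIdentitySource.CloudShell; // MSI_SECRET is never consulted
        if (Has(env, "MSI_ENDPOINT") && Has(env, "MSI_SECRET"))
            return ManagedIdentitySource.SecretHeaderEndpoint; // never reached
        return ManagedIdentitySource.None;
    }
}
```

With both variables set, only the first ordering selects the source that would send the secret, which is consistent with the 403 "Invalid secret token header" response in the 1.13.1 log.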

lenniehwtw-new commented 5 hours ago

Issue raised: MSAL Issue 4984