Azure / azure-sdk-for-net

This repository is for active development of the Azure SDK for .NET. For consumers of the SDK we recommend visiting our public developer docs at https://learn.microsoft.com/dotnet/azure/ or our versioned developer docs at https://azure.github.io/azure-sdk-for-net.

[BUG] Azure.Identity {"error":"invalid_request","error_description":"Identity not found"} in Azure DevOps Pipeline after running 1 hour #46308

Open 4865783a5d opened 1 month ago

4865783a5d commented 1 month ago

Library name and version

Azure.Identity 1.12.0 Microsoft.Data.SqlClient 5.1.5

Describe the bug

We have a long-running Azure DevOps pipeline that imports data into an Azure SQL instance. After the pipeline has run for 1 hour, an access token refresh is attempted, and it fails.

Expected behavior

A new access token is provisioned when the existing one expires.

Actual behavior

Microsoft.Data.SqlClient.SqlException (0x80131904): DefaultAzureCredential failed to retrieve a token from the included credentials. See the troubleshooting guide for more information. https://aka.ms/azsdk/net/identity/defaultazurecredential/troubleshoot
      - EnvironmentCredential authentication unavailable. Environment variables are not fully configured. See the troubleshooting guide for more information. https://aka.ms/azsdk/net/identity/environmentcredential/troubleshoot
      - WorkloadIdentityCredential authentication unavailable. The workload options are not fully configured. See the troubleshooting guide for more information. https://aka.ms/azsdk/net/identity/workloadidentitycredential/troubleshoot
      - ManagedIdentityCredential authentication unavailable. The requested identity has not been assigned to this resource.
      Status: 400 (Bad Request)

      Content:
      {"error":"invalid_request","error_description":"Identity not found"}

      Headers:
      Server: IMDS/150.870.65.1475
      x-ms-request-id: xxxxxxxxxx
      Date: Wed, 25 Sep 2024 14:28:43 GMT
      Content-Type: application/json; charset=utf-8
      Content-Length: 68
 at Azure.Identity.ImdsManagedIdentitySource.HandleResponseAsync(Boolean async, TokenRequestContext context, Response response, CancellationToken cancellationToken)
         at Azure.Identity.ManagedIdentitySource.AuthenticateAsync(Boolean async, TokenRequestContext context, CancellationToken cancellationToken)
         at Azure.Identity.ImdsManagedIdentitySource.AuthenticateAsync(Boolean async, TokenRequestContext context, CancellationToken cancellationToken)
         at Azure.Identity.ManagedIdentityClient.AuthenticateCoreAsync(Boolean async, TokenRequestContext context, CancellationToken cancellationToken)
         at Azure.Identity.ManagedIdentityClient.AppTokenProviderImpl(AppTokenProviderParameters parameters)
         at Microsoft.Identity.Client.Internal.Requests.ClientCredentialRequest.SendTokenRequestToAppTokenProviderAsync(ILoggerAdapter logger, CancellationToken cancellationToken)
         at Microsoft.Identity.Client.Internal.Requests.ClientCredentialRequest.GetAccessTokenAsync(CancellationToken cancellationToken, ILoggerAdapter logger)
         at Microsoft.Identity.Client.Internal.Requests.ClientCredentialRequest.ExecuteAsync(CancellationToken cancellationToken)
         at Microsoft.Identity.Client.Internal.Requests.RequestBase.RunAsync(CancellationToken cancellationToken)
         at Microsoft.Identity.Client.ApiConfig.Executors.ConfidentialClientExecutor.ExecuteAsync(AcquireTokenCommonParameters commonParameters, AcquireTokenForClientParameters clientParameters, CancellationToken cancellationToken)
         at Azure.Identity.AbstractAcquireTokenParameterBuilderExtensions.ExecuteAsync[T](AbstractAcquireTokenParameterBuilder`1 builder, Boolean async, CancellationToken cancellationToken)
         at Azure.Identity.MsalConfidentialClient.AcquireTokenForClientCoreAsync(String[] scopes, String tenantId, Boolean enableCae, Boolean async, CancellationToken cancellationToken)
         at Azure.Identity.MsalConfidentialClient.AcquireTokenForClientAsync(String[] scopes, String tenantId, Boolean enableCae, Boolean async, CancellationToken cancellationToken)
         at Azure.Identity.ManagedIdentityClient.AuthenticateAsync(Boolean async, TokenRequestContext context, CancellationToken cancellationToken)
         at Azure.Identity.ManagedIdentityCredential.GetTokenImplAsync(Boolean async, TokenRequestContext requestContext, CancellationToken cancellationToken)
         at Azure.Identity.CredentialDiagnosticScope.FailWrapAndThrow(Exception ex, String additionalMessage, Boolean isCredentialUnavailable)
         at Azure.Identity.ManagedIdentityCredential.GetTokenImplAsync(Boolean async, TokenRequestContext requestContext, CancellationToken cancellationToken)
         at Azure.Identity.ManagedIdentityCredential.GetTokenAsync(TokenRequestContext requestContext, CancellationToken cancellationToken)
         at Azure.Identity.DefaultAzureCredential.GetTokenFromSourcesAsync(TokenCredential[] sources, TokenRequestContext requestContext, Boolean async, CancellationToken cancellationToken)

Reproduction Steps

      - task: AzureCLI@2
        displayName: "Start Import"
        env:
          ASPNETCORE_ENVIRONMENT: ${{ parameters.aspNetCoreEnvironment }}
        inputs:
          scriptType: "pscore"
          azureSubscription: scn-${{ parameters.resourceGroup }}-arm # ARM service connection using workload identity federation with OpenID Connect
          scriptLocation: "inlineScript"
          powerShellErrorActionPreference: "stop"
          failOnStandardError: true
          workingDirectory: "$(Pipeline.Workspace)/dotnet/Import/"
          inlineScript: |
            dotnet Import.dll ${{ parameters.command }} --source "${{ parameters.source }}"

Connection String:

Server=tcp:failovergroup-xxxx.database.windows.net,1433;Initial Catalog=db;Encrypt=True;TrustServerCertificate=False;Connection Timeout=30;Authentication=\"Active Directory Default\";

Environment

Self-Hosted Build Agent, Ubuntu 20.04

Current agent version: '3.244.1'
Current image version: 'dev'
Agent running as: 'AzDevOps'
Prepare build directory.
Set build variables.
Download all required tasks.
Downloading task: DownloadBuildArtifacts (1.220.0)
Downloading task: ExtractFiles (1.245.1)
Downloading task: UseDotNet (2.245.1)
Downloading task: AzureCLI (2.245.5)
Checking job knob settings.
   Knob: DockerActionRetries = true Source: $(VSTSAGENT_DOCKER_ACTION_RETRIES) 
   Knob: AgentToolsDirectory = /opt/hostedtoolcache Source: ${AGENT_TOOLSDIRECTORY} 
   Knob: UseGitLongPaths = true Source: $(USE_GIT_LONG_PATHS) 
   Knob: EnableIssueSourceValidation = true Source: $(ENABLE_ISSUE_SOURCE_VALIDATION) 
   Knob: AgentEnablePipelineArtifactLargeChunkSize = true Source: $(AGENT_ENABLE_PIPELINEARTIFACT_LARGE_CHUNK_SIZE) 
   Knob: ContinueAfterCancelProcessTreeKillAttempt = true Source: $(VSTSAGENT_CONTINUE_AFTER_CANCEL_PROCESSTREEKILL_ATTEMPT) 
   Knob: ProcessHandlerSecureArguments = false Source: $(AZP_75787_ENABLE_NEW_LOGIC) 
   Knob: ProcessHandlerSecureArguments = false Source: $(AZP_75787_ENABLE_NEW_LOGIC_LOG) 
   Knob: ProcessHandlerTelemetry = true Source: $(AZP_75787_ENABLE_COLLECT) 
   Knob: UseNewNodeHandlerTelemetry = True Source: $(DistributedTask.Agent.USENEWNODEHANDLERTELEMETRY) 
   Knob: ProcessHandlerEnableNewLogic = true Source: $(AZP_75787_ENABLE_NEW_PH_LOGIC) 
   Knob: EnableResourceMonitorDebugOutput = true Source: $(AZP_ENABLE_RESOURCE_MONITOR_DEBUG_OUTPUT) 
   Knob: IgnoreVSTSTaskLib = true Source: $(AZP_AGENT_IGNORE_VSTSTASKLIB) 
   Knob: FailJobWhenAgentDies = true Source: $(FAIL_JOB_WHEN_AGENT_DIES) 
   Knob: CheckForTaskDeprecation = true Source: $(AZP_AGENT_CHECK_FOR_TASK_DEPRECATION) 
   Knob: LogTaskNameInUserAgent = true Source: $(AZP_AGENT_LOG_TASKNAME_IN_USERAGENT) 
   Knob: UseFetchFilterInCheckoutTask = true Source: $(AGENT_USE_FETCH_FILTER_IN_CHECKOUT_TASK) 
   Knob: Rosetta2Warning = true Source: $(ROSETTA2_WARNING) 
Finished checking job knob settings.
Start tracking orphan processes

github-actions[bot] commented 1 month ago

Thank you for your feedback. Tagging and routing to the team member best able to assist.

christothes commented 1 month ago

Hi @4865783a5d - Does the SQL Client ever successfully fetch a credential? Is the DevOps pipeline running on a default host or a custom host VM that you own with managed identity configured? Which credential were you expecting to be selected by DefaultAzureCredential?

github-actions[bot] commented 1 month ago

Hi @4865783a5d. Thank you for opening this issue and giving us the opportunity to assist. To help our team better understand your issue and the details of your scenario please provide a response to the question asked above or the information requested above. This will help us more accurately address your issue.

4865783a5d commented 1 month ago

Hi @christothes, thanks for your reply.

Yes, an initial token can be fetched, and we can successfully access the db for the lifetime of that initial access token. When Azure.Identity attempts to renew the token, it fails.

It's a self-hosted VM Scale Set running an Azure DevOps pipeline task. See the sample task posted above; all our builds use the Service Connection identity (which is a federated auth ARM connection).

We run over 200 such build tasks (e.g. db migrations with federated auth service connections) on our private infrastructure with no issue, as long as the task completes within 1 hour.

I'm happy to provide build logs over a private channel if that helps.

christothes commented 1 month ago

Unfortunately, configuring DefaultAzureCredential through the connection string doesn't make it easy to enable our logging. Are you able to use it via the new AccessTokenCallback feature instead?

This would allow us to enable logging.
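
For reference, a minimal sketch of what that could look like in a console app. It assumes a Microsoft.Data.SqlClient version that exposes `SqlConnection.AccessTokenCallback` (newer than the 5.1.5 listed above), a connection string without the `Authentication` keyword, and a placeholder server name. The console listener comes from `Azure.Core.Diagnostics`. Treat it as a starting point, not a verified repro:

```csharp
using System;
using System.Diagnostics.Tracing;
using System.Threading.Tasks;
using Azure.Core;
using Azure.Core.Diagnostics;
using Azure.Identity;
using Microsoft.Data.SqlClient;

class Program
{
    static async Task Main()
    {
        // Write Azure SDK events (including each credential attempt made by
        // DefaultAzureCredential) to the console for the lifetime of the listener.
        using AzureEventSourceListener listener =
            AzureEventSourceListener.CreateConsoleLogger(EventLevel.Verbose);

        // Create the credential once so the same instance serves every token request.
        var credential = new DefaultAzureCredential();

        // Placeholder connection string (assumption): the Authentication keyword is
        // left out because the callback below supplies the token instead.
        string connectionString =
            "Server=tcp:<your-server>.database.windows.net,1433;Initial Catalog=db;" +
            "Encrypt=True;TrustServerCertificate=False;Connection Timeout=30;";

        using var connection = new SqlConnection(connectionString)
        {
            AccessTokenCallback = async (authParams, cancellationToken) =>
            {
                // SqlClient passes the resource it wants a token for; Azure.Identity
                // expects a scope, so append "/.default" when it is missing.
                string scope = authParams.Resource.EndsWith("/.default", StringComparison.Ordinal)
                    ? authParams.Resource
                    : authParams.Resource + "/.default";

                AccessToken token = await credential.GetTokenAsync(
                    new TokenRequestContext(new[] { scope }), cancellationToken);

                return new SqlAuthenticationToken(token.Token, token.ExpiresOn);
            }
        };

        await connection.OpenAsync();
        Console.WriteLine($"Connected, server version {connection.ServerVersion}");
    }
}
```

Running this inside the same AzureCLI@2 task for longer than an hour should show in the console output which credential in the chain supplies the first token and which one is attempted at refresh time.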

4865783a5d commented 1 month ago

I'll try to set up a minimal reproducible sample with a simple C# console app and the callback method. I'll get back to you with further details.

4865783a5d commented 1 month ago

I'm on vacation, will provide a sample next week - sorry about the delay.
