microsoft / azure-pipelines-agent

Azure Pipelines Agent 🚀
MIT License

[BUG]: Azure pipeline credentials expire unexpectedly before job's maximum duration #4736

Open fabrideci opened 3 months ago

fabrideci commented 3 months ago

What happened?

Our Azure DevOps YAML pipeline utilizes an ARM service connection with workload identity federation via OpenID Connect for Azure access. The pipeline operates smoothly until approximately the 55th minute mark, at which point the environment credentials seem to vanish, as indicated by the error output below. Although we recognize the default 60-minute job timeout, it’s unclear if this issue is related to that timeout or if it’s associated with the OIDC token’s lifecycle.

EDIT: actually, since we've bought 3 Microsoft-hosted parallel jobs, according to your documentation each job should have the capacity to run for up to 360 minutes (6 hours). Hence, it's definitely an issue if the underlying environment credentials vanish just after ~60 minutes by default.

Versions

3.236.1 / Ubuntu

Environment type (Please select at least one environment where you face this issue)

Azure DevOps Server type

dev.azure.com (formerly visualstudio.com)

Azure DevOps Server Version (if applicable)

No response

Operating system

Ubuntu 22.04.4 LTS

Version control system

No response

Relevant log output

AggregateAuthenticationError: ChainedTokenCredential authentication failed.
CredentialUnavailableError: EnvironmentCredential is unavailable. No underlying credential could be used. To troubleshoot, visit https://aka.ms/azsdk/js/identity/environmentcredential/troubleshoot.
CredentialUnavailableError: WorkloadIdentityCredential: is unavailable. tenantId, clientId, and federatedTokenFilePath are required parameters. 
      In DefaultAzureCredential and ManagedIdentityCredential, these can be provided as environment variables - 
      "AZURE_TENANT_ID",
      "AZURE_CLIENT_ID",
      "AZURE_FEDERATED_TOKEN_FILE". See the troubleshooting guide for more information: https://aka.ms/azsdk/js/identity/workloadidentitycredential/troubleshoot  
CredentialUnavailableError: ManagedIdentityCredential: The managed identity endpoint is indicating there's no available identity. Message: invalid_request Status code: 400
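To tell apart "the variables disappeared" from "the token behind them expired", a check along these lines could be logged at the start of the job and again shortly before the failure window. This is only a minimal sketch: `missingWorkloadIdentityVars` and `describeCredentialEnv` are hypothetical helpers, not part of `@azure/identity` or the agent, and the only names taken from the log above are the three environment variables listed in the `WorkloadIdentityCredential` error.

```typescript
// Hypothetical diagnostic helpers (assumption: not part of @azure/identity
// or the pipelines agent). The variable names are the ones listed in the
// WorkloadIdentityCredential error message above.
const REQUIRED_VARS = [
  "AZURE_TENANT_ID",
  "AZURE_CLIENT_ID",
  "AZURE_FEDERATED_TOKEN_FILE",
] as const;

type Env = Record<string, string | undefined>;

// Returns the names of the required variables that are unset or blank.
function missingWorkloadIdentityVars(env: Env): string[] {
  return REQUIRED_VARS.filter((name) => {
    const value = env[name];
    return value === undefined || value.trim() === "";
  });
}

// Builds a one-line report tagged with the minutes elapsed since job start.
function describeCredentialEnv(env: Env, elapsedMinutes: number): string {
  const missing = missingWorkloadIdentityVars(env);
  const status =
    missing.length === 0 ? "all set" : `missing: ${missing.join(", ")}`;
  return `[t+${elapsedMinutes}min] workload identity variables ${status}`;
}
```

In a Node.js step this would be called with `process.env`, for example `console.log(describeCredentialEnv(process.env, 55))`. If the variables are still reported as set when the error appears, the token lifetime would be the more likely suspect than the environment itself.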

fabrideci commented 3 months ago

@vmapetr

UPDATE: actually, since we've bought 3 Microsoft-hosted parallel jobs, according to your documentation each job should have the capacity to run for up to 360 minutes (6 hours). Hence, it's definitely an issue if the underlying environment credentials vanish just after ~60 minutes by default.

vmapetr commented 3 months ago

Hi @fabrideci, thanks for reporting! We are working on higher-priority issues at the moment, but will get back to this one soon.