Open jiasli opened 2 months ago
refresh OIDC token is a feature
Different external identity providers (IdP) have different ways of retrieving the ID token:
- GitHub Actions exposes `ACTIONS_ID_TOKEN_REQUEST_URL` and `ACTIONS_ID_TOKEN_REQUEST_TOKEN` and requires a GET HTTP request: https://docs.github.com/en/actions/deployment/security-hardening-your-deployments/configuring-openid-connect-in-cloud-providers#requesting-the-jwt-using-environment-variables
- Azure DevOps requires a POST HTTP request: https://learn.microsoft.com/en-us/rest/api/azure/devops/distributedtask/oidctoken/create

I had a discussion with the MSAL team today and proposed 2 possible callback interfaces:
- An executable, for example `getidtoken`, that returns an ID token in stdout. Then, instead of providing `--federated-token <ID token>` to `az login`, users provide `--federated-token-callback getidtoken` to `az login`, so that the CLI and MSAL can actively retrieve an ID token with `getidtoken` when the ID token expires. This is very similar to how Azure Identity's `AzureCliCredential` retrieves access tokens from Azure CLI by subprocessing `az account get-access-token`.
- ... `ID_TOKEN_REQUEST_URL`.
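As a rough illustration of the first interface on GitHub Actions, a `getidtoken` executable could look like the sketch below. Note that `--federated-token-callback` is only a proposed flag here, not an existing `az login` option. The helper names `build_oidc_url` and `get_id_token` are hypothetical, and the audience value is the one used in the workflow snippets in this thread.

```shell
#!/usr/bin/env bash
# Hypothetical getidtoken helper for GitHub Actions (a sketch, not an official tool).
# Assumes the job has `permissions: id-token: write` and that curl and jq are available.

# Append the audience to the request URL. The URL GitHub provides already
# carries a query string, so '&' is used as the separator.
build_oidc_url() {
  local base_url="$1"
  local audience="${2:-api://AzureADTokenExchange}"
  printf '%s&audience=%s' "$base_url" "$audience"
}

# Issue the GET request and print only the ID token (the "value" field) to stdout.
get_id_token() {
  curl -sS -H "Authorization: bearer ${ACTIONS_ID_TOKEN_REQUEST_TOKEN}" \
    "$(build_oidc_url "${ACTIONS_ID_TOKEN_REQUEST_URL}")" | jq -r '.value'
}
```

Calling `get_id_token` inside a job prints a fresh ID token on each invocation, which is the behavior a callback-style interface would rely on.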
[!WARNING] This mitigation doesn't work with Azure CLI 2.59.0. See https://github.com/Azure/azure-cli/issues/28708#issuecomment-2049400226.
ID token: |----| 10 min
Access token 1: |------------------------| 60 min
Access token 2: | 20 min: ERROR: ID token expired
An ID token lasts for 5 minutes on GitHub Actions and 10 minutes on Azure DevOps, but an access token lasts for 60 minutes.
When you run `az login`, Azure CLI only acquires access tokens for ARM, using `https://management.core.windows.net//.default` as the scope.
After the ID token expires, acquiring an access token for another scope, for example with
az account get-access-token --scope https://kusto.kusto.windows.net//.default
makes Azure CLI/MSAL try to get a new access token with the ID token, because the token cache holds no access token for that scope yet. However, as the ID token has expired, the command fails with `AADSTS700024`.
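The failure mode above can be modeled in a few lines of shell. This is a simplified sketch and not how MSAL is implemented: the function name `token_outcome` is hypothetical, and the lifetimes are the ones quoted above (10 minutes for an Azure DevOps ID token, 60 minutes for an access token).

```shell
#!/usr/bin/env bash
# Simplified model of the behavior described above (not MSAL's actual logic).
# $1: minutes elapsed since `az login`
# $2: "cached" if an access token for the scope is already in the token cache
#     (for example the ARM token acquired by `az login`)
# $3: ID token lifetime in minutes (default 10, as on Azure DevOps)
token_outcome() {
  local elapsed="$1" cached="$2" id_lifetime="${3:-10}"
  local access_lifetime=60
  if [ "$cached" = "cached" ] && [ "$elapsed" -lt "$access_lifetime" ]; then
    echo "ok: access token served from the token cache"
  elif [ "$elapsed" -lt "$id_lifetime" ]; then
    echo "ok: new access token acquired with the ID token"
  else
    echo "error: AADSTS700024 (ID token expired)"
  fi
}

token_outcome 5 uncached    # still within the ID token lifetime
token_outcome 20 uncached   # the failing case in the timeline above
token_outcome 20 cached     # the token cache already holds this scope
```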
So, the mitigation is pretty straightforward: Acquire all access tokens before the ID token expires.
You have to know which scopes are used in your pipeline task and call `az account get-access-token --scope ...` immediately after `az login`. This makes Azure CLI/MSAL acquire access tokens for the specified scopes while the ID token is still valid and save them in the token cache.
For example:
az account get-access-token --scope https://storage.azure.com/.default --output none
az account get-access-token --scope https://vault.azure.net/.default --output none
az account get-access-token --scope https://graph.microsoft.com//.default --output none
az account get-access-token --scope https://kusto.kusto.windows.net//.default --output none
[!WARNING] Even though GitHub Actions can mask the access token as `***` in the output of `az account get-access-token`:
+ az account get-access-token
***
  "accessToken": "***",
  "expiresOn": "2024-04-10 14:11:25.000000",
  "expires_on": 1712758285,
  "subscription": "...",
  "tenant": "...",
  "tokenType": "Bearer"
***
You MUST specify `--output none` to make sure no access token is printed to any of your logs.
Subsequent commands using these scopes will then use the access tokens saved in the token cache, so they won't fail after the ID token expires. They will still fail after the access token expires (60 minutes).
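The pre-fetch step can be wrapped in a small helper that stops at the first scope that fails. This is only a sketch: `prefetch_tokens` is a hypothetical name, the scope list passed to it is up to the pipeline, and the helper is subject to the same caveat about Azure CLI 2.59.0 as the mitigation itself.

```shell
#!/usr/bin/env bash
# Pre-fetch access tokens for every scope the job needs, right after `az login`.
# prefetch_tokens is a hypothetical helper name used for this sketch.
prefetch_tokens() {
  local scope
  for scope in "$@"; do
    # --output none keeps the access token out of the logs.
    if ! az account get-access-token --scope "$scope" --output none; then
      echo "failed to acquire a token for ${scope}" >&2
      return 1
    fi
  done
}
```

A call such as `prefetch_tokens https://storage.azure.com/.default https://vault.azure.net/.default` would then replace the individual commands listed above.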
I tried the provided mitigation, but the issue persists. Maybe I'm doing something wrong? My workflow contains actions that run Node.js tests in which I verify connections to Service Bus. As OIDC is used, I log in to Azure with the azure/login@v2 action:
- name: Azure login
  uses: azure/login@v2
  with:
    client-id: ${{ env.AZURE_CLIENT_ID }}
    tenant-id: ${{ env.AZURE_TENANT_ID }}
    subscription-id: ${{ env.AZURE_SUBSCRIPTION_ID }}
    enable-AzPSSession: false
After that I added a step to mitigate the issue:
- name: Azure get token
  uses: azure/cli@v2
  with:
    inlineScript: |
      az account get-access-token --scope https://storage.azure.com/.default --output none
      az account get-access-token --scope https://servicebus.azure.net/.default
But after ~10 minutes I'm still getting:
AggregateAuthenticationError: ChainedTokenCredential authentication failed.
CredentialUnavailableError: Please run 'az login' from a command prompt to authenticate before using this credential.
CredentialUnavailableError: WorkloadIdentityCredential: is unavailable. tenantId, clientId, and federatedTokenFilePath are required parameters.
In DefaultAzureCredential and ManagedIdentityCredential, these can be provided as environment variables -
"AZURE_TENANT_ID",
"AZURE_CLIENT_ID",
"AZURE_FEDERATED_TOKEN_FILE". See the troubleshooting guide for more information: https://aka.ms/azsdk/js/identity/workloadidentitycredential/troubleshoot
Did I miss something? I use https://www.npmjs.com/package/@azure/service-bus
Thanks for the mitigation @jiasli.
However, I don't think I'm hitting the issue where the Azure CLI tries to acquire an access token for a different audience after the ID token has expired.
I'm fairly confident that the `az` commands I use only use the access token for ARM:
az account set
az deployment sub create
az deployment sub show
az webapp deployment slot swap
az webapp deployment source config-zip
az webapp start
az webapp stop
The general flow is:
The time it takes to swap slots varies greatly, however more than 5 minutes have always elapsed by the time it's done.
Now, what is strange is that stopping the slot sometimes works and sometimes doesn't, depending on how much time has passed since we ran `azure/login`.
To me, it sounds like the access token expires "quicker" than before. Could that be?
Edit: I checked across many workflow runs, and to me it looks like the access token expires after 10 minutes.
@Kapsztajn, I can successfully get an access token for `https://servicebus.azure.net/.default` locally, and it lasts for 4600s.
> az account get-access-token --scope https://servicebus.azure.net/.default
{
  "accessToken": "...",
  "expiresOn": "2024-04-11 13:57:35.000000",
  "expires_on": 1712815055,
  "subscription": "0b1f6471-1bf0-4dda-aec3-cb9272f09590",
  "tenant": "54826b22-38d6-4fb2-bad9-b7b93a3e9c5a",
  "tokenType": "Bearer"
}
Decoded claims:
"iat": 1712810455,
"nbf": 1712810455,
"exp": 1712815055,
I am not entirely sure why this line is printed:
CredentialUnavailableError: Please run 'az login' from a command prompt to authenticate before using this credential.
The Azure Service Bus client library for JavaScript also didn't fail with `AADSTS700024`. I am not an expert in that SDK. Is it possible to collect more details on which scope the SDK requests, and why it fails with that error?
@mderriey, this seems odd as all these operations are indeed ARM operations. Could you check the actual expiration time of the access token issued for ARM?
> az account get-access-token --scope https://management.core.windows.net//.default --query expiresOn --output tsv
2024-04-11 13:47:47.000000
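The 4600s lifetime quoted earlier in this comment follows directly from the decoded `iat` and `exp` claims. The sketch below shows the arithmetic; the helper names `token_lifetime` and `seconds_left` are hypothetical.

```shell
#!/usr/bin/env bash
# Lifetime of a token in seconds, from its iat (issued at) and exp (expiry) claims.
token_lifetime() {
  echo $(( $2 - $1 ))
}

# Seconds left before expiry; the second argument defaults to the current time.
seconds_left() {
  local exp="$1" now="${2:-$(date +%s)}"
  echo $(( exp - now ))
}

token_lifetime 1712810455 1712815055   # prints 4600
```

Assuming the CLI version in use returns the `expires_on` field shown in the output above, `seconds_left "$(az account get-access-token --query expires_on --output tsv)"` would report how long the cached ARM token remains valid.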
Hi @Kapsztajn, the suggested mitigation did not work for me either. It fetched a token with a reasonable expiry, but I saw the same error once the OIDC token expired after 5 minutes.
I propose a workaround: fetch the OIDC token every 4 minutes to avoid the expiry. I got this working by inserting the following step in my workflow just before the step where the token expiry issue occurred:
- name: Fetch OID token every 4 mins
  run: |
    while true; do
      token_request=$ACTIONS_ID_TOKEN_REQUEST_TOKEN
      token_uri=$ACTIONS_ID_TOKEN_REQUEST_URL
      token=$(curl -H "Authorization: bearer $token_request" "${token_uri}&audience=api://AzureADTokenExchange" | jq .value -r)
      az login --service-principal -u ${{ secrets.CLIENT_ID }} -t ${{ secrets.TENANT_ID }} --federated-token "$token" --output none
      # Sleep for 4 minutes
      sleep 240
    done &
Could you try this out and see if this works for you as well?
Good suggestion @jiasli , thanks.
Here's what I ran:
steps:
  - name: Login to Azure
    uses: azure/login@v2
    with:
      client-id: ${{ env.oidcAppRegistrationClientId }}
      tenant-id: ${{ env.azureTenantId }}
      allow-no-subscriptions: true
      enable-AzPSSession: true
  - name: Check token expiry
    shell: bash
    run: |
      echo "Current date: $(date '+%Y-%m-%dT%H:%M:%S')"
      echo "Token expiration: $(az account get-access-token --resource-type arm --query expiresOn --output tsv --debug)"
      echo "Token #2 expiration: $(az account get-access-token --resource-type arm --query expiresOn --output tsv --debug)"
And the output (debug output omitted):
Current date: 2024-04-11T06:57:14
Token expiration: 2024-04-11 07:57:14.000000
Token #2 expiration: 2024-04-11 07:57:14.000000
So the token is valid for 1 hour.
And both calls to `az account get-access-token` show this in the debug output, which I think confirms that the ARM token is cached and was originally acquired during `az login`:
DEBUG: msal.token_cache: event={
    "client_id": "***",
    "data": {
        "claims": "{\"access_token\": {\"xms_cc\": {\"values\": [\"CP1\"]}}}",
        "scope": [
            "https://management.core.windows.net//.default"
        ]
    },
    "environment": "login.microsoftonline.com",
    "grant_type": "client_credentials",
    "params": null,
    "response": {
        "access_token": "********",
        "expires_in": 3599,
        "ext_expires_in": 3599,
        "token_type": "Bearer"
    },
    "scope": [
        "https://management.core.windows.net//.default"
    ],
    "token_endpoint": "https://login.microsoftonline.com/<redacted>/oauth2/v2.0/token"
}
I'm not sure what happens, then...
I'll try removing the extra `azure/login` steps when I get some more time to see if the issue disappears.
Thanks again, let me know if I can perform some more testing if anything comes to mind. If you'd be interested in the debug output, I could send that privately.
Apologies for the confusion caused.
As I tested today, the mitigation I provided in https://github.com/Azure/azure-cli/issues/28708#issuecomment-2047256166 stopped working for Azure CLI 2.59.0, because of an MSAL regression introduced in 1.27.0 (https://github.com/AzureAD/microsoft-authentication-extensions-for-python/issues/127, https://github.com/AzureAD/microsoft-authentication-library-for-python/pull/644) which is adopted by Azure CLI 2.59.0 (https://github.com/Azure/azure-cli/pull/28556).
This regression makes MSAL's `ConfidentialClientApplication` bypass `msal_extensions.token_cache.PersistedTokenCache`, so access tokens are no longer retrieved from the token cache. Instead, every command now retrieves a new access token from the AAD Security Token Service (STS). In fact, not only does the mitigation stop working, but even ARM commands fail with `AADSTS700024` after the ID token expires.
I will work with MSAL on this issue with high priority.
For now, please keep using service principal secret for authentication to get unblocked: https://github.com/marketplace/actions/azure-login#login-with-a-service-principal-secret
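To find out whether a job runs the affected release, the installed version can be compared against 2.59.0. This is a minimal sketch: the function names are hypothetical, and it only checks for the single version reported as affected in this thread.

```shell
#!/usr/bin/env bash
# Returns success (0) when the given Azure CLI version is the one reported
# as affected in this thread (2.59.0).
is_affected_version() {
  [ "$1" = "2.59.0" ]
}

# Read the installed version; `az version` prints JSON with an "azure-cli" key.
installed_az_version() {
  az version --query '"azure-cli"' --output tsv
}
```

A pipeline step could then run `if is_affected_version "$(installed_az_version)"; then echo "Azure CLI 2.59.0 detected"; fi` before any long-running work.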
My question is why this has popped up as an issue recently. We've had pipelines run for well over 20 minutes before and never seen this. But within the last week, it seems any workflow using Azure CLI with OIDC federated auth is experiencing this issue.
@iamrk04 It looks like your solution works: I managed to run the tests normally (the pipeline ran for over 16 minutes). I added the code you provided between the Azure login and the component tests:
- name: Azure login
  uses: azure/login@v2
  with:
    client-id: ${{ env.AZURE_CLIENT_ID }}
    tenant-id: ${{ env.AZURE_TENANT_ID }}
    subscription-id: ${{ env.AZURE_SUBSCRIPTION_ID }}
    enable-AzPSSession: false
- name: Fetch OID token every 4 mins
  shell: bash
  run: |
    while true; do
      token_request=$ACTIONS_ID_TOKEN_REQUEST_TOKEN
      token_uri=$ACTIONS_ID_TOKEN_REQUEST_URL
      token=$(curl -H "Authorization: bearer $token_request" "${token_uri}&audience=api://AzureADTokenExchange" | jq .value -r)
      az login --service-principal -u ${{ env.AZURE_CLIENT_ID }} -t ${{ env.AZURE_TENANT_ID }} --federated-token "$token" --output none
      # Sleep for 4 minutes
      sleep 240
    done &
- name: 'Run tests'
  shell: bash
  ...
I had to add `shell: bash` because without it I got errors about a missing shell.
@smokedlinq, in my case, it's due to a new version of the GitHub-hosted runner image for `ubuntu-latest`, which ships Azure CLI 2.59.0 instead of the 2.58.0 in the previous image.
The image went from `20240324.2.0` to `20240407.1.0`.
You can see which image your run uses in the "Set up job" step at the very top.
@mderriey I assumed something like that, I was more referring to how that broke inside of `az`.
Hey @iamrk04, you're a hero! I inserted this snippet into my workflow, and this made it all work. Great idea to just have that run in the background in a shell loop.
For reference: https://github.com/microsoft/hi-ml/pull/925/
Thanks @iamrk04 , this worked for me as well.
The suggestion from @iamrk04 also worked for me. I wrapped it in a GitHub action that can potentially replace `azure/login`. I think the solution will even remove the 1 hour limit we had before, but I have not tested this yet.
name: Azure Federated Login
inputs:
  client-id:
    description: Azure client id
    type: string
  tenant-id:
    description: Azure tenant id
    type: string
  subscription-id:
    description: Azure subscription id
    type: string
    default: none
  refresh-interval-seconds:
    description: Refresh interval in seconds
    type: number
    default: 240
runs:
  using: "composite"
  steps:
    - name: Fetch OID token every ${{ inputs.refresh-interval-seconds }} seconds
      shell: bash
      run: |
        first_time=true
        while true; do
          token=$(curl -s -H "Authorization: bearer ${ACTIONS_ID_TOKEN_REQUEST_TOKEN}" "${ACTIONS_ID_TOKEN_REQUEST_URL}&audience=api://AzureADTokenExchange" | jq .value -r)
          az login --service-principal -u ${{ inputs.client-id }} -t ${{ inputs.tenant-id }} --federated-token "$token" --output none
          if [ "$first_time" = true ] && [ "${{ inputs.subscription-id }}" != "none" ]; then
            az account set -s ${{ inputs.subscription-id }}
            first_time=false
          fi
          sleep ${{ inputs.refresh-interval-seconds }}
        done &
I'm running into the same issue in Azure Devops for a pipeline that runs a long python script (2h40m) in an AzureCLI@2 task. Was working fine on Friday (April 5th) but started failing after that with error:
AzureCliCredential: ERROR: AADSTS700024: Client assertion is not within its valid time range. ...
Any ideas on whether an equivalent workaround is possible for Azure Devops to refresh the token every 9 minutes?
We started having problems with the v2.59.0 az cli and rolled back as a workaround. I'm not sure what about the cli release makes this more/less likely to hit this.
@smokedlinq, please refer to my comment https://github.com/Azure/azure-cli/issues/28708#issuecomment-2049400226.
I propose a workaround by fetching the OID token every 4 mins to avoid the expiry.
This workaround https://github.com/Azure/azure-cli/issues/28708#issuecomment-2049014471 proposed by @iamrk04 of periodically calling `az login` is not recommended, as Azure CLI doesn't support concurrent execution and you will very likely run into race conditions (https://github.com/Azure/azure-cli/issues/9427, https://github.com/Azure/azure-cli/issues/20273).
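For readers who still run `az` from more than one place in a job, one way to reduce the overlap is to route every invocation through a lock. The sketch below is an illustration only and is not endorsed anywhere in this thread: `az_locked` and `AZ_LOCK_DIR` are hypothetical names, it only helps if every `az` call in the job goes through the wrapper (SDK credentials that spawn `az` themselves bypass it), and a killed process can leave a stale lock directory behind.

```shell
#!/usr/bin/env bash
# Serialize az invocations with a lock directory (mkdir is atomic).
# AZ_LOCK_DIR and az_locked are hypothetical names used for this sketch.
AZ_LOCK_DIR="${AZ_LOCK_DIR:-${TMPDIR:-/tmp}/az-cli.lock}"

az_locked() {
  local rc
  # Wait until no other az_locked call holds the lock.
  until mkdir "$AZ_LOCK_DIR" 2>/dev/null; do
    sleep 1
  done
  az "$@"
  rc=$?
  # Release the lock and pass on the exit code of az.
  rmdir "$AZ_LOCK_DIR"
  return "$rc"
}
```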
We started having problems with the v2.59.0 az cli and rolled back as a workaround.
This workaround https://github.com/Azure/azure-cli/issues/28708#issuecomment-2050804548 proposed by @dghubble of using an old version is a correct one.
As I suggested in https://github.com/Azure/azure-cli/issues/28708#issuecomment-2049400226, using service principal secret for authentication is also another acceptable workaround.
@jiasli Service principals are unacceptable for some of us as our security certification would require we rotate them on a regular basis. OIDC does not add that additional burden given that they are clearly short lived.
@andre-qumulo, we plan to fix the 5-minute expiration issue in the next version of Azure CLI which will be 2.60.0 and released on 2024-04-30. Using a service principal is only a temporary workaround. Secret rotation usually happens on a monthly basis which is far beyond the time we need to fix it.
I have created a separate issue to track it:
Thanks @jiasli! The mitigation steps for Azure DevOps provided here of using a service principal secret were effective.
(I ran into some trouble finding the organization id while following the instructions but was able to find the organization id with these steps: https://medium.com/@shivapatel1102001/get-list-of-organization-from-azure-devops-microsoft-account-861ea29dae93)
@TomWildenhain, based on my understanding, the steps provided by https://learn.microsoft.com/en-us/azure/devops/pipelines/library/connect-to-azure?view=azure-devops don't require organization ID when creating a service connection using service principal secret. Could you let me know which article you are following?
@jiasli Org id is a 1P policy.
@jiasli Thanks for your help. I was following the instructions in a banner at the top of ADO after creating the manual service connection. The banner states:
Manually created service connections use an App Registration that was created by the user. Please add a federated credential to the App Registration with the following details: Issuer: https://vstoken.dev.azure.com/<org id>, Subject identifier: sc://<org>/<project>/<sc name>. Learn more
With a link to: https://learn.microsoft.com/en-us/azure/devops/pipelines/release/configure-workload-identity?view=azure-devops
I used the instructions to call the API here to get the org id: https://medium.com/@shivapatel1102001/get-list-of-organization-from-azure-devops-microsoft-account-861ea29dae93
@TomWildenhain, thanks for the information. If you used service principal secret to create the service connection, I don't think the federated identity credential added to the app is actually used.
@jiasli Is it possible to give any realistic timeline for a fix? I am wondering if it makes sense to ask for a rollback of the cli version contained in actions/runner-images that is used by both Github Actions and Azure DevOps.
We are seeing the same issue related to moving away from service principal secrets.
We are looking into adding logic for all Az CLI calls using the ARM token to ensure it gets refreshed (but not as a background process) to get the OIDC token from idToken and reuse it to log in via `az account clear && az login ...`
Any help resolving this would be appreciated.
I have exactly the same use case as @TomWildenhain. Is there a way to make the token validity period customizable? We can't use a service principal as that's discouraged by the cred-free best practices.
Even a workaround would be much appreciated.
Have the same issue for our long-running tasks:
[01:50:31 INF] ---> (Inner Exception #3) Azure.Identity.CredentialUnavailableException: Azure CLI authentication failed due to an unknown error. See the troubleshooting guide for more information. https://aka.ms/azsdk/net/identity/azclicredential/troubleshoot ERROR: AADSTS700024: Client assertion is not within its valid time range. Current time: 2024-06-01T01:50:31.1765304Z, assertion valid from 2024-06-01T00:49:55.0000000Z, expiry time of assertion 2024-06-01T00:59:55.0000000Z. Review the documentation at https://docs.microsoft.com/azure/active-directory/develop/active-directory-certificate-credentials . Trace ID: 48af9e38-7793-458d-94af-c2962d617700 Correlation ID: 0f495332-706e-4dba-a18e-1f844f5d7a7d Timestamp: 2024-06-01 01:50:31Z
[01:50:31 INF] Interactive authentication is needed. Please run:
[01:50:31 INF] az login
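The timestamps in this error message show both how long the assertion was valid and how long it had been expired when the call was made. A small worked example follows, assuming all three timestamps fall on the same day as they do here; `hms_to_seconds` is a hypothetical helper name.

```shell
#!/usr/bin/env bash
# Convert HH:MM:SS to seconds since midnight. The 10# prefix forces base 10
# so that values such as "08" are not parsed as octal.
hms_to_seconds() {
  local h m s
  IFS=: read -r h m s <<< "$1"
  echo $(( 10#$h * 3600 + 10#$m * 60 + 10#$s ))
}

valid_from=$(hms_to_seconds 00:49:55)
expiry=$(hms_to_seconds 00:59:55)
current=$(hms_to_seconds 01:50:31)

echo "assertion lifetime: $(( expiry - valid_from )) s"   # 600 s, i.e. 10 minutes
echo "expired for: $(( current - expiry )) s"             # 3036 s, about 50 minutes
```

The 10-minute lifetime matches the Azure DevOps ID token lifetime mentioned earlier in the thread.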
@jhwj9617 refer to the solution provided by Kapsztajn and @iamrk04, it works for me too.
Although I believe this is an unnecessary workaround which has to be done by users!
@panpanwa we are not using github actions. We're using AzureDevOps in yml, e.g.
- task: AzureCLI@2
  displayName: Run load profile
  inputs:
    azureSubscription: $(federatedCredConnection)
    scriptType: ps
    scriptLocation: scriptPath
    scriptPath: $(Pipeline.Workspace)/test.ps1
@panpanwa this is the stopgap solution that was shared by a colleague we can implement in our AzureCLI task
Start-Job -Name 'RefreshOidcToken' -ScriptBlock {
    do {
        Get-ChildItem -Path Env: -Recurse -Include ENDPOINT_DATA_* `
            | Select-Object -First 1 -ExpandProperty Name `
            | ForEach-Object { $_.Split("_")[2] } `
            | Set-Variable serviceConnectionId
        $oidcRequestUrl = "${env:SYSTEM_TEAMFOUNDATIONCOLLECTIONURI}${env:SYSTEM_TEAMPROJECTID}/_apis/distributedtask/hubs/build/plans/${env:SYSTEM_PLANID}/jobs/${env:SYSTEM_JOBID}/oidctoken?api-version=7.1-preview.1&serviceConnectionId=${serviceConnectionId}"
        Invoke-RestMethod -Headers @{
            Authorization  = "Bearer $env:SYSTEM_ACCESSTOKEN"
            'Content-Type' = 'application/json'
        } -Uri "${oidcRequestUrl}" -Method Post | Set-Variable oidcTokenResponse
        $oidcToken = $oidcTokenResponse.oidcToken
        if (!$oidcToken) {
            Write-Warning "OIDC token could not be acquired. Retrying..."
            Start-Sleep -Seconds 30
            continue
        }
        az account show -o json | ConvertFrom-Json | Set-Variable account
        az login --service-principal -u $account.user.name --tenant $account.tenantId --allow-no-subscriptions --federated-token $oidcToken | Out-Null
        Start-Sleep -Seconds 480 # 8 minutes
    } while ($true)
} | Tee-Object -Variable refreshOidcTokenJob `
    | Select-Object -ExcludeProperty Command `
    | Write-Host -ForegroundColor DarkMagenta
# do long running work
Receive-Job $refreshOidcTokenJob
Stop-Job -Job $refreshOidcTokenJob
Remove-Job -Job $refreshOidcTokenJob
Also this seems to be in preview for v1.12.0-beta.2 https://github.com/Azure/azure-sdk-for-js/pull/29392
@panpanwa this is the stopgap solution that was shared by a colleague we can implement in our AzureCLI task
This might work in 99% of the cases but is not completely reliable; beware of race conditions.
Azure DevOps's documentation now also explains `AADSTS700024`: https://learn.microsoft.com/en-us/azure/devops/pipelines/release/troubleshoot-workload-identity
AADSTS700024: Client assertion is not within its valid time range
If the error happens after approximately 1 hour, use a service connection with Workload identity federation and a Managed Identity instead. Managed Identity tokens have a lifetime of around 24 hours. If the error happens before 1 hour but after 10 minutes, move commands that (implicitly) request an access token to e.g. access Azure storage to the beginning of your script. The access token will be cached for subsequent commands.
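The guidance quoted above amounts to a simple decision rule based on when the error first appears. A sketch of that rule follows; `suggest_remedy` is a hypothetical name, and the 10 and 60 minute thresholds are the approximate figures from the quoted text rather than exact limits.

```shell
#!/usr/bin/env bash
# Map the time at which AADSTS700024 appears (minutes after login) to the
# remedy quoted above. suggest_remedy is a hypothetical helper name.
suggest_remedy() {
  local minutes="$1"
  if [ "$minutes" -ge 60 ]; then
    echo "use a service connection with workload identity federation and a managed identity"
  elif [ "$minutes" -gt 10 ]; then
    echo "move commands that request access tokens to the beginning of the script"
  else
    echo "not covered by the quoted guidance"
  fi
}

suggest_remedy 20
```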
Do we have any updates on the issue? A lot of our ADO pipelines are intermittently failing and we have been asked to move away from service principals to be cred free.
The PR linked is still in draft state https://github.com/Azure/azure-cli/pull/28778
Thanks @jiasli! This works for my use case!
I got the same error in the window between 10 minutes and 1 hour that the Microsoft docs describe. As the docs suggest, we access the storage account at the beginning, but during terraform apply we cannot control this ourselves.
I'm running terraform apply; the pipeline runs for around 10 minutes and then gives the error below:
error loading state: Error retrieving keys for Storage Account "teestmgmt": autorest/Client#Do: Preparing request failed: StatusCode=0 -- Original Error: clientCredentialsToken: received HTTP status 401 with response: {"error":"invalid_client","error_description":"AADSTS700024: Client assertion is not within its valid time range. Current time: 2024-06-21T10:41:11.0510669Z, assertion valid from 2024-06-19T02:32:13.0000000Z, expiry time of assertion 2024-06-19T02:42:13.0000000Z. Review the documentation at https://docs.microsoft.com/azure/active-directory/develop/active-directory-certificate-credentials . Trace ID: Correlation ID: Timestamp: 2024-06-21 10:41:11Z","error_codes":[700024],"timestamp":"2024-06-21 10:41:11Z","trace_id":"","correlation_id":"","error_uri":"https://login.microsoftonline.com/error?code=700024"}
Acquiring access token with expired OIDC token fails with:
As the error indicates, the OIDC token is only valid for 10 minutes. After it is passed to `az login` via `--federated-token`, Azure CLI cannot get a new OIDC token after the OIDC token expires. This is the designed v1 behavior of OIDC token support (#19853).
However, as Azure DevOps task AzureCLI@2 (https://github.com/microsoft/azure-pipelines-tasks/pull/17633) and GitHub Action azure/login@v2 (https://github.com/Azure/login/pull/147) have supported OIDC token authentication, and it is recommended to use workload identity federation, this limitation is becoming more prevalent.
Possible solutions
References