microsoft / azure-pipelines-tasks

Tasks for Azure Pipelines
https://aka.ms/tfbuild
MIT License

HelmDeploy@0 doesn't auto handle new ARM throttling #19947

Open AnthonyDewhirst opened 3 months ago

AnthonyDewhirst commented 3 months ago

New issue checklist

Task name

HelmDeploy

Breaking task version

No response

Last working task version

No response

Regression Description

Now that Microsoft have started rolling out throttling on ARM endpoints (since May 2024), we are seeing our pipelines fail each day. One of the affected areas is this task, which fails with: Error: Get "https://***.azurecr.io/v2/": unknown: The total number of requests per subscription/tenant has exceeded the allowed limits and hence the request has been throttled. Please try after the time period indicated by 'Retry-After' header.

Handling for this should now be built into all tasks by default, IMHO.
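To illustrate the kind of handling being requested, here is a minimal sketch of a retry wrapper. It is not how HelmDeploy works today, and every name in it (`run_with_throttle_retry`, `run`, `THROTTLE_MARKER`) is hypothetical. It assumes the caller can supply a function that runs the command and returns its exit code and output. Because the log in this issue does not show the actual Retry-After value, the sketch falls back to exponential backoff with jitter.

```python
import random
import time
from typing import Callable, Tuple

# Substring taken from the error message quoted in this issue.
THROTTLE_MARKER = "has been throttled"


def run_with_throttle_retry(
    run: Callable[[], Tuple[int, str]],
    max_attempts: int = 5,
    base_delay: float = 2.0,
    sleep: Callable[[float], None] = time.sleep,
) -> Tuple[int, str]:
    """Call run() again only when its failure looks like throttling.

    run() is a hypothetical callable returning (exit_code, combined_output).
    Returns the result of the last attempt that was made.
    """
    if max_attempts < 1:
        raise ValueError("max_attempts must be at least 1")

    attempt = 1
    while True:
        code, output = run()
        throttled = code != 0 and THROTTLE_MARKER in output
        # Stop on success, on an unrelated failure, or when out of attempts.
        if not throttled or attempt >= max_attempts:
            return code, output
        # Assumption: the Retry-After value is not available to the caller,
        # so wait base_delay * 2^(attempt-1) plus up to 50% random jitter.
        delay = base_delay * 2 ** (attempt - 1)
        sleep(delay + random.uniform(0, delay / 2))
        attempt += 1
```

The `sleep` parameter is injectable so the wrapper can be exercised without real waiting. Failures that do not contain the throttling message are returned immediately rather than retried.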

Environment type (Please select at least one environment where you face this issue)

Azure DevOps Server type

dev.azure.com (formerly visualstudio.com)

Azure DevOps Server Version (if applicable)

Current agent version: '3.239.1'

Operating system

Ubuntu

Relevant log output

Jobs in run #20240604.1 (***)
***
Helm login

Starting: Helm login
==============================================================================
Task         : Package and deploy Helm charts
Description  : Deploy, configure, update a Kubernetes cluster in Azure Container Service by running helm commands
Version      : 0.238.1
Author       : Microsoft Corporation
Help         : https://aka.ms/azpipes-helm-tsg
==============================================================================
/azp/agent/_work/_tool/helm/3.6.0/x64/linux-amd64/helm registry login ***.azurecr.io --username *** --password ***
WARNING: Using --password via the CLI is insecure. Use --password-stdin.
Error: Get "https://***.azurecr.io/v2/": unknown: The total number of requests per subscription/tenant has exceeded the allowed limits and hence the request has been throttled. Please try after the time period indicated by 'Retry-After' header.
##[error]WARNING: Using --password via the CLI is insecure. Use --password-stdin.
Error: Get "https://***.azurecr.io/v2/": unknown: The total number of requests per subscription/tenant has exceeded the allowed limits and hence the request has been throttled. Please try after the time period indicated by 'Retry-After' header.

Finishing: Helm login
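
The warning in the log above recommends `--password-stdin` over `--password`. As a side note to the throttling problem, the following sketch shows one way a script could call `helm registry login` that way. It assumes `helm` is on the PATH; the function names and the registry and user values are placeholders, not anything taken from the task's source.

```python
import subprocess
from typing import List


def helm_login_args(registry: str, username: str) -> List[str]:
    """Build the argument list; the password is deliberately not part of it."""
    return [
        "helm", "registry", "login", registry,
        "--username", username,
        "--password-stdin",
    ]


def helm_registry_login(
    registry: str, username: str, password: str
) -> subprocess.CompletedProcess:
    """Run the login, feeding the password on stdin instead of the command line."""
    return subprocess.run(
        helm_login_args(registry, username),
        input=password,       # written to the child's stdin
        text=True,            # treat stdin/stdout/stderr as str
        capture_output=True,  # keep output so the caller can inspect it
        check=False,          # let the caller decide how to handle failure
    )
```

With `check=False`, the caller can inspect `returncode` and `stderr` itself, for example to look for the throttling message before deciding whether to retry.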

Full task logs with system.debug enabled

UNSUCCESSFUL RUN
 [REPLACE THIS WITH YOUR INFORMATION] 
SUCCESSFUL RUN
 [REPLACE THIS WITH YOUR INFORMATION] 

Repro steps

This is intermittent; it only reproduces when throttling occurs.

v-schhabra commented 3 months ago

"Now that Microsoft have started rolling out throttling on ARM endpoints (since May 2024) we are seeing our pipelines fail each day."

Hi @AnthonyDewhirst, could you please share the source of this information? Please also share the complete debug logs by setting the variable system.debug to "true".

AnthonyDewhirst commented 3 months ago

Hi @v-schhabra , link here: https://learn.microsoft.com/en-us/azure/azure-resource-manager/management/request-limits-and-throttling#migrating-to-regional-throttling-and-token-bucket-algorithm

The log output above shows that we have hit a throttle point.
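
The error message points at the `Retry-After` header. For code that does have access to the HTTP response (the Helm CLI output shown here does not expose it), a minimal sketch of turning that header into a wait time could look like the following. The HTTP specification allows the value to be either a number of seconds or an HTTP date, so both are handled. The function name `retry_after_seconds` is hypothetical.

```python
from datetime import datetime, timezone
from email.utils import parsedate_to_datetime
from typing import Optional


def retry_after_seconds(value: str, now: Optional[datetime] = None) -> Optional[float]:
    """Convert a Retry-After header value into seconds to wait.

    Returns None when the value cannot be interpreted.
    """
    value = value.strip()
    # Form 1: a non-negative integer number of seconds, e.g. "30".
    if value.isdigit():
        return float(value)
    # Form 2: an HTTP date, e.g. "Wed, 05 Jun 2024 10:00:30 GMT".
    try:
        when = parsedate_to_datetime(value)
    except (TypeError, ValueError):
        return None
    if when.tzinfo is None:
        # Treat a date without zone information as UTC.
        when = when.replace(tzinfo=timezone.utc)
    now = now or datetime.now(timezone.utc)
    # A date in the past means no further waiting is needed.
    return max(0.0, (when - now).total_seconds())
```

Returning `None` for an unparseable value leaves the fallback policy, such as a fixed or exponential delay, to the caller.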

As for more output: unfortunately, we have decided for the moment not to re-create our test environments each day, as we are approaching crucial deadlines. Also, because whether any given call is throttled is unpredictable, reproducing this may take a few attempts, which means several hours.

I hope we will start again in the next few weeks, but other areas have been affected by recent Microsoft-side changes and issues, so going back is a risk at the moment.