Azure / AzOps

AzOps is a PowerShell module which deploys (Push) ARM Resource Templates & Bicep files at all Azure scope levels and exports (Pull) ARM resource hierarchy.
https://aka.ms/AzOps
MIT License
388 stars 163 forks source link

Validate step failing to fetch artefact from container registryFeature Request #894

Open pareion opened 3 months ago

pareion commented 3 months ago

Please describe the feature.

Intro Hello!

We've forked the AzOps-Accelerator and set it up across all our environments. We've then set up a nightly pipeline that pushes some configuration to our AzOps repository, creates a pull request, and waits for the pull request pipeline to complete before auto-completing the pull request. In that pull request pipeline, we're experiencing an issue.

The issue is not consistent, and a simple rerun of the pipeline will often fix the issue; however, since we don't want any manual intervention in the nightly pipeline, it becomes a little bit of an issue

Problem The problem in the pull request pipeline is that it fails in the Validate step. It fails with the following error:

 ---> Azure.Identity.AuthenticationFailedException: Azure PowerShell authentication timed out.
   at Azure.Identity.AzurePowerShellCredential.RequestAzurePowerShellAccessTokenAsync(Boolean async, TokenRequestContext context, CancellationToken cancellationToken)
   at Azure.Identity.AzurePowerShellCredential.GetTokenImplAsync(Boolean async, TokenRequestContext requestContext, CancellationToken cancellationToken)
   at Azure.Identity.CredentialDiagnosticScope.FailWrapAndThrow(Exception ex, String additionalMessage, Boolean isCredentialUnavailable)
   at Azure.Identity.AzurePowerShellCredential.GetTokenImplAsync(Boolean async, TokenRequestContext requestContext, CancellationToken cancellationToken)
   at Azure.Identity.AzurePowerShellCredential.GetTokenAsync(TokenRequestContext requestContext, CancellationToken cancellationToken)
   at Azure.Identity.ChainedTokenCredential.GetTokenImplAsync(Boolean async, TokenRequestContext requestContext, CancellationToken cancellationToken)
   --- End of inner exception stack trace ---
   at Azure.Identity.ChainedTokenCredential.GetTokenImplAsync(Boolean async, TokenRequestContext requestContext, CancellationToken cancellationToken)
   at Azure.Identity.ChainedTokenCredential.GetTokenAsync(TokenRequestContext requestContext, CancellationToken cancellationToken)
   at Azure.Containers.ContainerRegistry.ContainerRegistryRefreshTokenCache.GetRefreshTokenFromCredentialAsync(TokenRequestContext context, String service, Boolean async, CancellationToken cancellationToken)
   at Azure.Containers.ContainerRegistry.ContainerRegistryRefreshTokenCache.GetAcrRefreshTokenAsync(HttpMessage message, TokenRequestContext context, String service, Boolean async)
   at Azure.Containers.ContainerRegistry.ContainerRegistryRefreshTokenCache.GetAcrRefreshTokenAsync(HttpMessage message, TokenRequestContext context, String service, Boolean async)
   at Azure.Containers.ContainerRegistry.ContainerRegistryChallengeAuthenticationPolicy.AuthorizeRequestOnChallengeAsyncInternal(HttpMessage message, Boolean async)
   at Azure.Core.Pipeline.BearerTokenAuthenticationPolicy.ProcessAsync(HttpMessage message, ReadOnlyMemory`1 pipeline, Boolean async)
   at Azure.Core.Pipeline.RedirectPolicy.ProcessAsync(HttpMessage message, ReadOnlyMemory`1 pipeline, Boolean async)
   at Azure.Core.Pipeline.RetryPolicy.ProcessAsync(HttpMessage message, ReadOnlyMemory`1 pipeline, Boolean async)
   at Azure.Core.Pipeline.RetryPolicy.ProcessAsync(HttpMessage message, ReadOnlyMemory`1 pipeline, Boolean async)
   at Azure.Containers.ContainerRegistry.ContainerRegistryRestClient.GetManifestAsync(String name, String reference, String accept, CancellationToken cancellationToken)
   at Azure.Containers.ContainerRegistry.ContainerRegistryContentClient.GetManifestInternalAsync(String reference, Boolean async, CancellationToken cancellationToken)
   at Azure.Containers.ContainerRegistry.ContainerRegistryContentClient.GetManifestAsync(String tagOrDigest, CancellationToken cancellationToken)
   at Bicep.Core.Registry.AzureContainerRegistryManager.DownloadManifestAndLayersAsync(IOciArtifactReference artifactReference, ContainerRegistryContentClient client)
   at Bicep.Core.Registry.AzureContainerRegistryManager.<>c__DisplayClass4_0.<<PullArtifactAsync>g__DownloadManifestInternalAsync|0>d.MoveNext()
--- End of stack trace from previous location ---
   at Bicep.Core.Registry.AzureContainerRegistryManager.PullArtifactAsync(RootConfiguration configuration, IOciArtifactReference artifactReference)
   at Bicep.Core.Registry.OciArtifactRegistry.TryRestoreArtifactAsync(RootConfiguration configuration, OciArtifactReference reference)

I have no insight into what it actually does, but it seems that it fails to retrieve the artefact from the Azure Container Registry. I've verified the artefact is in the registry.

Notes/Request We've worked quite a bit with Azure Container Registry, and in other pipelines/templates, we've introduced retry logic for handling once in a blue moon errors when it fails to retrieve an artefact from the container registry.

I was wondering if it was possible to add a form of retry logic or, perhaps, if it's already in place, extend the retry logic.