microsoft / CromwellOnAzure

Microsoft Genomics implementation of the Broad Institute's Cromwell workflow engine on Azure
MIT License

Increase retries in deployer #819

Open BMurri opened 1 month ago

BMurri commented 1 month ago

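Describe the bug The deployer fails with an AggregateException ("Retry failed after 6 tries") while creating a blob container. Every attempt to reach the storage account's blob endpoint (tesinttest10d63002221.blob.core.windows.net:443) ends in "Connection refused" or "An error occurred while sending the request." The call originates in Deployer.UploadTextToStorageAccountAsync, reached from KubernetesManager.UpdateHelmValuesAsync during Deployer.DeployAsync. The full trace is under Logs.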

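Expected behavior The deployer retries the storage call enough times, or for long enough, to ride out transient connection failures instead of aborting the deployment after 6 tries.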

Logs

2024-09-30T09:54:43.7928589Z AggregateException: Retry failed after 6 tries. Retry settings can be adjusted in ClientOptions.Retry or by configuring a custom retry policy in ClientOptions.RetryPolicy. (Connection refused (tesinttest10d63002221.blob.core.windows.net:443)) (Connection refused (tesinttest10d63002221.blob.core.windows.net:443)) (Connection refused (tesinttest10d63002221.blob.core.windows.net:443)) (Connection refused (tesinttest10d63002221.blob.core.windows.net:443)) (An error occurred while sending the request.) (An error occurred while sending the request.)
2024-09-30T09:54:43.7970039Z    at Azure.Core.Pipeline.RetryPolicy.ProcessAsync(HttpMessage message, ReadOnlyMemory`1 pipeline, Boolean async)
2024-09-30T09:54:43.7971184Z    at Azure.Core.Pipeline.HttpPipelineSynchronousPolicy.InnerProcessAsync(HttpMessage message, ReadOnlyMemory`1 pipeline)
2024-09-30T09:54:43.7972139Z    at Azure.Storage.Blobs.ContainerRestClient.CreateAsync(Nullable`1 timeout, IDictionary`2 metadata, Nullable`1 access, String defaultEncryptionScope, Nullable`1 preventEncryptionScopeOverride, CancellationToken cancellationToken)
2024-09-30T09:54:43.7973734Z    at Azure.Storage.Blobs.BlobContainerClient.CreateInternal(PublicAccessType publicAccessType, IDictionary`2 metadata, BlobContainerEncryptionScopeOptions encryptionScopeOptions, Boolean async, CancellationToken cancellationToken, String operationName)
2024-09-30T09:54:43.7976000Z    at Azure.Storage.Blobs.BlobContainerClient.CreateIfNotExistsInternal(PublicAccessType publicAccessType, IDictionary`2 metadata, BlobContainerEncryptionScopeOptions encryptionScopeOptions, Boolean async, CancellationToken cancellationToken)
2024-09-30T09:54:43.7976691Z    at Azure.Storage.Blobs.BlobContainerClient.CreateIfNotExistsAsync(PublicAccessType publicAccessType, IDictionary`2 metadata, BlobContainerEncryptionScopeOptions encryptionScopeOptions, CancellationToken cancellationToken)
2024-09-30T09:54:43.7978085Z    at CromwellOnAzureDeployer.Deployer.UploadTextToStorageAccountAsync(BlobClient blobClient, String content, CancellationToken cancellationToken) in /mnt/vss/_work/r1/a/CromwellOnAzure/src/deploy-cromwell-on-azure/Deployer.cs:line 2475
2024-09-30T09:54:43.7979326Z    at CromwellOnAzureDeployer.KubernetesManager.UpdateHelmValuesAsync(StorageAccountData storageAccount, Uri keyVaultUrl, String resourceGroupName, Dictionary`2 settings, UserAssignedIdentityData managedId, List`1 containersToMount) in /mnt/vss/_work/r1/a/CromwellOnAzure/src/deploy-cromwell-on-azure/KubernetesManager.cs:line 188
2024-09-30T09:54:43.7980276Z    at CromwellOnAzureDeployer.Deployer.DeployAsync() in /mnt/vss/_work/r1/a/CromwellOnAzure/src/deploy-cromwell-on-azure/Deployer.cs:line 615
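
The exception message above points at `ClientOptions.Retry` as the place to adjust retry behavior. A minimal sketch of what that could look like for a blob container client follows. It is not the deployer's actual code: the class name `RetrySketch`, the helper names, the use of `DefaultAzureCredential`, and the specific retry numbers are all illustrative assumptions.

```csharp
using System;
using System.Threading;
using System.Threading.Tasks;
using Azure.Core;
using Azure.Identity;
using Azure.Storage.Blobs;

// Hypothetical helper, not part of CromwellOnAzure.
public static class RetrySketch
{
    // Builds client options with a more patient retry policy.
    // The numbers below are assumptions for illustration, not tuned values.
    public static BlobClientOptions CreatePatientOptions()
    {
        var options = new BlobClientOptions();
        options.Retry.Mode = RetryMode.Exponential;        // back off between attempts
        options.Retry.MaxRetries = 10;                     // retries after the first attempt
        options.Retry.Delay = TimeSpan.FromSeconds(2);     // initial backoff delay
        options.Retry.MaxDelay = TimeSpan.FromMinutes(1);  // cap on any single delay
        return options;
    }

    // Creates the container if it does not exist, using the options above.
    public static async Task EnsureContainerAsync(Uri containerUri, CancellationToken cancellationToken)
    {
        var container = new BlobContainerClient(
            containerUri,
            new DefaultAzureCredential(),
            CreatePatientOptions());

        await container.CreateIfNotExistsAsync(cancellationToken: cancellationToken);
    }
}
```

Because the options are fixed when a client is constructed, a change like this would have to be made wherever the deployer creates its storage clients, not at the `CreateIfNotExistsAsync` call site shown in the trace.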
