Open andrey-dubnik opened 2 years ago
@andrey-moor I've applied the configuration as you provided above (there is a typo in the account_replication_type, which should be RAGZRS rather than RAZGRS). Then I started a loop to run terraform plan until it returns an error. After a number of iterations, the issue still doesn't appear.
So would you please follow the terraform debug guide and provide the debug log here? Mostly this is due to some API issue, and we will need the "request id" and other context info to follow up with Azure support.
Forgot to mention the issue was reproducible only via GH actions and was intermittent. It may be down to the Azure portal replication etc., as GH hosts runners in multiple locations and the first available runner is selected for the pipeline.
East US (eastus) East US 2 (eastus2) West US 2 (westus2) Central US (centralus) South Central US (southcentralus)
Similar behaviour is affecting the KeyVault but with much less frequency
Let me see if I can reproduce the issue and capture the debug data via the standalone pipeline, we have switched to ZGRS accounts to unblock the delivery so original config is not available.
I'm seeing this on creation:
Failure responding to request: StatusCode=404 -- Original Error: autorest/azure: Service returned an error. Status=404 Code="StorageAccountNotFound" Message="The storage account testsa was not found."
with 2.93.1 with ZGRS type.
Related: https://github.com/hashicorp/terraform-provider-azurerm/issues/5299#issuecomment-980079517
@magodo sorry took me a while, I have captured the trace debug of the problem.
suspect it is related to the call which comes back empty - https://docs.microsoft.com/en-us/rest/api/storagerp/storage-accounts/list
2022-02-03T16:14:30.610Z [DEBUG] provider.terraform-provider-azurerm_v2.94.0_x5: AzureRM Response for https://management.azure.com/subscriptions/264f194d-71cc-41d9-9bf4-c4f5456285e8/providers/Microsoft.Storage/storageAccounts?api-version=2021-04-01:
HTTP/2.0 200 OK
Cache-Control: no-cache
Content-Type: application/json; charset=utf-8
Date: Thu, 03 Feb 2022 16:14:30 GMT
Expires: -1
Pragma: no-cache
Strict-Transport-Security: max-age=31536000; includeSubDomains
Vary: Accept-Encoding
X-Content-Type-Options: nosniff
X-Ms-Correlation-Request-Id: fc3a2733-bb7a-9fdc-09e6-d9510d90bf10
X-Ms-Request-Id: 612cc49f-6921-4522-9a25-c7eaa4e7b3b2
X-Ms-Routing-Request-Id: EASTUS2:20220203T161430Z:612cc49f-6921-4522-9a25-c7eaa4e7b3b2
{"value":[]}: timestamp=2022-02-03T16:14:30.609Z
It seems the workaround is to create a TAG on the storage account (via the portal); after that the issue goes away and the above API returns the value. What's more, it started to return values for all the storage accounts in that sub which were having the issue...
The workaround came from the #11059 as we've seen similar but not that permanent issue with the keyvault. The root cause is highly likely to be the same for both of those issues.
@andrey-dubnik Just to be sure, does the LIST call for the SA on your test sub always return an empty list, even after waiting for a long while? Is there any other SA in that sub?
This is correct, terraform was able to obtain the keys and all the data for the account, but the list api returned blank, hence the "account can't be found" error.
There was another account in the sub which was having an issue originally so in total there are 2 accounts in there. After tagging at least one the second account also appeared in the api call.
I can add that tagging a storage account worked on my side as well. My storage account was created like this:
resource "azurerm_storage_account" "sa" {
name = var.storage_account_name
location = var.location
account_tier = "Standard"
account_replication_type = "GRS"
resource_group_name = azurerm_resource_group.rg.name
}
///It worked 1 out of 8 times, or something, before adding a manual tag. Now it seems stable. ///
Edit: Nope, it is not stable at all with the tag thing either.
This issue is also causing sleepless nights on our side here. What I can say so far: most likely this issue is caused by some race condition hitting some sort of ARM API limit. We're experiencing this on a Terraform project with around 100 resources (1 storage account, 1 key vault, a lot of role assignments to them) and I'm not able to reproduce the issue in a smaller project. However, calling terraform apply -parallelism=1 seems to help.
Furthermore: We're experiencing the very same behaviour on Key Vaults as well (it switches randomly between Storage Account and Key Vault, whichever Terraform tries to access first). Here is a log example from that:
2022-03-03T06:57:41.0852887Z 2022-03-03T06:57:40.565Z [INFO] Starting apply for azurerm_key_vault_secret.githubtoken
2022-03-03T06:57:41.0853464Z 2022-03-03T06:57:40.565Z [DEBUG] azurerm_key_vault_secret.githubtoken: applying the planned Update change
2022-03-03T06:57:41.0854255Z 2022-03-03T06:57:40.567Z [INFO] provider.terraform-provider-azurerm_v2.98.0_x5: preparing arguments for AzureRM KeyVault Secret update.: timestamp=2022-03-03T06:57:40.566Z
2022-03-03T06:57:41.0854946Z 2022-03-03T06:57:40.567Z [DEBUG] provider.terraform-provider-azurerm_v2.98.0_x5: AzureRM Request:
2022-03-03T06:57:41.0855810Z GET /subscriptions/***/resources?%24filter=resourceType+eq+%27Microsoft.KeyVault%2Fvaults%27+and+name+eq+%27net-prod-839977%27&%24top=5&api-version=2020-06-01 HTTP/1.1
2022-03-03T06:57:41.0856269Z Host: management.azure.com
2022-03-03T06:57:41.0857164Z User-Agent: Go/go1.17.5 (amd64-linux) go-autorest/v14.2.1 Azure-SDK-For-Go/v61.4.0 resources/2020-06-01 HashiCorp Terraform/1.1.6 (+https://www.terraform.io) Terraform Plugin SDK/2.10.1 terraform-provider-azurerm/2.98.0 pid-222c6c49-1b0a-5959-a213-6608f9eb8820
2022-03-03T06:57:41.0857908Z X-Ms-Correlation-Request-Id: b01b3074-24a4-9613-dd3f-c0a93338159e
2022-03-03T06:57:41.0858375Z Accept-Encoding: gzip: timestamp=2022-03-03T06:57:40.566Z
2022-03-03T06:57:41.0859118Z 2022-03-03T06:57:40.635Z [ERROR] vertex "azurerm_key_vault_secret.githubtoken" error: Unable to determine the Resource ID for the Key Vault at URL "https://net-prod-839977.vault.azure.net/"
2022-03-03T06:57:41.0860401Z 2022-03-03T06:57:40.635Z [DEBUG] provider.terraform-provider-azurerm_v2.98.0_x5: AzureRM Response for https://management.azure.com/subscriptions/***/resources?%24filter=resourceType+eq+%27Microsoft.KeyVault%2Fvaults%27+and+name+eq+%27net-prod-839977%27&%24top=5&api-version=2020-06-01:
2022-03-03T06:57:41.0861012Z HTTP/2.0 200 OK
2022-03-03T06:57:41.0861306Z Cache-Control: no-cache
2022-03-03T06:57:41.0861671Z Content-Type: application/json; charset=utf-8
2022-03-03T06:57:41.0861971Z Date: Thu, 03 Mar 2022 06:57:39 GMT
2022-03-03T06:57:41.0862330Z Expires: -1
2022-03-03T06:57:41.0862605Z Pragma: no-cache
2022-03-03T06:57:41.0863019Z Strict-Transport-Security: max-age=31536000; includeSubDomains
2022-03-03T06:57:41.0863386Z Vary: Accept-Encoding
2022-03-03T06:57:41.0863725Z X-Content-Type-Options: nosniff
2022-03-03T06:57:41.0864181Z X-Ms-Correlation-Request-Id: b01b3074-24a4-9613-dd3f-c0a93338159e
2022-03-03T06:57:41.0864686Z X-Ms-Ratelimit-Remaining-Subscription-Reads: 11997
2022-03-03T06:57:41.0865152Z X-Ms-Request-Id: f6f7065a-d272-43ec-9bf3-526a820189ac
2022-03-03T06:57:41.0865654Z X-Ms-Routing-Request-Id: WESTUS3:20220303T065740Z:f6f7065a-d272-43ec-9bf3-526a820189ac
2022-03-03T06:57:41.0865936Z
2022-03-03T06:57:41.0866144Z {"value":[]}: timestamp=2022-03-03T06:57:40.634Z
@roehrijn , I have tried with parallelism 1 now, and it does not resolve any issue when it comes to storage accounts at least. Maybe that workaround only works for key vaults?
Hi @mariussm, it also works for storage accounts in my environment. However, as I wrote, this is unfortunately likely to be some sort of race condition in rate limiting. That's why I think parallelism=1 is not a 100% fix/workaround. Hope MS is going to address this soon.
@roehrijn / @andrey-dubnik - we've experienced this issue over the past month or so: We found that a call to the /resources?%24filter=resourceType+eq+%27Microsoft.KeyVault% endpoint provided different results depending on the region that serviced that ARM API call. Essentially if the X-Ms-Routing-Request-Id value was the same as the location of the resource, the call returned the key vault - but if not it sporadically would return an empty array. This had a downstream effect on the rest of the TF deployment as it treated the resource as missing. It would be interesting for you to check the X-Ms-Routing-Request-Id of your calls to see if there's a correlation between successful and failed ones.
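As a rough aid for that check, here is a minimal sketch that pulls the region prefix out of an X-Ms-Routing-Request-Id value (the logs in this thread show the form REGION:TIMESTAMP:REQUEST-ID) and compares it with the resource location. The function names are made up for illustration, and the comparison assumes the region prefix matches the location name once spaces and case are stripped.

```python
def routing_region(routing_request_id: str) -> str:
    """Return the lower-cased region prefix of an X-Ms-Routing-Request-Id value."""
    # Values in the logs above look like "EASTUS2:20220203T161430Z:<request id>".
    region, _, _ = routing_request_id.partition(":")
    return region.strip().lower()


def served_by_home_region(routing_request_id: str, resource_location: str) -> bool:
    """True if the ARM call was routed through the resource's own region.

    Assumption: the location may be given as "eastus2" or "East US 2".
    """
    normalized_location = resource_location.replace(" ", "").lower()
    return routing_region(routing_request_id) == normalized_location
```

Running this over the request/response pairs in a TF_LOG=DEBUG capture and grouping empty versus non-empty responses by the result would show whether the failures line up with cross-region routing.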
We have an open case with the ARM API team, and so far they've confirmed that it's an issue with the ARM API cross-region cache not being updated quickly enough.
In terms of fixes / workarounds - they're currently a bit limited:
cc: @stuartleeks
Using tags as a workaround worked so far and the portal cache was replicated. There were no re-occurrences of the issue since tagging.
Since this is in the provider-api scope there is no way to influence it externally. If this is a replication lag and not a permanent issue, then adding retry logic would probably help mitigate the issue: worst case it would be a 1 min SLO as opposed to an error, which is already good enough.
If tagging permanently fixes the issue maybe the api team can use this in the replication fix...
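The retry idea can be sketched as follows. This is only an illustration of the pattern, not the provider's code: find_with_retry and its parameters are hypothetical, and lookup stands for whatever call returns the (possibly empty) list result.

```python
import time
from typing import Callable, Optional, TypeVar

T = TypeVar("T")


def find_with_retry(
    lookup: Callable[[], Optional[T]],
    attempts: int = 5,
    initial_delay: float = 1.0,
    max_delay: float = 60.0,
    sleep: Callable[[float], None] = time.sleep,
) -> Optional[T]:
    """Call lookup until it returns a non-empty result or attempts run out.

    An empty list (or None) is treated as "not replicated yet" rather than
    "does not exist"; the wait doubles after each miss, capped at max_delay.
    """
    delay = initial_delay
    for attempt in range(attempts):
        result = lookup()
        if result:
            return result
        if attempt < attempts - 1:
            sleep(min(delay, max_delay))
            delay *= 2
    return None
```

The trade-off is that a resource which really is missing takes longer to be reported as missing, so the attempt count and delay cap should stay small.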
Hello.
I have also experienced the same intermittent issue, i.e. Unable to locate Storage Account, for a GRS storage account. The root cause of this issue is an ARM cache sync issue. Contact Microsoft to resync the storage account.
It is working for me after the resync of the ARM cache.
Thanks, Tushar P
I got this issue also.
az storage account list was listing []
Adding a tag did fix the problem, so indeed this looks like a stale cache problem.
We're experiencing this problem in GitHub actions too.
Run hashicorp/setup-terraform@v1.3.2
provider registry.terraform.io/hashicorp/azurerm v3.10.0
I am experiencing the same problem on my local machine. Here is the repo.
I run these commands and I encounter this specifically when I run the destroy commands.
terraform plan -destroy -out main.destroy.tfplan
terraform apply main.destroy.tfplan
│ Error: Unable to locate Storage Account "staticwebsiteprfpim"! │
The following shows plan command.
I get that error when I run the command for the first time. When I re-run the same command the second time, things run fine.
Same is the case with apply command as well.
When using GH Actions in combination with AzureRM and account_replication_type LRS you have the same problem. When I run the same terraform apply locally I don't have any issues. Seems related to GH actions.
azurerm version 3.12.0
resource "azurerm_storage_account" "redacted" {
name = local.storage_account_name
resource_group_name = azurerm_resource_group.redacted.name
location = azurerm_resource_group.redacted.location
account_tier = "Standard"
account_replication_type = "LRS"
}
Using azurerm 3.16.0, storage account type LRS, this happens fairly frequently.
@ekristen and @paalders could you please help confirm that the storage account list API (i.e. /subscriptions/{subscriptionId}/providers/Microsoft.Storage/storageAccounts) doesn't return the expected storage account, while the ARG API does? If that is the case, we might consider migrating to use ARG instead.
Same problem in my case:
Terraform v.1.2.6 azurerm: v.3.17.0
It does not matter whether the SA is LRS or ZRS; 7 times out of 10 I get the error: "Unable to locate Storage Account".
Also same issue for me on github actions + terraform cloud. Also LRS type for replication. The storage account cannot be found.
I can see it on the azure dashboard and on my local machine the command line tool return the correct list.
Terraform v.1.2.6 azurerm: v.3.17.0
@MLKiiwy Just want to be sure, which command you were running that returns you the correct list? Is it the "resource list" or Azure Resource Graph query?
I am facing the same issue. with azurerm 3.21.0. Any workarounds ?
Randomly impacting LRS Storage accounts azurerm 3.5.0. / TFC Quite annoying...
Got the same issue, just created a storage account through TF and 5 minutes later it can't find it.
Still seeing this for LRS storage accounts as of Terraform CLI 1.3.7 + azurerm 3.37.0.
Seems like it's been an issue for years at this point: #5299
This is still a problem. Right now I have a terraform that simply won't find a storage account on refresh even though it's clearly right there.
I opened a case with MS about this issue in Oct 2022 and was told that ARM v W39+ was supposed to solve this issue on the backend. I mention this only to suggest others open a case via Azure portal to push the backend team while we wait for the provider authors do whatever they can. Case to reference: 2210070040004240
This problem can be a show-stopper for days when it pops up....
Seem the workaround is to create a TAG on the storage account (via portal), after that the issue goes away and above API returns the value. More to it it started to return value for all the storage account in that sub which were having the issue...
I had the same issue today for account_replication_type = "ZRS", StorageV2, and this assumption helped, not sure how...
I just added the tags manually in the portal for all storage accounts which were listed in the TF error log with "Unable to locate Storage Account"
There's definitely a bug that needs to be fixed
Following up here, the information previously provided by @andrey-dubnik and @damoodamoo was spot on.
Though I won't repeat what they already alluded to above, there is apparently an additional alternative that was suggested to me by Azure Support when working through my ticket (2304260010001874, in case anyone else wants a pointer to this reference).
The Azure Resource Manager team has recently made significant improvements to the replication flows and mechanisms, including large-scale enhancements of Azure's replication architecture, which should entirely address this issue for list calls at the resource group level. The improvements for subscription-level list calls will be rolling out incrementally to tenants over the next few months. However, we still recommend that you use Azure Resource Graph whenever possible for resource list calls. Azure Resource Graph offers a one-minute SLO and is designed to performantly handle list calls at scale.
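For anyone wanting to try the Azure Resource Graph route by hand, here is a small sketch that builds a Resource Graph query for a single storage account. The helper name is hypothetical; the query uses the standard Resources table, and the name check relies on storage account names being 3 to 24 lowercase letters and digits.

```python
import re

# Storage account names are 3-24 characters, lowercase letters and digits only.
_ACCOUNT_NAME = re.compile(r"[a-z0-9]{3,24}")


def storage_account_query(name: str) -> str:
    """Build a Resource Graph (KQL) query that looks up one storage account by name."""
    if not _ACCOUNT_NAME.fullmatch(name):
        # Rejecting bad names also keeps quotes out of the query string.
        raise ValueError(f"not a valid storage account name: {name!r}")
    return (
        "Resources"
        " | where type =~ 'microsoft.storage/storageaccounts'"
        f" and name =~ '{name}'"
        " | project id, name, resourceGroup, location"
    )
```

Assuming the resource-graph extension for the Azure CLI is installed, the resulting string can be passed to az graph query -q to compare its output with what the subscription-level list call returns.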
When creating the storage account, it looks like the behavior within resourceStorageAccountRead checks out:
A ListKeys call is made to determine that the account has been created, using the provided resource group:
https://management.azure.com/subscriptions/<subscription ID>/resourceGroups/<resource group name>/providers/Microsoft.Storage/storageAccounts/<storage account name>/listKeys
FindAccount is then called, which makes a subscription-level call to the ARM API:
https://management.azure.com/subscriptions/<subscription ID>/providers/Microsoft.Storage/storageAccounts
Instead, this code path could use the ListByResourceGroup method on the AccountsClient (imported from github.com/Azure/azure-sdk-for-go/services/storage/mgmt/2021-09-01/storage).
Unfortunately, I don't have the cycles right now to attempt and test any changes against what we currently have running (our usage of the Terraform CLI makes it tricky to do so), but I'd be happy to come back to do so in a few weeks if someone hasn't been able to add more evidence.
cc @magodo
@brad-lucas Thank you for the news! We should probably wait for the subscription-level list improvements to mitigate the caching issue. Unfortunately, FindAccount can't use ListByResourceGroup, as this method is used in multiple places in the provider, many of which only have access to the storage account's name.
@magodo, can I convince you that the resource group name could/should be used when it's available, creating a new fork in the code paths? List queries are always going to be eventually consistent, but consistency guarantees are much more important when creating a storage account.
This code path has intermittently, but consistently, broken our automated deployments for almost a year at this point, doesn't have any workaround from a pure automation standpoint, and would be the same for anyone else using similar automation.
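The proposed fork can be sketched roughly like this. It is not the provider's implementation (the provider is written in Go); locate_account and both list callables are hypothetical stand-ins for the resource-group-level and subscription-level list calls.

```python
from typing import Callable, Iterable, Optional


def locate_account(
    name: str,
    resource_group: Optional[str],
    list_by_resource_group: Callable[[str], Iterable[dict]],
    list_by_subscription: Callable[[], Iterable[dict]],
) -> Optional[dict]:
    """Find a storage account by name, preferring the resource-group scope.

    When the caller knows the resource group, only the resource-group-level
    list is consulted; otherwise the subscription-level list is used.
    """
    if resource_group:
        accounts = list_by_resource_group(resource_group)
    else:
        accounts = list_by_subscription()
    for account in accounts:
        # Compare case-insensitively, since ARM can return names with different casing.
        if account.get("name", "").lower() == name.lower():
            return account
    return None
```

Callers that only have the account name would pass None for resource_group and keep today's behaviour, while the create path could pass the group it already knows.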
For the create case:
@brad-lucas, we've made a number of attempts to work around the caching issue, see #11059. Note that the resource group name is not always available, e.g. during a read for an azurerm_storage_container.
For your specific ask:
I am also facing this issue intermittently but I am using storage account tier as LRS.
Azure provider - 3.69, tried with 3.78 also same error Terraform version - 1.5.5 and tried with 1.5.7
Error
│ Error: unable to locate Storage Account "iseappstoragefhq2" │ │ with module.loadbalancer_dns.azurerm_storage_account.ise-app-storage, │ on ../../module/loadbalancer_dns/functionapp.tf line 8, in resource "azurerm_storage_account" "ise-app-storage": │ 8: resource "azurerm_storage_account" "ise-app-storage" {
Debug log
ount=1 diagnostic_warning_count=0 tf_rpc=ReadResource timestamp=2023-10-27T13:39:56.639+0530 2023-10-27T13:39:56.646+0530 [ERROR] provider.terraform-provider-azurerm_v3.69.0_x5: Response contains error diagnostic: diagnostic_summary="unable to locate Storage Account "iseappstoragefhq2"" tf_resource_type=azurerm_storage_account @caller=github.com/hashicorp/terraform-plugin-go@v0.14.3/tfprotov5/internal/diag/diagnostics.go:55 diagnostic_severity=ERROR diagnostic_detail= tf_proto_version=5.3 tf_provider_addr=provider tf_req_id=0933de58-93db-0874-0dd6-a134016be01d tf_rpc=ReadResource @module=sdk.proto timestamp=2023-10-27T13:39:56.639+0530 2023-10-27T13:39:56.644+0530 [ERROR] vertex "module.loadbalancer_dns.azurerm_storage_account.ise-app-storage" error: unable to locate Storage Account "iseappstoragefhq2"
For me this kept happening even when I deployed a completely new set of resources to a new rg. After deploying the storage accounts, the plan stage failed because it did not find the storage account that was just deployed. Also, az storage account list showed empty results. When I ran az login and az account set --subscription... again, it found the storage account and plan worked.
Still facing this issue intermittently on Storage Account Standard_LRS StorageV2.
Terraform version: 1.7.4 Azure provider: 3.91.0
Error on terraform apply
Error: retrieving queue properties for Storage Account (Subscription: "xxx" Resource Group Name: "xxx" Storage Account Name: "xxx"): queues.Client#GetServiceProperties: Failure responding to request: StatusCode=403 -- Original Error: autorest/azure: Service returned an error. Status=403 Code="AuthenticationFailed" Message="Server failed to authenticate the request. Make sure the value of Authorization header is formed correctly including the signature.\ with azurerm_storage_account.st, on xxx line 17, in resource "azurerm_storage_account" "st": 17: resource "azurerm_storage_account" "st" {
Error on terraform destroy
╷ │ Error: unable to locate "Storage Account (Subscription: \"xxxx\"\nResource Group Name: \"xxx\"\nStorage Account Name: | \"xxx\")" │ │ with azurerm_storage_account.st, │ on xxx line 17, in resource "azurerm_storage_account" "st": │ 17: resource "azurerm_storage_account" "st" { │
I got this issue today, while deploying new resources into a clean subscription. Not sure what this is about, i was creating 158 new resources, out of that 3 were storage accounts.
We are also running into this.
Same issue for me on two storage accounts with an account replication type of "GRS"
Terraform version: 1.7.5 Azure provider: 3.92.0
same issue I just encountered.
Same issue with type of "LRS"
Terraform version: 1.7.5 Azure provider: 3.95.0
Same !
Still facing this with terraform 1.7.5 and azurerm 3.97.1
got this issue as well:
TF version: 1.7.4 Azure provider: 3.97.1
Same here :(
Terraform v1.7.5
hashicorp/azurerm v3.98.0
Same here.
Terraform v1.8.0
.\.terraform\providers\registry.terraform.io\hashicorp\azurerm\3.98.0\windows_386\terraform-provider-azurerm_v3.98.0_x5.exe
It works in one run, but stopped working majority of the time.
resource "azurerm_storage_account" "storage-account" {
resource_group_name = "storage-rg"
location = "eastus"
name = "stgacestus3839"
account_tier = "Standard"
account_replication_type = "LRS"
min_tls_version = "TLS1_2"
}
Powershell equivalent is working fine though
Get-AzStorageAccount -ResourceGroupName "storage-rg"
# this returns the storage account properly
Seems the unreleased version 3.99.0 has some fixes around storage account? https://github.com/hashicorp/terraform-provider-azurerm/blob/main/CHANGELOG.md#3990-unreleased
Anyone know when this could be released?
Womp womp. Looks like 3.99.0 was released last night, but it doesn't fix the issue. Still seeing
Planning failed. Terraform encountered an error while generating this plan.
╷
│ Error: unable to locate "Storage Account (Subscription: \"*****\"\nResource Group Name: \"*****\"\nStorage Account Name: \"****\")"
terraform init -upgrade
# Upgraded to terraform-provider-azurerm_v3.99.0_x5.exe (from 3.98.0)
Community Note
Terraform (and AzureRM Provider) Version
2.29.0
Affected Resource(s)
azurerm_storage_account intermittently produces an error on plan for the already created resource
Error: Unable to locate Storage Account
Terraform Configuration Files
Expected Behaviour
Should be no error for the RA and non RA accounts
Actual Behaviour
intermittently produces an error on plan for the already created resource
Error: Unable to locate Storage Account
Steps to Reproduce
terraform apply
terraform plan
Important Factoids
Only affecting RA storage accounts