jwshive opened this issue 4 years ago
I assume you are using an Azure Storage Account here. Not that this will help with what caused the lock, but you can force the existing lock to be released with the following command:
az storage blob lease break -b FILE_NAME -c CONTAINER_NAME --account-name STORAGEACCOUNT_NAME --account-key ACCESS_KEY
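If you want to confirm what state the lease is actually in before breaking it, you can inspect the blob first; a minimal sketch using the same placeholder names as above:

```shell
# Show the blob's lease properties (duration, state, status) before
# deciding whether to break the lease; placeholder names as above.
az storage blob show \
  -n FILE_NAME \
  -c CONTAINER_NAME \
  --account-name STORAGEACCOUNT_NAME \
  --account-key ACCESS_KEY \
  --query "properties.lease"
```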
Thanks for the reply, I figured this would be the easiest solution.
I knew you could do it with another command execution, but my bigger question is why the behavior is so different between 0.12 and 0.13, and whether there is somewhere in the TF files I could handle this rather than changing all my pipelines to add an additional step.
I don't have good answers on that. You shouldn't have to do this every time. I've only ever encountered this locking issue when terraform was in the middle of updating the state and somehow lost its connection, or my system crashed and left it locked. It's really rare that this happens.
You should not need to deal with locking each time. The point of the lock is to prevent two terraform runs from happening at once with the same state. Are you able to reproduce this outside of the azure pipeline, on a local workstation?
Thanks for your question. I did try this outside of Azure Pipelines and received the same error. I created a quick TF file of my own to test with, and I can reproduce the results.
This is the debug output I get from 0.13.0:
2020/08/12 17:20:24 [DEBUG] Azure Backend Response for https://storageaccountname.blob.core.windows.net/impact-terraform/terraform.tfstate:
HTTP/1.1 200 OK
Content-Length: 43976
Accept-Ranges: bytes
Content-Type: application/json
Date: Wed, 12 Aug 2020 21:20:23 GMT
Etag: "0x8D83EDF202DA166"
Last-Modified: Wed, 12 Aug 2020 16:45:31 GMT
Server: Windows-Azure-Blob/1.0 Microsoft-HTTPAPI/2.0
X-Ms-Access-Tier: Hot
X-Ms-Access-Tier-Inferred: true
X-Ms-Blob-Type: BlockBlob
X-Ms-Creation-Time: Mon, 10 Aug 2020 17:25:27 GMT
X-Ms-Lease-State: available
X-Ms-Lease-Status: locked
X-Ms-Request-Id: d918551a-a01e-00dc-6fee-70e661000000
X-Ms-Server-Encrypted: true
X-Ms-Version: 2018-11-09
Error: Error locking state: Error acquiring the state lock: 2 errors occurred:
* state blob is already locked
* blob metadata "terraformlockid" was empty
Terraform acquires a state lock to protect the state from being written
by multiple users at the same time. Please resolve the issue above and try
again. For most commands, you can disable locking with the "-lock=false"
flag, but this is not recommended.
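As the message itself notes, locking can be disabled per run; useful only as a diagnostic to confirm the state blob is otherwise readable, never as a fix:

```shell
# Diagnostic only: skip lock acquisition for a single run.
# This gives up the protection against concurrent state writes.
terraform plan -lock=false
```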
But when I run the same code with my terraform 0.12.29 binary, it blows right past all of that and starts the actual plan. I see where the output says
X-Ms-Lease-State: available
X-Ms-Lease-Status: locked
and I figure that must be what terraform is now reading, but this works every time with the previous version.
Interestingly enough, if I do not use my existing remote backend in Azure and instead create a brand new remote backend from scratch, this works without issue. I could have missed it, but I didn't see any instructions on patching remote backends for an upgrade.
When I try to use the command above to break the lease, I get an error: there is currently no lease on the blob.
What I have found now is that when I create a storage account WITHOUT hierarchical namespace, the blob's status once the write finishes is available and unlocked; when I create the storage account WITH hierarchical namespace, the default state appears to be locked and available. The first run against a new state file always works, but every job after that fails. This seems to be an issue with how hierarchical namespaces interact with storage account lease states.
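If that's right, one workaround is to create the backend storage account with hierarchical namespace explicitly disabled; a sketch with placeholder resource names:

```shell
# Create a state-backend storage account with hierarchical namespace
# (ADLS Gen2) explicitly disabled; all names here are placeholders.
az storage account create \
  --name tfstateaccount \
  --resource-group tfstate-rg \
  --location eastus \
  --sku Standard_LRS \
  --kind StorageV2 \
  --hns false
```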
Do you have a way to check whether this happens exclusively with the azure state backend, or are you also seeing this with any other state backend? The AzureRM provider team maintains the state backend, and so I'm trying to triage which team needs to troubleshoot this. If it's common to all backends, it's a core issue, and if it's specific to that backend I'll send it to the azure team.
We only use Azure here, so I don't have anything easy to test with AWS. I don't know whether AWS has a hierarchical namespace; that's just my unfamiliarity with their service.
Looking at some of my other storage accounts, I see blobs in there that are unlocked and available. It seems to be something specific to my terraform state file, where it's available but remains locked.
I'm really glad I found this page; I've been having the same issue all week. I had thought it was caused by the unique way this particular environment was set up, so I copied the code onto my own machine, which was running 0.12.24 at the time, and it worked fine. I then upgraded to 0.13.0, ran a terraform init, which was fine, and then ran a plan and hit this exact issue. As with other people in this thread, I viewed the lease state in Azure and it was 'Available'. So I manually leased the blob, then released it; it was then in a 'Broken' state. If I then run plan/apply/destroy, it works without issue. There is definitely some issue between the new TF 0.13.0 binary and the Azure storage account. I'm really hopeful this gets fixed quickly.
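That manual lease-then-release workaround can be scripted with the az CLI; a rough sketch, with placeholder container and account names:

```shell
# Take a short lease on the state blob, then break it, leaving the blob
# in the 'Broken' lease state described above.
az storage blob lease acquire \
  -b terraform.tfstate -c CONTAINER_NAME \
  --account-name STORAGEACCOUNT_NAME --account-key ACCESS_KEY \
  --lease-duration 15

az storage blob lease break \
  -b terraform.tfstate -c CONTAINER_NAME \
  --account-name STORAGEACCOUNT_NAME --account-key ACCESS_KEY
```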
I was just poking around the commits for 0.13.0 and ran across this one. I'm not 100% sure what it's trying to do, but it involves local and remote state unlocking.
https://github.com/hashicorp/terraform/commit/86e9ba3d659176cd7ea969434e37cb064f23bb43
Also getting this with the azurerm backend. It's currently a blocker for us to upgrade. Breaking the lease manually seems to help briefly, but the issue recurs.
Same problem here after importing local state into an Azure storage account.
> What I have found now is that when I create a storage account WITHOUT hierarchical namespace, the blob's status once the write finishes is available and unlocked; when I create the storage account WITH hierarchical namespace, the default state appears to be locked and available. The first run against a new state file always works, but every job after that fails. This seems to be an issue with how hierarchical namespaces interact with storage account lease states.
Thank you for this tip. I was able to move past this by creating a new storage account with hierarchical namespace disabled and migrating my state files to it before upgrading to 0.13.
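For anyone doing the same migration, Terraform can copy the state itself once the azurerm backend block points at the new account; a minimal sketch:

```shell
# After repointing the azurerm backend block at the new (non-HNS)
# storage account, re-init and approve the state copy when prompted.
# -force-copy answers "yes" automatically, for use in automation.
terraform init -force-copy
```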
Also seeing this issue on Terraform v1.9.5 with azurerm v3.0.2. I manually created a new storage account/container via the Azure CLI, configured the backend locally, and it never releases the lock after an operation completes.
For example:
Error: Error acquiring the state lock

Error message: state blob is already locked
Lock Info:
  ID:        e9eb9ed7-a3bc-ae67-640d-ad4cb06fae7a
  Path:      tfstate/terraform-dev.tfstate
  Operation: OperationTypePlan
  Who:       MYCOMPUTER\MYNAME@MYCOMPUTER
  Version:   1.9.5
  Created:   2024-08-26 22:42:07.991415 +0000 UTC
  Info:

Terraform acquires a state lock to protect the state from being written
by multiple users at the same time. Please resolve the issue above and try
again. For most commands, you can disable locking with the "-lock=false"
flag, but this is not recommended.
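When the error includes a concrete lock ID like this, Terraform's built-in unlock is worth trying before breaking the lease at the storage layer; a sketch using the ID from the message above:

```shell
# Release the stale lock by its ID; -force skips the confirmation prompt.
terraform force-unlock -force e9eb9ed7-a3bc-ae67-640d-ad4cb06fae7a
```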
Terraform Version
0.13.0
Terraform Configuration Files
I am running everything through Azure DevOps pipelines and doing token replacement with replacetokens.
Debug Output
Crash Output
Expected Behavior
terraform plan should have run without issue.
Actual Behavior
terraform plan fails to acquire the state lock, reporting that the state blob is already locked.
Steps to Reproduce
1. terraform init
2. terraform validate
3. terraform plan
Additional Context
This runs via an Azure DevOps pipeline. I see many links talking about state locking if your backend supports it, but I don't see any document telling me how to implement some sort of fix for this in my Terraform code or pipeline. Am I to manually break the lease every time I run code? That seems like more work than it should be. This same code ran yesterday on 0.12.28, and it runs again when I change the version back to 0.12.28.
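For context, the extra step I'm trying to avoid would look roughly like this in every pipeline; a sketch in which all names are placeholders:

```shell
# Pre-plan pipeline step: break the state blob's lease only if the blob
# actually reports one. All names below are placeholders.
STATUS=$(az storage blob show \
  -n terraform.tfstate -c CONTAINER_NAME \
  --account-name STORAGEACCOUNT_NAME --account-key ACCESS_KEY \
  --query "properties.lease.status" -o tsv)

if [ "$STATUS" = "locked" ]; then
  az storage blob lease break \
    -b terraform.tfstate -c CONTAINER_NAME \
    --account-name STORAGEACCOUNT_NAME --account-key ACCESS_KEY
fi
```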
References