elodani opened this issue 7 years ago (status: Open)
Hi @elodani,
Sorry you're having an issue with this. The design of the state lock is meant to leave the lock in place, when possible, in the case of an abnormal exit. If you hit Ctrl+C only once, the process should have exited normally and cleaned up the lock. If you hit it twice, that forces the process to quit immediately and no cleanup can be done.
If the process didn't exit normally, there is no guarantee that the saved state is correct, and manual intervention may be required. Having the lock present gives a little more safety around accessing a corrupted state.
I'm going to keep this open as a feature request, since the implementation is possible and not completely out of line with the semantics of other backends like Consul.
In the meantime, there is a `terraform force-unlock` command to handle the situation for most backends: you pass in the lock ID from the error message to manually remove a lock.
I think a timeout might be tricky to implement, since, as mentioned above, there is no guarantee that the state is consistent.
An easy win, which I would love very much, is to lock the statefile during `plan` only for the time it takes to acquire a state snapshot. (I think Terraform already uses some cache.)
Anyway, I also agree it's just an improvement. We only had this problem once while playing around with the tool.
Not sure what the issue is with setting a lock timeout and removing the lock on the next run if the timeout has expired but the lock is still there.
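The expiry idea above could work roughly like this. A minimal sketch in Python, assuming a file-based lock with a fixed TTL — the path, TTL value, and JSON layout are hypothetical illustrations, not Terraform's actual lock format:

```python
import json
import os
import time

LOCK_PATH = "/tmp/tfstate.lock"  # hypothetical lock location
LOCK_TTL = 15 * 60               # assumed timeout: 15 minutes

def acquire_lock(who):
    """Take the lock; steal it only if the previous holder's TTL has expired."""
    if os.path.exists(LOCK_PATH):
        with open(LOCK_PATH) as f:
            info = json.load(f)
        if time.time() - info["created"] < LOCK_TTL:
            # Lock is still fresh: someone may genuinely hold it.
            raise RuntimeError(f"state locked by {info['who']}")
        # Stale lock: the previous run presumably died without cleanup.
        os.remove(LOCK_PATH)
    with open(LOCK_PATH, "w") as f:
        json.dump({"who": who, "created": time.time()}, f)

def release_lock():
    """Normal-exit cleanup, as Terraform does on a clean shutdown."""
    os.remove(LOCK_PATH)
```

As the maintainer notes, the catch is that an expired lock does not prove the state is consistent, so stealing it silently trades safety for convenience.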
Terraform Version 0.9.8
If you have more than a few resources stored in your remote state (I have 2 Elastic Beanstalk and some IAM resources, 12 in total), `terraform plan` takes long enough to be easily interrupted. With an S3 backend that uses locking, you can then lock yourself out by running a `terraform plan` and having it interrupted by anything. After that, you cannot access the remote statefile anymore, because the lock is never released. I waited 30+ minutes for some kind of timeout to kick in, but it is either longer than that, or the lock is permanent.
I could only get it to work again by issuing `terraform destroy -lock=false`, then destroying and recreating the S3 bucket and DynamoDB table that my backend uses. It originally happened because my session was interrupted by the system, but it is also reproducible with Ctrl+C.
More optimistic locking, or a reasonable timeout, would be better. (This is especially annoying if you automate Terraform calls: you cannot tell whether you are really so unlucky that someone else always holds the lock, or whether the lock is stale and it is safe to use `-lock=false`.)
Terraform S3 Backend Config
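For reference, a typical S3 backend configuration with DynamoDB locking on Terraform 0.9.x looks roughly like this; the bucket, key, region, and table names are placeholders, not the reporter's actual values:

```hcl
terraform {
  backend "s3" {
    bucket     = "my-terraform-state"        # placeholder bucket name
    key        = "project/terraform.tfstate" # placeholder state key
    region     = "eu-west-1"                 # placeholder region
    lock_table = "terraform-locks"           # DynamoDB table enabling state locking
  }
}
```

The `lock_table` attribute is what enables the DynamoDB lock described in this issue (later Terraform releases renamed it `dynamodb_table`).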
Expected Behavior
After some time, the lock is released (especially since it was only a `plan` operation).
Actual Behavior
When I tried to use the remote state again, I got an error, although nobody else was using the file:
Steps to Reproduce