hashicorp / terraform

Terraform enables you to safely and predictably create, change, and improve infrastructure. It is a source-available tool that codifies APIs into declarative configuration files that can be shared amongst team members, treated as code, edited, reviewed, and versioned.
https://www.terraform.io/

pg backend state lock always shows `user@localhost` holding the lock #32075

Open landorg opened 1 year ago

landorg commented 1 year ago

Terraform Version

Terraform v1.2.9

Terraform Configuration Files

terraform {
  backend "pg" {}
}

Debug Output

╷
│ Error: Error acquiring the state lock
│
│ Error message: Workspace is already locked: playground2
│ Lock Info:
│   ID:        2468f79c-e26d-054c-fa90-f74911480e92
│   Path:
│   Operation: OperationTypePlan
│   Who:       user@localhost
│   Version:   1.2.9
│   Created:   2022-10-25 06:20:29.367257207 +0000 UTC
│   Info:
│
│
│ Terraform acquires a state lock to protect the state from being written
│ by multiple users at the same time. Please resolve the issue above and try
│ again. For most commands, you can disable locking with the "-lock=false"
│ flag, but this is not recommended.
╵

Expected Behavior

Show me who is really holding the lock.

Actual Behavior

It always shows me as the one holding the lock, even when the lock is actually held by one of my colleagues or by a running pipeline.

Steps to Reproduce

terraform plan (on two machines)

Additional Context

No response

References

No response

apparentlymart commented 1 year ago

Hi @landorg! Thanks for reporting this.

It's been a while since I looked at this part of Terraform, so I had to look at the code to refresh my memory, but it appears that the information returned here describes your attempt to create the lock, not the existing lock: the backend just echoes back the lock request as part of its error.

https://github.com/hashicorp/terraform/blob/f5de1099ff0e6a2cfd99716334c89558a481792c/internal/backend/remote-state/pg/client.go#L108-L111

(info in the above is an object passed in to the Lock method by its caller, so it doesn't include any information about the active lock.)
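The effect described above can be reproduced in a minimal sketch. Everything here is hypothetical and simplified: `LockInfo`, `LockError`, and `advisoryLock` are local stand-ins written for illustration, not the types or code from the pg backend. The point is only that when the lock itself stores no metadata, the sole lock information in scope at failure time is the caller's own request.

```go
package main

import (
	"errors"
	"fmt"
)

// LockInfo is a simplified, hypothetical stand-in for lock metadata.
type LockInfo struct {
	ID  string
	Who string
}

// LockError is a hypothetical stand-in for an error that carries lock metadata.
type LockError struct {
	Info *LockInfo
	Err  error
}

func (e *LockError) Error() string {
	return fmt.Sprintf("%v (Who: %s)", e.Err, e.Info.Who)
}

// advisoryLock models a lock that records only "held or not", with no metadata
// about the holder.
type advisoryLock struct {
	held bool
}

// Lock tries to take the lock. On failure, the only LockInfo available is the
// caller's own request, so that is what ends up attached to the error.
func (l *advisoryLock) Lock(request *LockInfo) error {
	if l.held {
		return &LockError{Info: request, Err: errors.New("workspace is already locked")}
	}
	l.held = true
	return nil
}

func main() {
	lock := &advisoryLock{}

	// A pipeline takes the lock first.
	if err := lock.Lock(&LockInfo{ID: "1", Who: "ci@runner"}); err != nil {
		panic(err)
	}

	// A second caller fails, and the error names the second caller rather
	// than the pipeline that actually holds the lock.
	err := lock.Lock(&LockInfo{ID: "2", Who: "user@localhost"})
	fmt.Println(err)
}
```

Running this prints `workspace is already locked (Who: user@localhost)`, even though `ci@runner` is the holder.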

Although some state storage backends do store all or part of the lock information in the remote system when acquiring a lock, it doesn't seem like the pg backend in particular is capable of doing that, because it's essentially just wrapping PostgreSQL's advisory lock functions.

This backend is maintained by @remilapeyre, so I will have to defer on whether it's actually possible to serialize and store the lock information in the database and then re-inflate it for inclusion in the error message.
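As a rough illustration of what "serialize, store, and re-inflate" could look like, here is a sketch that keeps the holder's lock information as JSON next to the lock and decodes it again when a later request fails. It makes several assumptions: `lockStore` is an in-memory map standing in for a database table, `LockInfo` is a trimmed-down hypothetical struct, and the check-then-write in `Lock` is not atomic, which a real database-backed version would have to be.

```go
package main

import (
	"encoding/json"
	"errors"
	"fmt"
)

// LockInfo is a trimmed-down, hypothetical version of lock metadata.
type LockInfo struct {
	ID        string
	Operation string
	Who       string
}

var errAlreadyLocked = errors.New("workspace is already locked")

// lockStore is a hypothetical stand-in for a table keyed by workspace name.
// It is not safe for concurrent use; a real backend would need an atomic
// "insert if absent" in the database.
type lockStore struct {
	rows map[string][]byte
}

// Lock stores the serialized request if the workspace is free. If the
// workspace is already locked, it decodes the stored row and returns the
// existing holder's information together with errAlreadyLocked.
func (s *lockStore) Lock(workspace string, request LockInfo) (*LockInfo, error) {
	if raw, held := s.rows[workspace]; held {
		var existing LockInfo
		if err := json.Unmarshal(raw, &existing); err != nil {
			return nil, fmt.Errorf("%w (holder unknown: %v)", errAlreadyLocked, err)
		}
		return &existing, errAlreadyLocked
	}
	raw, err := json.Marshal(request)
	if err != nil {
		return nil, err
	}
	s.rows[workspace] = raw
	return nil, nil
}

func main() {
	store := &lockStore{rows: map[string][]byte{}}

	// A pipeline takes the lock first; its info is persisted.
	first := LockInfo{ID: "1", Operation: "OperationTypePlan", Who: "ci@runner"}
	if _, err := store.Lock("playground2", first); err != nil {
		panic(err)
	}

	// The second request fails and gets the stored holder back.
	second := LockInfo{ID: "2", Operation: "OperationTypePlan", Who: "user@localhost"}
	holder, err := store.Lock("playground2", second)
	if errors.Is(err, errAlreadyLocked) && holder != nil {
		fmt.Printf("%v, held by %s\n", err, holder.Who)
	}
}
```

Here the failed request reports `ci@runner`, because the error is built from the stored row and not from the incoming request.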

If that isn't possible then I would suggest that we change the backend to return a generic error type rather than statemgr.LockError specifically. The special lock error type is intended to describe the information about the lock that's already held so that the UI can give feedback about who is holding the lock, and so the UI gets confused (as we can see here) if the state storage backend returns information about the current locking request instead.
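To make the distinction concrete, the sketch below shows how a caller might treat the two kinds of error differently. It is an assumption-laden illustration, not Terraform's UI code: `LockError` and `LockInfo` are local hypothetical types that only mirror the idea of `statemgr.LockError`, and `describe` is an invented helper.

```go
package main

import (
	"errors"
	"fmt"
)

// LockInfo is a hypothetical, minimal description of the current lock holder.
type LockInfo struct {
	Who string
}

// LockError is a hypothetical error type meant to carry information about
// the lock that is already held.
type LockError struct {
	Info *LockInfo
	Err  error
}

func (e *LockError) Error() string { return e.Err.Error() }
func (e *LockError) Unwrap() error { return e.Err }

var errLocked = errors.New("workspace is already locked")

// describe is an invented stand-in for a UI layer. It reports a holder only
// when the error actually carries holder information.
func describe(err error) string {
	var lockErr *LockError
	if errors.As(err, &lockErr) && lockErr.Info != nil {
		return fmt.Sprintf("%v; held by %s", lockErr.Err, lockErr.Info.Who)
	}
	return fmt.Sprintf("%v; holder unknown", err)
}

func main() {
	// A backend that cannot look up the holder returns a generic error, so
	// the caller makes no claim about who holds the lock.
	fmt.Println(describe(errLocked))

	// A backend that can look up the holder returns the richer error type.
	fmt.Println(describe(&LockError{Info: &LockInfo{Who: "ci@runner"}, Err: errLocked}))
}
```

With a generic error the caller reports the holder as unknown instead of misattributing the lock to whoever made the failed request.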


In case it's helpful context, here's the codepath in the S3 backend where it tries to retrieve the existing lock info out of the DynamoDB table whenever it's reporting a locking error:

https://github.com/hashicorp/terraform/blob/f5de1099ff0e6a2cfd99716334c89558a481792c/internal/backend/remote-state/s3/client.go#L240-L251