cyrilgdn / terraform-provider-postgresql

Terraform PostgreSQL provider
https://www.terraform.io/docs/providers/postgresql/
Mozilla Public License 2.0
356 stars 182 forks source link

1.19.0 replaces postgresql_grant all the time #321

Open alhroub opened 12 months ago

alhroub commented 12 months ago

Terraform Version

Terraform v1.5.2.

Affected Resource(s)

Terraform Configuration Files

terraform {
  required_providers {
    postgresql = {
      source  = "cyrilgdn/postgresql"
      version = "1.19.0"
    }
  }
}

provider "postgresql" {
  host            = "localhost"
  port            = 5432
  database        = "postgres"
  username        = "postgres"
  sslmode         = "disable"
  connect_timeout = 15
}

resource "postgresql_grant" "grant_public" {
  database    = "db1"
  role        = "public"
  schema      = "public"
  object_type = "schema"
  privileges  = ["USAGE"]
}

Expected Behavior

The resource should be updated in-place.

Actual Behavior

The resource is being replaced (destroyed then created).

Steps to Reproduce

Tested on PostgreSQL version 12.

  1. Create a brand new database called db1.
  2. Create a new main.tf file containing the snippet above. Modify the provider block to point to your local/test PostgreSQL instance.
  3. Run init/plan/apply
  4. connect to the database with psql and change the privileges on the public schema. For example, run
    GRANT ALL ON SCHEMA public TO public;
  5. Run plan/apply again. The apply will look like this (notice that the resource is being re-created):
    
    postgresql_grant.grant_public: Refreshing state... [id=public_db1_public_schema]

Terraform used the selected providers to generate the following execution plan. Resource actions are indicated with the following symbols: -/+ destroy and then create replacement

Terraform will perform the following actions:

postgresql_grant.grant_public must be replaced

-/+ resource "postgresql_grant" "grant_public" { ~ id = "public_db1_public_schema" -> (known after apply) ~ privileges = [ # forces replacement

Plan: 1 to add, 0 to change, 1 to destroy. postgresql_grant.grant_public: Destroying... [id=public_db1_public_schema] postgresql_grant.grant_public: Destruction complete after 0s postgresql_grant.grant_public: Creating... postgresql_grant.grant_public: Creation complete after 0s [id=public_db1_public_schema]

Apply complete! Resources: 1 added, 0 changed, 1 destroyed.


### Description of the issue
It looks like with version 1.19.0 (and more specifically this PR https://github.com/cyrilgdn/terraform-provider-postgresql/pull/135), the `postgresql_grant` resource gets re-created when there is a change.
Replacing the resource is not a good idea because the "destroy/create" operations are completely separate. i.e. they are not atomic which means (given the example in the "Steps to Reproduce" section above) for a short moment between the 2 operations the public role loses access to the public schema. If for any reason Terraform fails midway or it gets interrupted, users will end up not being able to access the objects in the public schema. This is what happens in the PostgreSQL log:

2023-07-11 14:50:05.989 UTC [1673] LOG: statement: BEGIN READ WRITE 2023-07-11 14:50:06.000 UTC [1673] LOG: statement: REVOKE ALL PRIVILEGES ON SCHEMA "public" FROM "public" 2023-07-11 14:50:06.001 UTC [1673] LOG: statement: COMMIT 2023-07-11 14:50:06.033 UTC [1675] LOG: statement: BEGIN READ WRITE 2023-07-11 14:50:06.043 UTC [1675] LOG: statement: REVOKE ALL PRIVILEGES ON SCHEMA "public" FROM "public" 2023-07-11 14:50:06.044 UTC [1675] LOG: statement: GRANT USAGE ON SCHEMA "public" TO "public" 2023-07-11 14:50:06.045 UTC [1675] LOG: statement: COMMIT

As you can see they are done in 2 different transactions.

This gets even **worse** if there is a `create_before_destroy` lifecycle defined, as the resource will be replaced in reverse order! It gets created (privileges are granted) and then destroyed (revoking everything on the public schema from the public role). The apply looks like this:

postgresql_grant.grant_public: Refreshing state... [id=public_db1_public_schema]

Terraform used the selected providers to generate the following execution plan. Resource actions are indicated with the following symbols: +/- create replacement and then destroy

Terraform will perform the following actions:

postgresql_grant.grant_public must be replaced

+/- resource "postgresql_grant" "grant_public" { ~ id = "public_db1_public_schema" -> (known after apply) ~ privileges = [ # forces replacement

Plan: 1 to add, 0 to change, 1 to destroy. postgresql_grant.grant_public: Creating... postgresql_grant.grant_public: Creation complete after 0s [id=public_db1_public_schema] postgresql_grant.grant_public (deposed object 0a58f931): Destroying... [id=public_db1_public_schema] postgresql_grant.grant_public: Destruction complete after 0s

Apply complete! Resources: 1 added, 0 changed, 1 destroyed.

And here is what you see in the PostgreSQL log (notice REVOKE ALL... is being executed last):

2023-07-11 14:52:14.044 UTC [1700] LOG: statement: REVOKE ALL PRIVILEGES ON SCHEMA "public" FROM "public" 2023-07-11 14:52:14.045 UTC [1700] LOG: statement: GRANT USAGE ON SCHEMA "public" TO "public" 2023-07-11 14:52:14.046 UTC [1700] LOG: statement: COMMIT 2023-07-11 14:52:14.081 UTC [1702] LOG: statement: BEGIN READ WRITE 2023-07-11 14:52:14.091 UTC [1702] LOG: statement: REVOKE ALL PRIVILEGES ON SCHEMA "public" FROM "public" 2023-07-11 14:52:14.091 UTC [1702] LOG: statement: COMMIT

This caused an outage for us recently!
The behaviour was different with version 1.18.0. The resource was updated in-place. If we pin the 1.18.0 version we get this apply output:

postgresql_grant.grant_public: Refreshing state... [id=public_db1_public_schema]

Terraform used the selected providers to generate the following execution plan. Resource actions are indicated with the following symbols: ~ update in-place

Terraform will perform the following actions:

postgresql_grant.grant_public will be updated in-place

~ resource "postgresql_grant" "grant_public" { id = "public_db1_public_schema" ~ privileges = [

Plan: 0 to add, 1 to change, 0 to destroy. postgresql_grant.grant_public: Modifying... [id=public_db1_public_schema] postgresql_grant.grant_public: Modifications complete after 0s [id=public_db1_public_schema]

Apply complete! Resources: 0 added, 1 changed, 0 destroyed.

and you see this in the PostgreSQL log:

2023-07-11 14:53:52.635 UTC [1734] LOG: statement: BEGIN READ WRITE 2023-07-11 14:53:52.644 UTC [1734] LOG: statement: REVOKE ALL PRIVILEGES ON SCHEMA "public" FROM "public" 2023-07-11 14:53:52.645 UTC [1734] LOG: statement: GRANT USAGE ON SCHEMA "public" TO "public" 2023-07-11 14:53:52.646 UTC [1734] LOG: statement: COMMIT



The old behaviour should be restored where the resource gets updated in-place as the revoke and grant are done in the same transaction as shown in the log above and as noted here:
https://github.com/cyrilgdn/terraform-provider-postgresql/blob/v1.19.0/postgresql/resource_postgresql_grant.go#L188-L189