pulumi / pulumi-gcp

A Google Cloud Platform (GCP) Pulumi resource package, providing multi-language access to GCP
Apache License 2.0

Error 412 while updating gcp.sql.DatabaseInstance #794

Closed bmyers427 closed 1 year ago

bmyers427 commented 2 years ago

What happened?

I'm attempting to enable the High Availability setting for an existing GCP PostgreSQL instance. Adding the availabilityType parameter returns the following error:

  pulumi:pulumi:Stack (app):
    error: update failed

  gcp:sql:DatabaseInstance (db-server):
    error: 1 error occurred:
        * updating urn:pulumi:test::app::gcp:sql/databaseInstance:DatabaseInstance::db-server: 1 error occurred:
        * Error, failed to update instance settings for : googleapi: Error 412: Condition does not match., staleData

Steps to reproduce

Create a DB instance

export const dbInstance = new gcp.sql.DatabaseInstance("db-server", {
  databaseVersion: "POSTGRES_11",
  region: "us-central1",
  settings: {
    tier: "db-g1-small",
    diskAutoresize: false,
    backupConfiguration: {
      enabled: false,
      location: "us-central1",
      pointInTimeRecoveryEnabled: false,
      startTime: "01:00",
      transactionLogRetentionDays: 7,
      backupRetentionSettings: {
        retainedBackups: 14,
      },
    },
    userLabels: {
      "env": "test",
    }
  },
  deletionProtection: false,
});

Add availabilityType to the instance settings

export const dbInstance = new gcp.sql.DatabaseInstance("db-server", {
  databaseVersion: "POSTGRES_11",
  region: "us-central1",
  settings: {
    tier: "db-g1-small",
    diskAutoresize: false,
    availabilityType: "REGIONAL",
    backupConfiguration: {
      enabled: false,
      location: "us-central1",
      pointInTimeRecoveryEnabled: false,
      startTime: "01:00",
      transactionLogRetentionDays: 7,
      backupRetentionSettings: {
        retainedBackups: 14,
      },
    },
    userLabels: {
      "env": "test",
    }
  },
  deletionProtection: false,
});

Run pulumi up

Expected Behavior

The existing GCP SQL database instance availability is changed from ZONAL to REGIONAL.

Actual Behavior

The update fails and returns the error above. The following data was also in the logs:

I0416 18:13:19.856905     770 eventsink.go:59] {
I0416 18:13:19.857036     770 eventsink.go:59]   "error": {
I0416 18:13:19.857204     770 eventsink.go:59]     "code": 412,
I0416 18:13:19.857382     770 eventsink.go:59]     "message": "Condition does not match.",
I0416 18:13:19.857524     770 eventsink.go:59]     "errors": [
I0416 18:13:19.857688     770 eventsink.go:59]       {
I0416 18:13:19.857862     770 eventsink.go:59]         "message": "Condition does not match.",
I0416 18:13:19.858078     770 eventsink.go:59]         "domain": "global",
I0416 18:13:19.858259     770 eventsink.go:59]         "reason": "staleData",
I0416 18:13:19.858431     770 eventsink.go:59]         "location": "If-Match",
I0416 18:13:19.858601     770 eventsink.go:59]         "locationType": "header"
I0416 18:13:19.858765     770 eventsink.go:59]       }
I0416 18:13:19.858935     770 eventsink.go:59]     ]
I0416 18:13:19.859112     770 eventsink.go:59]   }
I0416 18:13:19.859292     770 eventsink.go:59] }

Versions used

Pulumi version: 3.29.1

Additional context

Adding anything to the settings appears to trigger the error, including explicitly setting the default value availabilityType: "ZONAL".

Contributing

Vote on this issue by adding a 👍 reaction. To contribute a fix for this issue, leave a comment (and link to your pull request, if you've opened one already).

guineveresaenger commented 2 years ago

Could this be a duplicate of https://github.com/hashicorp/terraform-provider-google/issues/7200? This looks like underlying GCP behavior to me. Specifically, you might try the workaround described there.

yehudamakarov commented 2 years ago

How does one do this:

resource "google_sql_database_instance" "failover_replica" {
  lifecycle {
    ignore_changes = [settings[0].tier]
  }
}

in Pulumi?
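For what it's worth, Pulumi's counterpart to Terraform's ignore_changes is the ignoreChanges resource option, which takes property paths. A sketch (the instance arguments below are illustrative, not from this issue):

```typescript
import * as gcp from "@pulumi/gcp";

const failoverReplica = new gcp.sql.DatabaseInstance("failover-replica", {
    databaseVersion: "POSTGRES_11",
    region: "us-central1",
    settings: { tier: "db-g1-small" },
}, {
    // Counterpart of Terraform's `ignore_changes = [settings[0].tier]`:
    // diffs on this property path are ignored during updates.
    ignoreChanges: ["settings.tier"],
});
```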

saberistic commented 2 years ago

I am also experiencing this trying to add a database flag

@@ -7,6 +7,10 @@ export const createInstance = (projectId: Output<string>, name: string, id: stri
         region: "us-west1",
         databaseVersion: "MYSQL_8_0",
         settings: {
+            databaseFlags: [{
+                name: "cloudsql_iam_authentication",
+                value: "on"
+            }],
             tier,
         },
         deletionProtection: true,

    error: 1 error occurred:
        * updating urn:pulumi:sandbox::baxus-devops::gcp:sql/databaseInstance:DatabaseInstance::services: 1 error occurred:
        * Error, failed to update instance settings for : googleapi: Error 412: Condition does not match., stale

h4ckroot commented 2 years ago

Same here! Any clue, please?

coofercat commented 2 years ago

FWIW, I just got the same by changing enable_default_user from true to false.

nordringrayhide commented 2 years ago

Same, tried to replace privateNetwork

ipConfiguration: { ipv4Enabled: true, privateNetwork: _default.id, },

got: Error, failed to update instance settings for : googleapi: Error 412: Condition does not match., staleData

coldwell commented 2 years ago

Happening for me also. Trying to update the tier attribute.

Pulumi v3.38.0, python pulumi-gcp v6.34.0

gcp:sql:DatabaseInstance (mydb):
  error: 1 error occurred:
    * updating urn:pulumi:staging::pulumi_iac::gcp:sql/databaseInstance:DatabaseInstance::mydb: 1 error occurred:
    * Error, failed to update instance settings for : googleapi: Error 412: Condition does not match., staleData

moravron commented 2 years ago

Same for me when trying to update DB settings.

aetheryx commented 1 year ago

Running a pulumi refresh fixed this for us.
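For anyone landing here, the workaround amounts to re-syncing the stack's stored settingsVersion/etag with GCP before updating. With the Pulumi CLI (assuming a stack is already selected):

```shell
# Re-read the instance state (including settingsVersion and etag) from GCP
# into the stack state.
pulumi refresh --yes

# The update now sends the current version instead of the stale one.
pulumi up
```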

kly630 commented 1 year ago

I'm currently having this happen on a project as well. We've created an instance with no replicas, and for some reason Pulumi (or the underlying Terraform provider) is trying to set the settingsVersion field all the way back to 5, which appears to be the value it has in the stack state: Unable to update metadata, version mismatch (old: 23, new: 5)

Scanning GitHub, I see a few people have run into this in the Terraform provider project as well, so I'm willing to chalk it up to Terraform issues and try a pulumi refresh.

Edit: I think I found the relevant lines in the Terraform provider:

// Instance.Patch operation on completion updates the settings proto version by +8. As terraform does not know this it tries
// to make an update call with the proto version before patch and fails. To resolve this issue we update the setting version
// before making the update call.
instance.Settings.SettingsVersion = int64(_settings["version"].(int))

https://github.com/modular-magician/terraform-provider-google/blob/f130e63953395e83b5173295f444eaed57370293/google/resource_sql_database_instance.go

This was updated in Terraform on August 18th of last year, so it's probably time for an upgrade to the latest GCP provider. I'll add an update on whether it fixes it for me or not.

Edit 2: That did the trick for me; I upgraded to pulumi-gcp 6.49.0. Anything released after August 2022 should work, given when this Terraform provider change landed.
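The failure mode described above can be sketched as an optimistic-concurrency check. In this sketch (all names are illustrative, not the real GCP API), the server rejects a patch whose settingsVersion is stale and bumps the version by more than one on success, so a client that replays the version from stack state fails with a 412 unless it re-reads first:

```typescript
interface Settings { settingsVersion: number; tier: string }

// Illustrative stand-in for the Cloud SQL Admin API's version check.
class FakeCloudSql {
  private settings: Settings = { settingsVersion: 5, tier: "db-g1-small" };

  get(): Settings {
    return { ...this.settings };
  }

  // Reject updates carrying a stale settingsVersion; on success, bump the
  // version by more than one (the TF comment above mentions +8).
  patch(update: Settings): void {
    if (update.settingsVersion !== this.settings.settingsVersion) {
      throw new Error("googleapi: Error 412: Condition does not match., staleData");
    }
    this.settings = { ...update, settingsVersion: this.settings.settingsVersion + 8 };
  }
}

// Buggy client: replays the version stored in stack state.
function updateWithStaleState(server: FakeCloudSql, staleVersion: number, tier: string): string {
  try {
    server.patch({ settingsVersion: staleVersion, tier });
    return "ok";
  } catch (e) {
    return (e as Error).message;
  }
}

// Fixed client (what the provider patch does): read the current version
// immediately before patching.
function updateWithFreshRead(server: FakeCloudSql, tier: string): string {
  const current = server.get();
  server.patch({ settingsVersion: current.settingsVersion, tier });
  return "ok";
}
```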

AaronFriel commented 1 year ago

@kly630 thanks for identifying this was fixed, sorry for not updating the issue accordingly.

same-id commented 1 year ago

Still having this problem with the latest GCP provider.

I'm not sure if someone updated the settings outside of Pulumi and the version (SettingsVersion) got bumped, which now fails the update call (similar to how etag changes behave).

AaronFriel commented 1 year ago

@same-id if this is the same issue, is it resolved by doing a pulumi refresh before running the update?

Could you also tell us what the diff is on your resource or provide a minimal repro?

same-id commented 1 year ago

pulumi refresh solves this, of course, since the SettingsVersion gets bumped to the latest value.

If I understand correctly, changing the settings in the UI (and then changing them back) outside of Pulumi will reproduce this 100% of the time, since GCP wants to receive the exact SettingsVersion. So a refresh is required even if nothing has actually changed.

However, I think this has also happened on our prod DB, which is only ever touched through Pulumi. I suspect there are some weird conditions there.

Updating the GCP instance does not happen in one API call. For example, if we both add a replica for the instance and update its settings, those changes happen in two distinct API calls to GCP, since changing both at once is not supported; you can see this in the Terraform provider code.

However, after creating the replica, the settingsVersion jumps by more than +1 for some reason, so the Terraform provider refreshes the state after each of those internal operations (see all the calls to resourceSqlDatabaseInstanceRead within the Update function: https://github.com/hashicorp/terraform-provider-google/blob/0720a385025d7928bbba43050b13947474e5a877/google/resource_sql_database_instance.go#L1518).

However, I wonder what happens if anything fails between those updates, and whether the intermediate state is saved or not.

adriangb commented 1 year ago

I'm having this issue with the latest provider version. I'm not even trying to change the settings: running pulumi refresh on my stack decides it needs to update the Cloud SQL instance, but the subsequent update fails even if I bump the settingsVersion.

lukehoban commented 1 year ago

Reopening given continued reports of issues here.

@adriangb Could you share more details of the steps and outputs you see on each step? What do you mean by "fails to do so", and what is the diff that "refresh" shows?

rquitales commented 1 year ago

I've tried to reproduce this over the last few days without success. pulumi refresh was able to help resolve the settings version mismatch in my testing. @adriangb if you could provide additional details about your setup, that would be great! Thanks.

adriangb commented 1 year ago

So I'm a bit hesitant to experiment, because it's going to make me re-create all of my infra.

Here's the output from a pulumi refresh:

~ google-native:sqladmin/v1:Instance: (update)
        [id=v1/projects/pydantic-platform/instances/web-db-0596b8f]
        [urn=urn:pulumi:dev::logfire::google-native:sqladmin/v1:Instance::web-db]
        [provider=urn:pulumi:dev::logfire::pulumi:providers:google-native::default_0_14_0::988302c2-20d1-482b-a918-468198c3fae0]
        --outputs:--
      ~ etag                      : "430197ad54706acf20dd147542b5fdd3609b5bfcde38bdb74feca13659f138f8" => "f1186620bc73c19307280952ce196d123dd64b1577075d5c3f1758431f863da8"
      ~ settings                  : {
            activationPolicy         : "ALWAYS"
            availabilityType         : "REGIONAL"
            backupConfiguration      : {
                backupRetentionSettings    : {
                    retainedBackups: 7
                    retentionUnit  : "COUNT"
                }
                enabled                    : true
                kind                       : "sql#backupConfiguration"
                startTime                  : "15:00"
                transactionLogRetentionDays: 7
            }
            connectorEnforcement     : "NOT_REQUIRED"
            dataDiskSizeGb           : "10"
            dataDiskType             : "PD_SSD"
            databaseFlags            : [
                [0]: {
                    name : "cloudsql.iam_authentication"
                    value: "on"
                }
            ]
            deletionProtectionEnabled: false
            ipConfiguration          : {
                authorizedNetworks: []
                ipv4Enabled       : true
            }
            kind                     : "sql#settings"
            locationPreference       : {
                kind: "sql#locationPreference"
                zone: "us-central1-c"
            }
            pricingPlan              : "PER_USE"
            replicationType          : "SYNCHRONOUS"
          ~ settingsVersion          : "4" => "5"
            storageAutoResize        : true
            storageAutoResizeLimit   : "0"
            tier                     : "db-f1-micro"
        }

The settingsVersion in my code is 3, so I don't even know where 4 came from.

My experience has been that if I accept this update, then the next time I try to run an update I'll get an error.

I'm happy to do that process once, but I'd like to know exactly what diagnostics you'll want before I destroy all of our infra.

SGudbrandsson commented 1 year ago

Just had this issue while updating an IP list. Refreshing did the trick, but should not be required.

AaronFriel commented 1 year ago

I've root-caused this issue; thanks to @kly630 for linking to the original fix in the TF provider. In the fixed version, the provider re-reads the settings version and sets it whenever the database version field is set, which it almost always is:

https://github.com/modular-magician/terraform-provider-google/blob/f130e63953395e83b5173295f444eaed57370293/google/resource_sql_database_instance.go#L1377-L1391

The current version of the provider changes the condition for performing a read to if promoteReadReplicaRequired, which appears to be true only when promoting a read replica:

https://github.com/hashicorp/terraform-provider-google-beta/blob/main/google-beta/services/sql/resource_sql_database_instance.go#L1853-L1867

I think that lines 1853 to 1856 should be moved outside of the conditional block, as I don't see a circumstance in which we would not want to make the settings version match the remote.

adriangb commented 1 year ago

@AaronFriel is there a timeline for fixing this? I just hit this again today when enabling point-in-time recovery and I'll probably drop my database and rebuild it (again). This is going to be a blocker for us to move into production with Pulumi.

AaronFriel commented 1 year ago

@adriangb I've root-caused the issue and we're deciding how to proceed: whether to submit the fix upstream or apply a patch to our fork. I expect we'll have an answer for you soon.