genevieve / leftovers

Go cli & library for cleaning up orphaned IAAS resources.
Apache License 2.0

GCP IAM Policy: Error 409: There were concurrent policy changes. #83

Closed markstokan closed 5 years ago

markstokan commented 5 years ago

While destroying a GCP PKS deployment with three service accounts (opsman, pks-worker-node, pks-master-node), we found we had to run leftovers v0.48.0 three times to delete all of them. Only one service account was deleted per run.

The errors we received were:

[IAM Service Account: projects/pcf-toolsmiths-dev-1/serviceAccounts/grape-staging-opsman@pcf-toolsmiths-dev-1.iam.gserviceaccount.com] Deleting...
[IAM Service Account: projects/pcf-toolsmiths-dev-1/serviceAccounts/grape-staging-pks-worker-node@pcf-toolsmiths-dev-1.iam.gserviceaccount.com] Deleting...
[IAM Service Account: projects/pcf-toolsmiths-dev-1/serviceAccounts/grape-staging-pks-master-node@pcf-toolsmiths-dev-1.iam.gserviceaccount.com] Deleting...
[IAM Service Account: projects/pcf-toolsmiths-dev-1/serviceAccounts/grape-staging-opsman@pcf-toolsmiths-dev-1.iam.gserviceaccount.com] Remove IAM Policy Bindings: Set Project IAM Policy: googleapi: Error 409: There were concurrent policy changes. Please retry the whole read-modify-write with exponential backoff., aborted
[IAM Service Account: projects/pcf-toolsmiths-dev-1/serviceAccounts/grape-staging-pks-master-node@pcf-toolsmiths-dev-1.iam.gserviceaccount.com] Remove IAM Policy Bindings: Set Project IAM Policy: googleapi: Error 409: There were concurrent policy changes. Please retry the whole read-modify-write with exponential backoff., aborted
[IAM Service Account: projects/pcf-toolsmiths-dev-1/serviceAccounts/grape-staging-pks-worker-node@pcf-toolsmiths-dev-1.iam.gserviceaccount.com] Deleted!
[DNS Managed Zone: grape-staging-zone] Deleting...
[DNS Managed Zone: grape-staging-zone] Deleted!

2 errors occurred:
    * [IAM Service Account: projects/pcf-toolsmiths-dev-1/serviceAccounts/grape-staging-opsman@pcf-toolsmiths-dev-1.iam.gserviceaccount.com] Remove IAM Policy Bindings: Set Project IAM Policy: googleapi: Error 409: There were concurrent policy changes. Please retry the whole read-modify-write with exponential backoff., aborted
    * [IAM Service Account: projects/pcf-toolsmiths-dev-1/serviceAccounts/grape-staging-pks-master-node@pcf-toolsmiths-dev-1.iam.gserviceaccount.com] Remove IAM Policy Bindings: Set Project IAM Policy: googleapi: Error 409: There were concurrent policy changes. Please retry the whole read-modify-write with exponential backoff., aborted

Would it be possible for leftovers to add retry logic in the case of a 409?

genevieve commented 5 years ago

Yeah, this must be because deleting each service account runs in its own goroutine, and each one has to grab the current version of the policy, modify it, and post it back before another goroutine's write lands, because of the way the gcp iam api is written 😢

Probably worth adding retries + running gcp iam api policy changes serially.
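
A rough sketch of what "retries + serial" could look like, not the actual leftovers change. Everything here is made up for illustration: `policyClient`, `fakeClient`, `removeMember`, `removeAll`, and `errConflict` are hypothetical names, and the in-memory fake stands in for the real project IAM policy calls. Against the real API you would check the returned error for HTTP status 409 instead of comparing to a sentinel.

```go
package main

import (
	"errors"
	"fmt"
	"sync"
	"time"
)

// errConflict stands in for "Error 409: There were concurrent policy changes".
var errConflict = errors.New("409: concurrent policy changes")

// policy is a simplified project IAM policy: a version tag plus members.
type policy struct {
	etag    int
	members []string
}

// policyClient is a hypothetical interface, not part of leftovers.
type policyClient interface {
	GetPolicy() (policy, error)
	SetPolicy(p policy) error
}

// fakeClient keeps the policy in memory and rejects writes with a stale etag.
type fakeClient struct {
	mu      sync.Mutex
	current policy
}

func (f *fakeClient) GetPolicy() (policy, error) {
	f.mu.Lock()
	defer f.mu.Unlock()
	members := append([]string(nil), f.current.members...)
	return policy{etag: f.current.etag, members: members}, nil
}

func (f *fakeClient) SetPolicy(p policy) error {
	f.mu.Lock()
	defer f.mu.Unlock()
	if p.etag != f.current.etag {
		return errConflict
	}
	f.current = policy{etag: p.etag + 1, members: p.members}
	return nil
}

// removeMember repeats the whole read-modify-write when the write conflicts,
// doubling the delay after each failed attempt.
func removeMember(c policyClient, member string, maxAttempts int, delay time.Duration) error {
	for attempt := 1; attempt <= maxAttempts; attempt++ {
		p, err := c.GetPolicy()
		if err != nil {
			return err
		}
		kept := make([]string, 0, len(p.members))
		for _, m := range p.members {
			if m != member {
				kept = append(kept, m)
			}
		}
		p.members = kept
		if err = c.SetPolicy(p); err == nil {
			return nil
		}
		if !errors.Is(err, errConflict) {
			return err
		}
		time.Sleep(delay)
		delay *= 2
	}
	return fmt.Errorf("remove %s: %w after %d attempts", member, errConflict, maxAttempts)
}

// removeAll removes each member in its own goroutine. With serialize set, a
// mutex lets only one read-modify-write run at a time.
func removeAll(c policyClient, members []string, serialize bool) []error {
	var wg sync.WaitGroup
	var policyMu, errMu sync.Mutex
	var errs []error
	for _, m := range members {
		wg.Add(1)
		go func(member string) {
			defer wg.Done()
			if serialize {
				policyMu.Lock()
				defer policyMu.Unlock()
			}
			if err := removeMember(c, member, 5, time.Millisecond); err != nil {
				errMu.Lock()
				errs = append(errs, err)
				errMu.Unlock()
			}
		}(m)
	}
	wg.Wait()
	return errs
}

func main() {
	accounts := []string{"serviceAccount:opsman", "serviceAccount:worker", "serviceAccount:master"}
	c := &fakeClient{current: policy{members: append(append([]string{}, accounts...), "user:keep")}}
	errs := removeAll(c, accounts, true)
	p, _ := c.GetPolicy()
	fmt.Println(len(errs), p.members) // prints: 0 [user:keep]
}
```

The mutex only helps within a single process, so the retry loop is still worth keeping for the case where something else (another leftovers run, terraform, the console) edits the policy at the same time. A real version would probably also want some jitter on the delay.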

genevieve commented 5 years ago

Hey @markstokan! I made some changes. Do you want to try merging them into your fork to see how that works, or do you only merge after I cut a release?

rowanjacobs commented 5 years ago

The Toolsmiths fork is automatically updated every time a commit is made to leftovers.

(It's not really a fork so much as a backup in case this repo goes down for whatever reason.)

I tried this with the new changes and it seems to have worked—at the very least, I didn't see any errors deleting it.

genevieve commented 5 years ago

Cool, thank you for the context, Rowan. The pipeline is blocked in CI right now because of unrelated OpenStack things. Hopefully we can cut a release soon.

genevieve commented 5 years ago

For future reference, here is the Terraform template used to test this:

variable "service_account_key" {
  type = "string"
}

variable "project" {
  type = "string"
}

variable "region" {
  type = "string"
}

provider "google" {
  project     = "${var.project}"
  region      = "${var.region}"
  credentials = "${var.service_account_key}"
}

resource "google_service_account" "leftovers_1" {
  account_id   = "leftovers-acceptance-1"
  display_name = "Leftovers Acceptance Service Account 1"
}

resource "google_service_account" "leftovers_2" {
  account_id   = "leftovers-acceptance-2"
  display_name = "Leftovers Acceptance Service Account 2"
}

resource "google_service_account" "leftovers_3" {
  account_id   = "leftovers-acceptance-3"
  display_name = "Leftovers Acceptance Service Account 3"
}

resource "google_project_iam_member" "leftovers_1" {
  project = "${var.project}"
  role    = "roles/iam.serviceAccountUser"
  member  = "serviceAccount:${google_service_account.leftovers_1.email}"
}

resource "google_project_iam_member" "leftovers_2" {
  project = "${var.project}"
  role    = "roles/iam.serviceAccountUser"
  member  = "serviceAccount:${google_service_account.leftovers_2.email}"
}

resource "google_project_iam_member" "leftovers_3" {
  project = "${var.project}"
  role    = "roles/iam.serviceAccountUser"
  member  = "serviceAccount:${google_service_account.leftovers_3.email}"
}