hashicorp / terraform-provider-google

Terraform Provider for Google Cloud Platform
https://registry.terraform.io/providers/hashicorp/google/latest/docs
Mozilla Public License 2.0

Error while updating cloudfunction configuration #18900

Open ayoubhamaoui opened 1 month ago

ayoubhamaoui commented 1 month ago

When I first run terraform apply, it works and creates the function successfully. But when I change the Python code and try to update the function, it raises this error.

Terraform Version & Provider Version(s)

Terraform v1.9.3 on windows_amd64

Affected Resource(s)

google_cloudfunctions_function

Terraform Configuration

# Look up the project (its number is used for the default compute service account)
data "google_project" "default" {
  project_id = var.project_id
}

# Create a Service Account for cloud function
resource "google_service_account" "cloud_function_sa" {
  account_id   = "sa-cloud-function-batch-llm"
  project      = var.project_id
  display_name = "Cloud Function Service Account"
}

resource "google_service_account_iam_member" "cloud_function_sa_user" {
  service_account_id = google_service_account.cloud_function_sa.name
  role               = "roles/iam.serviceAccountAdmin"
  member             = "serviceAccount:${data.google_project.default.number}-compute@developer.gserviceaccount.com"
  depends_on = [
    google_service_account.cloud_function_sa
  ]
}

resource "google_project_iam_member" "member-role" {
  for_each = toset([
    "roles/eventarc.eventReceiver",
    "roles/run.invoker",
    "roles/storage.objectUser",
  ])
  role    = each.key
  member  = "serviceAccount:${google_service_account.cloud_function_sa.email}"
  project = var.project_id
}
resource "google_storage_bucket_iam_member" "function_input_bucket_reader" {
  bucket = var.trigger_bucket
  role   = "roles/storage.objectViewer"
  member = "serviceAccount:${var.existing_sa}"
}

# Reference to existing Cloud Function Bucket
data "google_storage_bucket" "existing_cloud_function_bucket" {
  name    = var.cloud_function_bucket_name
  project = var.project_id
}

# Archive source code
data "archive_file" "target" {
  type        = "zip"
  source_dir  = var.function_source_dir
  output_path = var.output_path
}

# Upload source code to existing Cloud Function bucket
resource "google_storage_bucket_object" "zip_target" {
  source       = data.archive_file.target.output_path
  content_type = "application/zip"
  name         = "src-${data.archive_file.target.output_md5}.zip"
  bucket       = data.google_storage_bucket.existing_cloud_function_bucket.name

  depends_on = [data.archive_file.target]
}

# Secret Manager secret for Kafka SSL CA certificate
resource "google_secret_manager_secret" "kafka_ssl_ca_cert" {
  secret_id = "${var.prefix}-kafka-ssl-ca-cert"
  project   = var.project_id

  labels = var.labels

  replication {
    auto {}
  }
}

# Secret Manager secret for Kafka SSL user certificate
resource "google_secret_manager_secret" "kafka_ssl_user_cert" {
  secret_id = "${var.prefix}-kafka-ssl-user-cert"
  project   = var.project_id

  labels = var.labels

  replication {
    auto {}
  }
}

# Create the Cloud Function
resource "google_cloudfunctions_function" "cloud_function_target" {
  name                  = var.function_name
  description           = var.function_description
  runtime               = var.function_runtime
  project               = var.project_id
  region                = var.region
  source_archive_bucket = data.google_storage_bucket.existing_cloud_function_bucket.name
  source_archive_object = google_storage_bucket_object.zip_target.name
  entry_point           = var.function_entry_point
  available_memory_mb   = var.function_memory
  service_account_email = var.existing_sa

  labels = var.labels

  environment_variables = {
    GCP_PROJECT = var.project_id
  }

  event_trigger {
    event_type = "google.storage.object.finalize"
    resource   = var.trigger_bucket
  }
  depends_on = [
    google_secret_manager_secret_version.kafka_ssl_ca_cert_data,
    google_secret_manager_secret_version.kafka_ssl_user_cert_data,
    google_secret_manager_secret_version.kafka_ssl_user_key_data,
    google_storage_bucket_object.zip_target
  ]
}

Debug Output

│ Error: Error while updating cloudfunction configuration: googleapi: Error 400: Invalid function service account requested: xxxxx@xxxxx.iam.gserviceaccount.com. Please visit https://cloud.google.com/functions/docs/troubleshooting for in-depth troubleshooting documentation., badRequest
│
│   with module.kafka_ssl_function.google_cloudfunctions_function.cloud_function_target,
│   on modules\gcp_kafka_ssl_secrets_function\main.tf line 94, in resource "google_cloudfunctions_function" "cloud_function_target":
│   94: resource "google_cloudfunctions_function" "cloud_function_target" {

Expected Behavior

No response

Actual Behavior

No response

Steps to reproduce

  1. terraform apply

Important Factoids

No response

References

No response

b/359562162

dominicmarmont commented 1 month ago

I am seeing the exact same behaviour. It also happens if I define my own service account in service_account_email.

dominicmarmont commented 1 month ago

@ayoubhamaoui I have just discovered that if I edit the cloud function in the UI once, I am subsequently able to use terraform without getting this error. (The edit I made was to change the source code zip file through the UI.)

ggtisc commented 1 month ago

Hi @ayoubhamaoui!

You have many variables whose values we can't see, but as far as I can tell the error points to a bad configuration in the google_service_account. You could try this example, and if you still have issues afterwards, please share the complete code with us so we can check the values of your variables.

resource "google_storage_bucket" "bucket_18900" {
  name     = "bucket-18900"
  location = "US"
}

resource "google_storage_bucket_object" "bucket_object_18900" {
  name   = "index18900.zip"
  bucket = google_storage_bucket.bucket_18900.name
  source = "./utils/google_cloud_repository/index.zip"
}

resource "google_service_account" "service_account_18900" {
  account_id = "serviceaccount18900"
}

resource "google_cloudfunctions_function" "cf_18900" {
  name                  = "cloudfunctions-function-18900"
  description           = "something"
  runtime               = "nodejs16"
  source_archive_bucket = google_storage_bucket.bucket_18900.name
  source_archive_object = google_storage_bucket_object.bucket_object_18900.name
  entry_point           = "helloGET"
  available_memory_mb   = 128
  trigger_http          = true

  service_account_email = google_service_account.service_account_18900.email
}

Function (this is just an example because support is just focused on Google Cloud and Terraform, but you could use any language or function according to your requirements):

exports.helloGET = (req, res) => {
    res.status(200).send('Hello world!');
};

simonvanderveldt commented 1 month ago

@ggtisc We're having the same issue. Nothing has changed with the service account used for the Cloud Function, we're just updating the code. So this is not about any actual changes to the service account used for the Cloud Function. It's either something broken in this provider or in the GCP API.

Redacted example apply output:

# google_cloudfunctions_function.test_cloud_function[0] will be updated in-place
~ resource "google_cloudfunctions_function" "test_cloud_function" {
      id                    = "projects/<project>/locations/europe-west1/functions/test-cloud-function"
      name                  = "test-cloud-function"
    ~ source_archive_object = "test-cloud-function/0d98aaabb6c4e71eb18c42091f5b05f66f310aca.zip" -> "test-cloud-function/5990e6a5a089ea74ddd3875f4d771e8feedc0d38.zip"
      # (16 unchanged attributes hidden)

      # (1 unchanged block hidden)
  }

Plan: 0 to add, 1 to change, 0 to destroy.
google_cloudfunctions_function.test_cloud_function[0]: Modifying... [id=projects/<project>/locations/europe-west1/functions/test-cloud-function]
╷
│ Error: Error while updating cloudfunction configuration: googleapi: Error 400: Invalid function service account requested: cf-test-cloud-function@<project>.iam.gserviceaccount.com. Please visit https://cloud.google.com/functions/docs/troubleshooting for in-depth troubleshooting documentation., badRequest
│ 
│   with google_cloudfunctions_function.test_cloud_function[0],
│   on test-cloud-function.tf line 57, in resource "google_cloudfunctions_function" "test_cloud_function":
│   57: resource "google_cloudfunctions_function" "test_cloud_function" {

This is using provider hashicorp/google v4.85.0

Note how the service account hasn't changed, yet the apply fails because of it. The Cloud Function has been running fine with this service account for over a year. We first noticed this yesterday, but since we don't update our Cloud Functions daily, and given that this issue was created 10 days ago, something probably changed on the provider or GCP side 10 days ago or earlier.

ggtisc commented 1 month ago

You need to confirm that your service account conforms to the API rules you can see here. Otherwise, you can create a new service account, or import an existing one, and use it as in the shared example.

radupropellant commented 1 month ago

We're also having this issue. Interestingly, of all the GCP projects that we have, only the more recent ones (projects created in the last 2 months or so) seem to have this problem. Yesterday's apply was fine, but today's was not. Is it possible that some GCP projects are using a newer, incompatible version of the Cloud Functions Google API, while older projects are using an older version?

I also saw a change in the google_cloudfunctions_function resource code from a month ago, which adds a new variable for build_service_account (https://github.com/hashicorp/terraform-provider-google/commit/693ad7be109b998372593216b55abf21084521c8). It's pretty small and doesn't seem to change anything for the service_account_email variable, but I can't help but wonder whether this change is the culprit.

ggtisc commented 1 month ago

Users are not sharing the complete code, but are experiencing similar scenarios. Everything looks good with the terraform registry examples, but it is also possible that this is intermittent. Given the many reports, it may require a deeper code analysis to verify the situation.

radupropellant commented 4 weeks ago

> Users are not sharing the complete code, but are experiencing similar scenarios. Everything looks good with the terraform registry examples, but it is also possible that this is intermittent. Given the many reports, it may require a deeper code analysis to verify the situation.

honestly there's no code to link, at least in our case nothing changed in the terraform code, it's just the function source code that changed -- the apply started failing yesterday, but worked fine the day before

matthewrobertson commented 4 weeks ago

I suspect this was caused by some recent changes to support specifying a build service account. We are working on a fix, but as a workaround could you please try specifying the full resource identifier of the service account? It should be of the following format:

projects/<PROJECT ID>/serviceAccounts/<SERVICE ACCOUNT EMAIL>
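
One way to keep the short e-mail form from slipping back into a configuration is to validate the value before it reaches the resource. The following is only a minimal sketch, not something the provider does: the variable name `service_account_resource_id` is hypothetical, and the regular expression checks the overall shape of the identifier above, not whether the project or account actually exists.

```hcl
# Hypothetical input variable; the name is an assumption made for illustration.
variable "service_account_resource_id" {
  type        = string
  description = "Service account in the form projects/<PROJECT ID>/serviceAccounts/<SERVICE ACCOUNT EMAIL>."

  validation {
    # Shape check only: this does not verify that the project or account exists.
    condition     = can(regex("^projects/[^/]+/serviceAccounts/[^/@]+@[^/@]+$", var.service_account_resource_id))
    error_message = "The value must be of the form projects/<PROJECT ID>/serviceAccounts/<SERVICE ACCOUNT EMAIL>."
  }
}
```

With a check like this, passing a bare e-mail address would fail at plan time instead of surfacing later as a 400 from the API.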

radupropellant commented 3 weeks ago

I got the error to show up by calling the patch API directly: I took the update function details from the GCP logs and used the patch API page. It looks like the issue comes from the build service account parameter, as I was implying above. @matthewrobertson's comment is also tangentially correct, since the error is that the default build service account from the terraform resource is not in the projects/<PROJECT ID>/serviceAccounts/<SERVICE ACCOUNT EMAIL> format.

I've isolated the build service account parameter, instead of posting all the parameters for the whole function, like terraform uses the API: failing API call

And when setting the build service account to the proper format: passing API call

In conclusion, it looks like it's a problem in the GCP Functions API.

The terraform provider code picks up the parameters of the existing function (https://github.com/hashicorp/terraform-provider-google/blob/main/google/services/cloudfunctions/resource_cloudfunctions_function.go#L684), then reuses all of the existing parameters and changes only the updated ones. The fact that the /get method returns the wrong format for the build service account is an issue in the GCP Functions API, and it seems this change was made for recently created GCP projects. get function call

Not sure if the terraform provider could filter only the changing parameters (the same ones from the updateMask), and not have all the parameters from the whole function, but it could be one idea.

Also, the error message from the API is not correct, since it references the service account, not the build service account.

Another point could be that the API should not validate the parameters that are not part of the update mask, but again, it's just an improvement idea, just like the one above in the terraform resource code.

I've also opened a thread with GCP support, and will relay the same information to them. These types of errors will still happen until either they fix the /get or /patch APIs, or the terraform provider resource code doesn't send the properties that don't change.

radupropellant commented 3 weeks ago

> @ayoubhamaoui I have just discovered that if I edit the cloud function in the UI once, I am subsequently able to use terraform without getting this error. (The edit I made was to change the source code zip file through the UI.)

yes, indeed, it looks like editing the function manually sets the build service account with the proper format, and therefore the next terraform calls pick up on the correct format when calling "/get"

ayoubhamaoui commented 3 weeks ago

> > @ayoubhamaoui I have just discovered that if I edit the cloud function in the UI once, I am subsequently able to use terraform without getting this error. (The edit I made was to change the source code zip file through the UI.)
>
> yes, indeed, it looks like editing the function manually sets the build service account with the proper format, and therefore the next terraform calls pick up on the correct format when calling "/get"

@radupropellant @dominicmarmont thanks for your suggestions. In my case, I upgraded the versions to:

terraform {
  required_version = ">= 1.9.5"
  required_providers {
    google = {
      source  = "hashicorp/google"
      version = "~> 5.42.0"
    }
  }
}

  1. Add roles/cloudbuild.builds.builder to my custom service account.
  2. After that, I specified my custom service account in the build_service_account attribute of the google_cloudfunctions_function resource, as follows: build_service_account = "projects/${var.project_id}/serviceAccounts/${data.google_service_account.existing_service_account.email}"

Ref: cloudfunctions_function
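
The two steps above could be wired together roughly as follows. This is a sketch under stated assumptions rather than the exact configuration used: the variable `existing_sa_account_id`, the IAM resource name `build_sa_builder`, and the local `build_service_account` are hypothetical names, while the data source name `existing_service_account` is taken from the expression in step 2.

```hcl
# Already declared in the original module; repeated here only so the sketch is complete.
variable "project_id" {
  type = string
}

# Hypothetical variable: the account ID (the part before "@") of the existing custom service account.
variable "existing_sa_account_id" {
  type = string
}

# Look up the existing custom service account.
data "google_service_account" "existing_service_account" {
  account_id = var.existing_sa_account_id
  project    = var.project_id
}

# Step 1: grant the Cloud Build builder role to the custom service account.
resource "google_project_iam_member" "build_sa_builder" {
  project = var.project_id
  role    = "roles/cloudbuild.builds.builder"
  member  = "serviceAccount:${data.google_service_account.existing_service_account.email}"
}

# Step 2: build the fully qualified identifier for the build service account.
locals {
  build_service_account = "projects/${var.project_id}/serviceAccounts/${data.google_service_account.existing_service_account.email}"
}

# In the existing google_cloudfunctions_function resource, one would then set:
#   build_service_account = local.build_service_account
# and add google_project_iam_member.build_sa_builder to its depends_on list,
# so the role is granted before the function is built.
```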

radupropellant commented 3 weeks ago

if you can do that, you definitely should, until the issue in the API is resolved

in our case, we're using https://github.com/terraform-google-modules/terraform-google-event-function, which as of now doesn't have support for the build service account 😢

radupropellant commented 3 weeks ago

you could also use the default compute service account, but with fully qualified name, i.e. "projects/${var.project_id}/serviceAccounts/${data.google_project.project.number}-compute@developer.gserviceaccount.com"

however, this approach does need a data reference to the project, like

data "google_project" "project" {
  project_id = var.project_id
}
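
Put together, the default compute service account variant might look like the sketch below. The local `default_compute_build_sa` and the output name are hypothetical; the data source and the identifier string are the ones from this comment.

```hcl
# Already declared in the original module; repeated here only so the sketch is complete.
variable "project_id" {
  type = string
}

# Same data reference as above; it exposes the numeric project number.
data "google_project" "project" {
  project_id = var.project_id
}

locals {
  # Hypothetical local name. Fully qualified form of the default compute service account.
  default_compute_build_sa = "projects/${var.project_id}/serviceAccounts/${data.google_project.project.number}-compute@developer.gserviceaccount.com"
}

# Optional: expose the value so it can be inspected with `terraform output`.
output "default_compute_build_sa" {
  value = local.default_compute_build_sa
}

# In the google_cloudfunctions_function resource, one would then set:
#   build_service_account = local.default_compute_build_sa
```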

radupropellant commented 3 weeks ago

I've just talked with the support team from Google, and they said that the build service account for gen 1 shouldn't be optional, and most people are using it with a custom service account (not the default compute one)

also, they will be working on a PR in the https://github.com/terraform-google-modules/terraform-google-event-function module, but that will take a while, and if you're using it, you should clone it locally, or in your own terraform registry, and add the build service account variable