hashicorp / terraform-provider-google

Terraform Provider for Google Cloud Platform
https://registry.terraform.io/providers/hashicorp/google/latest/docs
Mozilla Public License 2.0

Updating Cloud Functions' source code requires changing zip path #1938

Open geekflyer opened 5 years ago

geekflyer commented 5 years ago

Hi there,

I'm trying to create a cloud function via terraform (which in this particular example forwards error logs to slack, but that's irrelevant for the issue).

The problem is that it seems impossible to update a cloud function's source code after its initial deployment via terraform.

As an example, below is my HCL config. As part of that code I'm packaging a Node.js app located under ./app into a zip file, uploading it to GCS, and then using that object as the source for the cloud function. Whenever I change something in the source code under ./app, terraform rezips and uploads the new archive to GCS. However, the corresponding cloud function does not reload the source code from GCS, because none of the input params of the cloud function resource have changed. The AWS Lambda resource uses an attribute source_code_hash to trigger updates to the function resource when the source code has changed.

The google_cloudfunctions_function resource doesn't have any attribute like that, so I cannot trigger an update to the resource. I tried embedding the hash into the description or labels of the resource to trigger an update, and while this creates a new version, that new version still doesn't reload the new source code. IMHO that makes the current terraform cloud function resource useless in practice: it can only be used to create an initial cloud function, not to update it.

Expectation:

Please add an attribute source_code_hash or similar to the cloud function resource to allow updates of the source code via terraform.
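
Something along these lines (a hypothetical attribute mirroring the AWS provider; the field below does not exist today):

resource "google_cloudfunctions_function" "post-error-logs-to-slack" {
  # ...
  # hypothetical attribute: redeploy whenever the packaged source changes
  source_code_hash = "${data.archive_file.function_dist.output_base64sha256}"
}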

Terraform Version

Terraform v0.11.7
+ provider.archive v1.1.0
+ provider.google v1.16.2

Affected Resource(s)

google_cloudfunctions_function

Terraform Configuration Files

main.tf

locals {
  error_log_filter = <<EOF
    resource.type="k8s_container"
    resource.labels.cluster_name="api-cluster-prod-b"
    severity>=ERROR
    EOF

  function_name    = "post-error-logs-to-slack"
  functions_region = "us-central1"
}

terraform {
  backend "gcs" {}
}

provider "google" {
  project = "${var.gcp_project}"
  region  = "${var.gcp_region}"
  version = "~> 1.16.2"
}

provider "archive" {
  version = "~> 1.1.0"
}

resource "google_pubsub_topic" "error_logs" {
  name = "error-logs"
}

resource "google_logging_project_sink" "error_logs_sink" {
  name        = "error-logs-sink"
  destination = "pubsub.googleapis.com/projects/${var.gcp_project}/topics/${google_pubsub_topic.error_logs.name}"
  filter      = "${local.error_log_filter}"
}

resource "google_storage_bucket" "functions_store" {
  name     = "solvvy-prod-functions"
  location = "${local.functions_region}"
}

data "archive_file" "function_dist" {
  type        = "zip"
  source_dir  = "./app"
  output_path = "dist/${local.function_name}.zip"
}

resource "google_storage_bucket_object" "error_logs_to_slack_function_code" {
  name   = "${local.function_name}.zip"
  bucket = "${google_storage_bucket.functions_store.name}"
  source = "${data.archive_file.function_dist.output_path}"
}

resource "google_cloudfunctions_function" "post-error-logs-to-slack" {
  name                  = "post-error-logs-to-slack"
  description           = "[Managed by Terraform] This function gets triggered by new messages in the ${google_pubsub_topic.error_logs.name} pubsub topic"
  available_memory_mb   = 128
  source_archive_bucket = "${google_storage_bucket_object.error_logs_to_slack_function_code.bucket}"
  source_archive_object = "${google_storage_bucket_object.error_logs_to_slack_function_code.name}"
  entry_point           = "sendErrorToSlack"
  trigger_topic         = "${google_pubsub_topic.error_logs.name}"
  region                = "${local.functions_region}"
}

b/249753001

Chupaka commented 5 years ago

1.17.0 (August 22, 2018) IMPROVEMENTS: cloudfunctions: Add support for updating function code in place (#1781)

So it looks like you just need to update the google provider :)

geekflyer commented 5 years ago

Nope, #1781 doesn't really solve my issue. With #1781 we only gain the ability to update the function's source if I also change the path of the zip archive in GCS at the same time. In my case I only want to change the content of the .zip blob in place in GCS (which currently won't trigger an update of the function), not update its location/path all the time.

I can change the blob path dynamically by appending the content hash to its path, but that's IMHO just an ugly workaround :).

locals {
  // we append the app hash to the filename as a temporary workaround for https://github.com/terraform-providers/terraform-provider-google/issues/1938
  filename_on_gcs = "${local.function_name}-${lower(replace(base64encode(data.archive_file.function_dist.output_md5), "=", ""))}.zip"
}
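
For completeness, a sketch of how that local would then be consumed (using the resource names from my config above; the function resource picks up the new object name automatically since it interpolates the object's name):

resource "google_storage_bucket_object" "error_logs_to_slack_function_code" {
  name   = "${local.filename_on_gcs}"
  bucket = "${google_storage_bucket.functions_store.name}"
  source = "${data.archive_file.function_dist.output_path}"
}
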
paddycarver commented 5 years ago

Sadly, I think that may be the only option we have, at the moment. :/ I don't see anything in the Cloud Functions API that suggests you can trigger a new deployment without changing the path of the deployed code. In theory, we could probably add something on the Terraform side to make this nicer, like a source_code_hash field, but in reality all that would do is include the source_code_hash in the file path behind the scenes, instead of making you do it. :/

I've opened an issue upstream about this.

pmoriarty commented 5 years ago

The REST API (https://cloud.google.com/functions/docs/reference/rest/v1/projects.locations.functions) seems to support versioning. Repeated attempts to deploy increment the returned versionId.

Would it be possible to detect that the bucket object is being updated and re-deploy cloud functions which depend on it?

paddycarver commented 5 years ago

Not with Terraform, unfortunately--the storage object resource's changes aren't visible to the function resource unless the field that changes is interpolated.

A simple workaround that should work is changing name = "${local.function_name}.zip" in the source archive object to name = "${local.function_name}.${data.archive_file.function_dist.output_base64sha256}.zip" or name = "${local.function_name}.${data.archive_file.function_dist.output_md5}.zip". That includes the SHA-256 sum (or MD5 sum) of the zip file's contents in the object's filename, which makes Cloud Functions notice that the source has changed and deploy a new version.
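
In config form, that's roughly (a sketch against the config from the original report):

resource "google_storage_bucket_object" "error_logs_to_slack_function_code" {
  # embedding the archive hash in the object name is what makes Cloud Functions redeploy
  name   = "${local.function_name}.${data.archive_file.function_dist.output_base64sha256}.zip"
  bucket = "${google_storage_bucket.functions_store.name}"
  source = "${data.archive_file.function_dist.output_path}"
}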

[edit] Which was already pointed out. Oops. Sorry about that.

Kentzo commented 5 years ago

This workaround is problematic because a function's name cannot exceed 48 characters.

quinn commented 5 years ago

FWIW, I would like to thumbs-down this enhancement. I don't think this is a reasonable feature request, given it's not clear what heuristic would be used to detect a change. Leaving it up to the user is probably more reliable.

timwsuqld commented 5 years ago

I believe Serverless puts each zip file in a folder with the upload timestamp, then updates the function's source URL to trigger the redeploy. @quinn I don't see why this is an unreasonable request, as long as it's well documented how it works. Currently, a user spends a while trying to get a function to update after initially deploying it, before finding this issue and working out that they need to change the file URL for each deploy. Making this automatic (source hash) makes a lot of sense.

quinn commented 5 years ago

@timwsuqld I see what you mean, but I think that is just part of the learning curve of cloud functions. You have to do the exact same thing with lambda + cloudformation (except cloudformation does not provide a way to zip and fingerprint automatically the way that terraform does).

Here are the issues with this that I see:

  1. There's no easy way to detect changes. What if an imported file changes, or the version of a dependency? It's not just the entrypoint file that could change. Timestamps could work too, but that has the downside of always releasing a new version. Ideally, the config would be declarative and idempotent.
  2. The zip data resource doesn't know it's targeting a cloud function, so there doesn't seem to be a good place to implement this functionality. All the pieces involved need to be as decoupled as possible, and changing the path that gets used in the bucket based on the resource that is using it implies tight coupling.
ASRagab commented 4 years ago

Isn't a source_code_hash attribute on the resource exactly how the aws_lambda_function Terraform resource works? While not perfect, it feels like an 85% solution to a pretty common issue (code changes). Why should the name of the deploy artifact have to change?
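
For reference, this is roughly how it looks on the AWS side (a minimal sketch; the names, runtime, and role are illustrative):

resource "aws_lambda_function" "example" {
  function_name    = "example"
  filename         = "dist/function.zip"
  # redeploys whenever the zip contents change, without renaming the artifact
  source_code_hash = filebase64sha256("dist/function.zip")
  handler          = "index.handler"
  runtime          = "nodejs12.x"
  role             = aws_iam_role.example.arn
}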

bradenwright commented 4 years ago

Any update on this? The issue has been open for over a year. The workaround gets it going, but it's hacky.

LudovicTOURMAN commented 4 years ago

June 2020, and I'm still running into this issue :roll_eyes:

I hack around it by appending the md5 of the sources to the bucket object name, such as gs://my-bucket/cloud-functions/my-function.zip#a1b2c3d4e5f6.

Here is an example:

data "archive_file" "function_archive" {
  type        = "zip"
  source_dir  = var.source_directory
  output_path = "${path.root}/${var.bucket_archive_filepath}"
}

resource "google_storage_bucket_object" "archive" {
  name                = format("%s#%s", var.bucket_archive_filepath, data.archive_file.function_archive.output_md5)
  bucket              = var.bucket_name
  source              = data.archive_file.function_archive.output_path
  content_disposition = "attachment"
  content_encoding    = "gzip"
  content_type        = "application/zip"
}

resource "google_cloudfunctions_function" "function" {
  name = var.cloud_function_name

  source_archive_bucket = google_storage_bucket_object.archive.bucket
  source_archive_object = google_storage_bucket_object.archive.name

  available_memory_mb   = var.cloud_function_memory
  trigger_http          = var.cloud_function_trigger_http
  entry_point           = var.cloud_function_entry_point
  service_account_email = var.cloud_function_service_account_email
  runtime               = var.cloud_function_runtime
  timeout               = var.cloud_function_timeout
}

Hope that the google_cloudfunctions_function resource will evolve to check for SHA/MD5 changes by itself and re-deploy the new code :pray:

dinvlad commented 4 years ago

Yeah, we do that too now, and we even turned on a lifecycle policy for the archive bucket so the old archives get deleted after 1 day!
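
Something along these lines (a sketch; bucket name, location, and the 1-day age are illustrative):

resource "google_storage_bucket" "functions_store" {
  name     = "example-functions-store"
  location = "us-central1"

  # clean up old archives automatically
  lifecycle_rule {
    condition {
      age = 1 # days
    }
    action {
      type = "Delete"
    }
  }
}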

dinvlad commented 4 years ago

The only caveat here is that some CIs (e.g. Google Cloud Build) mess up permissions on the files to be archived, so one may need to fix them before running this (we do that via an external script inside the TF template). Otherwise, the hash is not reproducible between a local machine and the CI environment.

abdulloh-abid commented 4 years ago

I tried terraform's timestamp() instead of base64sha256 and it worked fine, but this is not the right way; HashiCorp should come up with something like source_code_hash for Cloud Functions too.

locals {
  timestamp   = formatdate("YYMMDDhhmmss", timestamp())
  func_name  = "myFunc"
}

data "archive_file" "function_archive" {
  type         = "zip"
  source_dir   = "path/to/source-folder"
  output_path  = "${local.func_name}.zip"
}

resource "google_storage_bucket_object" "archive" {
  name   = "${local.func_name}_${local.timestamp}.zip"
  bucket = var.bucket_name
  source = data.archive_file.function_archive.output_path
}

resource "google_cloudfunctions_function" "function" {
  name        = "${local.func_name}"
  description = "My function"
  runtime     = "nodejs10"
  available_memory_mb   = 128
  source_archive_bucket = var.bucket_name
  source_archive_object = google_storage_bucket_object.archive.name
  trigger_http          = true
  timeout               = 60
  entry_point           = "helloGET"
  labels = {
    my-label = "my-label-value"
  }
  environment_variables = {
    MY_ENV_VAR = "my-env-var-value"
  }
}
dinvlad commented 4 years ago

@p4309027 I think data.archive_file.function_archive.output_md5 actually solves this problem in this case.

simov commented 3 years ago

I think another way of solving this is by using the random provider with a keeper:

resource "random_string" "name" {
  length = 8
  special = false
  upper = false
  keepers = {
    md5 = filemd5(var.package)
  }
}

resource "google_storage_bucket" "bucket" {
  name     = var.bucket
  location = var.region
}

resource "google_storage_bucket_object" "package" {
  name   = "${var.lambda}-${random_string.name.result}.zip"
  bucket = google_storage_bucket.bucket.name
  source = var.package
}
Tony-Proum commented 3 years ago

Hi, I use this little workaround

resource "google_cloudfunctions_function" "my-function" {
  name = "my-function-${regex("(?:[a-zA-Z](?:[-_a-zA-Z0-9]{0,61}[a-zA-Z0-9])?)",
google_storage_bucket_object.my_bucket.md5hash)}"
  ...
}

As changing the name of the cloud function forces the creation of a new version, I just injected the md5 hash at the end of its name. The regex only keeps authorized characters. (If the name gets too long, I also suggest using something like the substr function to keep only 8 or 10 characters of the md5; see the sketch below.)
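
For example (a sketch; the 8-character length is arbitrary):

resource "google_cloudfunctions_function" "my-function" {
  # keep only alphanumeric characters from the md5 hash, then truncate to 8 characters
  name = "my-function-${substr(replace(google_storage_bucket_object.my_bucket.md5hash, "/[^a-zA-Z0-9]/", ""), 0, 8)}"
  ...
}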

rileykarson commented 3 years ago

So, a stray thought on my part on how we might (emphasis might- I've got, like, 30% confidence here) be able to support this without hacking around with the function name or the random provider.

First, the object replaces itself when the file on disk changes:

$ terraform apply
google_storage_bucket.bucket: Refreshing state... [id=dasdasdasdsadasdsadasdasdsadsadasdas]
google_storage_bucket_object.archive: Refreshing state... [id=dasdasdasdsadasdsadasdasdsadsadasdas-index.zip]
google_cloudfunctions_function.function: Refreshing state... [id=projects/graphite-test-rileykarson/locations/us-central1/functions/myfunc2]

An execution plan has been generated and is shown below.
Resource actions are indicated with the following symbols:
-/+ destroy and then create replacement

Terraform will perform the following actions:

  # google_storage_bucket_object.archive must be replaced
-/+ resource "google_storage_bucket_object" "archive" {
        bucket         = "dasdasdasdsadasdsadasdasdsadsadasdas"
      ~ content_type   = "application/zip" -> (known after apply)
      ~ crc32c         = "/XMEsw==" -> (known after apply)
      ~ detect_md5hash = "pQQSXDmQAX3LZ8asp48hKg==" -> "different hash" # forces replacement
      ~ id             = "dasdasdasdsadasdsadasdasdsadsadasdas-index.zip" -> (known after apply)
      ~ md5hash        = "pQQSXDmQAX3LZ8asp48hKg==" -> (known after apply)
      ~ media_link     = "https://storage.googleapis.com/download/storage/v1/b/dasdasdasdsadasdsadasdasdsadsadasdas/o/index.zip?generation=1614880679433188&alt=media" -> (known after apply)
      - metadata       = {} -> null
        name           = "index.zip"
      ~ output_name    = "index.zip" -> (known after apply)
      ~ self_link      = "https://www.googleapis.com/storage/v1/b/dasdasdasdsadasdsadasdasdsadsadasdas/o/index.zip" -> (known after apply)
        source         = "index.js.zip"
      ~ storage_class  = "STANDARD" -> (known after apply)
    }

Plan: 1 to add, 0 to change, 1 to destroy.

Changes to Outputs:
  ~ md5 = "pQQSXDmQAX3LZ8asp48hKg==" -> (known after apply)

Second, we could consider adding a keeper field on the Cloud Function. With a field like object_md5_keeper, users would interpolate on the md5 value of the object in addition to the object name. That will (well, should) be updated when the underlying object changes. That keeper field would trigger an update on the resource that wouldn't have otherwise happened (as the name of the file didn't change, Terraform doesn't see an update). As long as Terraform processes the object first and the function second, it might actually work. It's not a guarantee that's the processing order though, or that I've got my understanding of how Terraform will handle this case all the way correct. I've been surprised by cases like https://github.com/hashicorp/terraform-plugin-sdk/issues/122 in the past.
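
In config terms, the idea would look something like this (object_md5_keeper is the hypothetical field from this comment; it does not exist in the provider today):

resource "google_cloudfunctions_function" "function" {
  name                  = "myfunc2"
  source_archive_bucket = google_storage_bucket.bucket.name
  source_archive_object = google_storage_bucket_object.archive.name
  # ...

  # hypothetical keeper: a change in the object's md5 would force an update
  # even though the object name itself stays the same
  object_md5_keeper = google_storage_bucket_object.archive.md5hash
}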

bgmonroe commented 3 years ago

If nothing else (or until a solution is implemented), this significant usage caveat should be documented in the google_cloudfunctions_function resource page.

jamiet-msm commented 2 years ago

@p4309027 I think data.archive_file.function_archive.output_md5 actually solves this problem in this case.

I agree. We do this and it works fine. I wouldn't call it a hack.

leochoo commented 2 years ago

Still no update on this?

jeffbryner commented 2 years ago

For those coming here via a search, it's worth noting that changing the name of the function has the drawback of also changing the HTTP trigger URL for a function triggered via HTTP calls. That's troublesome for Slack bots, API endpoints, etc.

You can trigger a redeploy of the source by using the same dynamic md5 hash in the zip file name, like so:

# source code zip file to send to the cloud function
data "archive_file" "source_zip" {
  type        = "zip"
  source_dir  = "${path.root}/source/"
  output_path = "${path.root}/function.zip"
}

# storage bucket for our code/zip file
resource "google_storage_bucket" "function_bucket" {

  project                     = google_project.target.project_id
  name                        = local.function_bucket_name
  location                    = var.default_region
  uniform_bucket_level_access = true
  force_destroy               = true
  versioning {
    enabled = true
  }
}

# upload zipped code to the bucket
resource "google_storage_bucket_object" "function_zip" {
  name   = format("%s-%s.zip", local.function_name, data.archive_file.source_zip.output_md5)
  bucket = google_storage_bucket.function_bucket.name
  source = "${path.root}/function.zip"
}

Not sure if it matters from the standpoint of redeploying the function itself, but I also changed the description rather than the name of the function to match the hash just to help orient the casual observer to where the source might be.

resource "google_cloudfunctions_function" "project_function" {
  project               = google_project.target.project_id
  name                  = local.function_name
  description           = format("%s-%s", local.function_name, data.archive_file.source_zip.output_md5)
markoilijoski commented 2 years ago

This workaround is problematic because a function's name cannot exceed 48 characters.

You can use the substr function to shorten the hash if you exceed the character limit.

example:


resource "google_storage_bucket_object" "error_logs_to_slack_function_code" {
  name   = substr("${data.archive_file.function_dist.output_sha}", 0, 4)
  bucket = "${google_storage_bucket.functions_store.name}"
  source = "${data.archive_file.function_dist.output_path}"
}

This keeps only the first 4 characters of the hash string.

rileykarson commented 2 years ago

Note that per https://github.com/hashicorp/terraform-plugin-sdk/issues/122#issuecomment-1114912583, keepers appear to be coming to the lifecycle block in 1.2.0, which means they'll work across all resource types rather than just those that have explicitly implemented them.

vibhoragarwal commented 1 year ago

The workaround mentioned above doesn't serve the purpose if I:

  1. Need a re-deploy to run whenever the code changes.
  2. Have versioning enabled on my deployment bucket, because I need it to track all past deployments of the code.
  3. Cannot use a dynamic object name, because deploying a new object deletes the old one, and versioning only works on objects with the same name.

So this is what I do to work around the issue:

a. Zip the code folder:

data "archive_file" "code_zip" {
  # ...
}

b. Upload it with a fixed name to the deployment bucket:

resource "google_storage_bucket_object" "code_zip_gcs" {
  # we always need the same file name, so that versions can be tracked
  name       = local.file_name
  bucket     = google_storage_bucket.deploy_bucket.name # created earlier with versioning enabled
  source     = data.archive_file.code_zip.output_path
  depends_on = [data.archive_file.code_zip]
}

c. Upload it again with a dynamic name (the overhead of a redundant copy, but this is just a workaround):

resource "google_storage_bucket_object" "code_zip_gcs_latest" {
  # we always need a new deployment, so force it by changing the zip file name; the old object gets deleted
  name = local.latest_file_name # name is dynamically formulated
  # ...
}

d. Deploy the app (here I am deploying to GAE, which has exactly the same issue):

# use the code_zip_gcs_latest resource, which is the latest deployment appended with, say, a timestamp
resource "google_app_engine_standard_app_version" "app" { # resource label is illustrative
  # ...
  deployment {
    zip {
      # since the zip file name (latest_file_name) changes for every run, the deployment is forced every time
      source_url = "https://storage.googleapis.com/${google_storage_bucket.deploy_bucket.name}/${google_storage_bucket_object.code_zip_gcs_latest.name}"
    }
  }
}

GuilhermeFaga commented 1 year ago

Complementing @vibhoragarwal's response above:

a. Zip the code folder:

data "archive_file" "zip" {
  type        = "zip"
  source_dir  = "${var.root_dir}/src/functions/${var.function_name}"
  output_path = "${var.root_dir}/assets/function-${var.function_name}.zip"
}

b. Upload it with a fixed name to the deployment bucket:

resource "google_storage_bucket_object" "source" {
  name   = "functions-${var.function_name}-source.zip"
  bucket = var.artifact_bucket
  source = data.archive_file.zip.output_path
}

c. Upload it again with a dynamic name (the overhead of a redundant copy, but this is just a workaround):

resource "google_storage_bucket_object" "latest_source" {
  name       = "${google_storage_bucket_object.source.name}-${google_storage_bucket_object.source.crc32c}.zip"
  bucket     = var.artifact_bucket
  source     = data.archive_file.zip.output_path
  depends_on = [google_storage_bucket_object.source]
}

d. Deploy the app. For Cloud Functions, use google_storage_bucket_object.latest_source.output_name for the source_archive_object field, with source_archive_bucket being the same bucket used in the previous steps:

resource "google_cloudfunctions_function" "function" {
  ...
  source_archive_bucket = var.artifact_bucket
  source_archive_object = google_storage_bucket_object.latest_source.output_name
  ...
}

Worked perfectly for me. With the double Cloud Storage upload, the Cloud Function isn't deployed every time I run terraform apply, only when its code has changed.

BluetriX commented 1 year ago

With Terraform 1.2 I use replace_triggered_by as a workaround:

resource "google_storage_bucket_object" "sourcecode" {
  name   = "sourcecode.zip"
  bucket = google_storage_bucket.bucket.name
  source = "${path.module}/sources/sourcecode.zip"
}

resource "google_cloudfunctions_function" "function" {
  <...>

  lifecycle {
    replace_triggered_by  = [
      google_storage_bucket_object.sourcecode
    ]
  }
}

So every time sourcecode.zip is uploaded, the function will be replaced.

mysolo commented 1 year ago
labels = {
    deployment-tool = "terraform",
    version-crc32c  = lower(replace(google_storage_bucket_object.source_archive_object.crc32c,"=",""))
 }

The advantage is that we can keep the same archive name in the bucket. As the function is not replaced, this also lets you keep tracking Cloud Function versions.

rayjanoka commented 1 year ago

Nice, @mysolo! But there are more characters besides = that can be rejected by the label; I already ran into + and /.

This seemed to work a bit better:

version-crc32c = lower(replace(google_storage_bucket_object.blocker.crc32c, "/\\W+/", ""))
rk295 commented 1 year ago

An alternative to the labels solution @mysolo mentioned above is to set a build-time environment variable containing the SHA of the zip. The variable can be named anything; it'll still trigger a re-deploy of the function.

  build_config {
    runtime     = "go120"
    entry_point = "FileReceived"
    environment_variables = {
      # Causes a re-deploy of the function when the source changes
      "SOURCE_SHA" = data.archive_file.src.output_sha
    }
    source {
      storage_source {
        bucket = google_storage_bucket.source_bucket.name
        object = google_storage_bucket_object.src.name
      }
    }
  }
bug-mkr commented 2 months ago

The solution @rk295 suggested didn't work for me: the env variable is updated; however, the source code is not.