hashicorp / terraform-provider-google

Terraform Provider for Google Cloud Platform
https://registry.terraform.io/providers/hashicorp/google/latest/docs
Mozilla Public License 2.0

Add google_cloud_run_revision and google_cloud_run_split_traffic resources #10095

Open courteouselk opened 2 years ago

courteouselk commented 2 years ago

Description

Presently, the only way to create Cloud Run revisions via Terraform is to use the umbrella google_cloud_run_service resource, which in turn automatically creates newer revisions and splits the traffic accordingly. In this model Terraform manages only one Cloud Run revision: the most recent.

However, when we want Terraform to manage more than one Cloud Run revision (for example, to perform a canary deployment of a new container version), there is no way to do so. Instead, we have to create the canary revision outside of Terraform, e.g. via a gcloud command, and pass that revision's name to the corresponding traffic block of the base google_cloud_run_service resource.
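For illustration, the workaround described above might look roughly like the following. This is a minimal sketch adapted from the traffic split example in the provider docs. The revision names are assumptions (they follow the `<service>-<suffix>` pattern used in that example), and the canary revision is assumed to have been created outside of Terraform before the apply.

```hcl
resource "google_cloud_run_service" "default" {
  name     = "cloudrun-srv"
  location = "us-central1"

  template {
    metadata {
      # The one revision that Terraform itself creates and manages.
      name = "cloudrun-srv-default"
    }

    spec {
      containers {
        image = "us-docker.pkg.dev/cloudrun/container/hello:latest"
      }
    }
  }

  traffic {
    revision_name = "cloudrun-srv-default"
    percent       = 95
  }

  traffic {
    # Assumption: this revision was created outside of Terraform (e.g. via
    # gcloud) and already exists when `terraform apply` runs.
    revision_name = "cloudrun-srv-canary"
    percent       = 5
  }
}
```

Terraform only references the canary revision by name here. It does not track the canary's image or settings, which is the gap this issue is about.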

This issue requests a fully Terraform-based approach to the above scenario.

New or Affected Resource(s)

Potential Terraform Configuration

Building on the Cloud Run traffic split example from the docs, a potential configuration might look like this:

resource "google_cloud_run_service" "default" {
  name     = "cloudrun-srv"
  location = "us-central1"

  template {
    metadata {
      name = "default"
    }

    spec {
      containers {
        image = "us-docker.pkg.dev/cloudrun/container/hello:latest"
      }
    }
  }
}

resource "google_cloud_run_service_revision" "canary" {
  service = google_cloud_run_service.default.id

  metadata {
    name = "canary"
  }

  spec {
    containers {
      image = "us-docker.pkg.dev/cloudrun/container/hello:canary"
    }
  }
}

resource "google_cloud_run_split_traffic" "default" {
  service = google_cloud_run_service.default.id

  traffic {
    revision_name = "default"
    percent       = 95
  }

  traffic {
    revision_name = "canary"
    percent       = 5
  }
}

References

b/271916488

tclift commented 1 year ago

I like the idea of the revision and traffic resources.

Regarding the service resource, I wish it didn't create revisions at all. Ideally, this would allow us to define base "template" parameters via Terraform, like service name, location, service account, scaling parameters, ports, and common environment variables. Updating this resource would not affect any revisions.

The revision resource would then extend from the template defined in the service, e.g. with a specific image (tag), more environment variables, or other overrides.

In my case I would use Terraform to define the service (without specifying an image), and a separate app deployment pipeline to deploy the revision (with a specific tagged image). So changing the service resource (e.g. changing max instances) should not create/affect existing revisions.
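To make the idea concrete, here is a purely hypothetical sketch of that split. Neither an image-less google_cloud_run_service nor a google_cloud_run_service_revision resource exists in the provider today. The resource name is borrowed from the proposal at the top of this issue, and all values (service account, env var names, image tag) are made up for illustration.

```hcl
# HYPOTHETICAL: in this model the service only holds base "template" parameters
# and applying it would not create or change any revision.
resource "google_cloud_run_service" "default" {
  name     = "cloudrun-srv"
  location = "us-central1"

  template {
    spec {
      # Illustrative value, not a real account.
      service_account_name = "runner@my-project.iam.gserviceaccount.com"

      containers {
        # No image here: only settings shared by every revision.
        env {
          name  = "LOG_LEVEL"
          value = "info"
        }
      }
    }
  }
}

# HYPOTHETICAL resource: a revision that extends the service's base template
# with a specific image and extra environment variables.
resource "google_cloud_run_service_revision" "app" {
  service = google_cloud_run_service.default.id

  metadata {
    name = "app-v1"
  }

  spec {
    containers {
      image = "us-docker.pkg.dev/cloudrun/container/hello:latest"

      env {
        name  = "FEATURE_FLAG"
        value = "on"
      }
    }
  }
}
```

In the pipeline-driven variant described above, the second block would not live in Terraform at all. The deployment pipeline would create the revision, and Terraform would own only the first block.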

markesha commented 1 year ago

It's actually possible to create and manage multiple revisions under the google_cloud_run_service resource by using a dynamic traffic block, e.g.:

  traffic {
    # live serves 100% by default. If canary is enabled, this traffic block controls canary
    percent       = var.canary_enabled ? local.canary_percent : 100
    # revision is named live by default. When canary is enabled, a new revision named canary is deployed
    revision_name = var.canary_enabled ? local.rev_name_canary : local.rev_name_live
  }

  dynamic "traffic" {
    # if canary is enabled, add another traffic block
    for_each = var.canary_enabled ? ["canary"] : []
    content {
      # current live's traffic is now controlled here
      percent       = var.canary_enabled ? 100 - var.canary_percent : 0
      revision_name = var.canary_enabled ? local.rev_name_live : local.rev_name_canary
    }
  }

See https://medium.com/@vladislavmarkevich/cloudrun-canary-releases-with-terraform-b63245e31a88
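The snippet above relies on a few variables and locals that are not shown. A minimal sketch of what they might look like follows. The types, defaults, and revision names are assumptions, not taken from the linked article, and `local.canary_percent` simply mirrors `var.canary_percent` because the snippet refers to both.

```hcl
variable "canary_enabled" {
  description = "Whether a canary revision should receive part of the traffic."
  type        = bool
  default     = false
}

variable "canary_percent" {
  description = "Share of traffic (0-100) routed to the canary revision."
  type        = number
  default     = 5

  validation {
    condition     = var.canary_percent >= 0 && var.canary_percent <= 100
    error_message = "The canary_percent value must be between 0 and 100."
  }
}

locals {
  # Assumption: the snippet uses both var.canary_percent and
  # local.canary_percent, so the local is defined as a plain alias.
  canary_percent = var.canary_percent

  # Assumed revision names, following the <service>-<suffix> pattern.
  rev_name_live   = "cloudrun-srv-live"
  rev_name_canary = "cloudrun-srv-canary"
}
```

With these defaults, setting `canary_enabled = true` would send 5% of traffic to the canary revision and the remaining 95% to the live one.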

tclift commented 1 year ago

Here's my approach for deploying the service from Terraform and the revisions externally. That is, future Terraform applies won't interfere with revisions added in between, and creating revisions (e.g., with gcloud run services update) won't interfere with the service definition.

Among other things, this allows env vars to be defined on the service (affecting all revisions) or on a revision (extending the service's vars).

resource "google_cloud_run_service" "default" {
  name                       = var.name
  location                   = var.location
  # (Needed for separating Service from Revision)
  # This is effectively an ignore on revision names (`template.metadata.name`). Without it, externally-deployed
  # revisions will cause a failure to apply.
  # Ref issue: https://github.com/hashicorp/terraform-provider-google/issues/5898
  autogenerate_revision_name = true

  template {
    metadata {
      annotations = {
        "autoscaling.knative.dev/minScale" = var.instances.min
        "autoscaling.knative.dev/maxScale" = var.instances.max
      }
    }

    spec {
      container_concurrency = var.request_concurrency
      timeout_seconds       = var.request_timeout_seconds
      service_account_name  = var.service_account_email

      containers {
        # (Needed for separating Service from Revision)
        # The GCP Console "wizard" for creating a service uses this image for the initial revision.
        # Ref issue: https://github.com/hashicorp/terraform-provider-google/issues/10095
        image = "gcr.io/cloudrun/placeholder"

        resources {
          limits = {
            cpu    = var.cpu
            memory = var.memory
          }
        }

        # https://cloud.google.com/run/docs/configuring/http2
        ports {
          name           = "h2c"
          container_port = 8080
        }

        dynamic "env" {
          for_each = var.env

          content {
            name  = env.key
            value = env.value
          }
        }
      }
    }
  }

  # potentially use @markesha's suggestion of a dynamic `traffic` block?
  traffic {
    percent         = 100
    latest_revision = true
  }

  depends_on = [google_project_service.run]

  lifecycle {
    # (Needed for separating Service from Revision)
    # The following properties were updated by `gcloud run services update` after this resource had been applied, and
    # therefore need to be ignored to avoid reverting them. YMMV. Test & add your own.
    ignore_changes = [
      template[0].metadata[0].annotations["client.knative.dev/user-image"],
      template[0].metadata[0].annotations["run.googleapis.com/client-name"],
      template[0].metadata[0].annotations["run.googleapis.com/client-version"],
      template[0].metadata[0].labels,
      template[0].spec[0].containers[0].image,
    ]
  }
}
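
The configuration above references several input variables and a google_project_service.run resource that are not shown. A sketch of supporting declarations follows. The types and defaults are assumptions inferred from how each value is used, so adjust them to your own setup.

```hcl
variable "name" {
  type = string
}

variable "location" {
  type = string
}

# Used for the autoscaling.knative.dev/minScale and maxScale annotations.
variable "instances" {
  type = object({
    min = number
    max = number
  })
}

variable "request_concurrency" {
  type = number
}

variable "request_timeout_seconds" {
  type = number
}

variable "service_account_email" {
  type = string
}

# Assumed defaults, for illustration only.
variable "cpu" {
  type    = string
  default = "1000m"
}

variable "memory" {
  type    = string
  default = "512Mi"
}

# Service-level environment variables, consumed by the dynamic "env" block.
variable "env" {
  type    = map(string)
  default = {}
}

# Enables the Cloud Run API; this is the target of the depends_on above.
resource "google_project_service" "run" {
  service = "run.googleapis.com"
}
```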