hashicorp / terraform-provider-google

Terraform Provider for Google Cloud Platform
https://registry.terraform.io/providers/hashicorp/google/latest/docs
Mozilla Public License 2.0

google_dataflow_job, specifying region and zone gives 400 from API #6458

Open giimsland opened 4 years ago

giimsland commented 4 years ago

Terraform Version

Terraform v0.12.25

Affected Resource(s)

  * google_dataflow_job

Terraform Configuration Files

  name = "hps2b"
  template_gcs_path = local.dataflow_template
  temp_gcs_location = "${google_storage_bucket.bucket.url}/temp"
  on_delete = "drain"
  project = var.project_id
  region = "europe-west3"
  zone = "europe-west3-a"
  service_account_email = google_service_account.sa.email
  parameters = {
    inputTopic = local.input_topic
    outputDirectory = "${google_storage_bucket.bucket.url}/output"
    outputFilenamePrefix = "output-"
    outputFilenameSuffix = ".txt"
  }
}

Expected Behavior

Dataflow job is created with the region/zone settings applied.

Actual Behavior

400 error from googleapi: Error: googleapi: Error 400: The template parameters are invalid., badRequest

Steps to Reproduce

  1. Create a google_dataflow_job resource with region and zone set.
  2. terraform apply

b/306863723

venkykuberan commented 4 years ago

region is not an attribute of the google_dataflow_job resource. Can you attach the plan output?

venkykuberan commented 4 years ago

@giimsland looks like it's a plan-time validation. You cannot use both; they're mutually exclusive.

giimsland commented 4 years ago

Hi, thanks for looking into it. I can provide logs tomorrow.

According to https://registry.terraform.io/modules/terraform-google-modules/dataflow/google/1.0.0, region is a valid parameter.

However, if you try only zone as a parameter, it will still fail.

I believe zone is no longer valid in Google's API for Dataflow, as it automatically selects the best zone within the chosen region.
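
For reference, a minimal sketch of the original resource with only region set and zone dropped, reusing the identifiers from the config at the top of this issue:

resource "google_dataflow_job" "ps2b" {
  name                  = "hps2b"
  template_gcs_path     = local.dataflow_template
  temp_gcs_location     = "${google_storage_bucket.bucket.url}/temp"
  on_delete             = "drain"
  project               = var.project_id
  region                = "europe-west3"  # no zone set; Dataflow picks a zone within the region
  service_account_email = google_service_account.sa.email
  parameters = {
    inputTopic           = local.input_topic
    outputDirectory      = "${google_storage_bucket.bucket.url}/output"
    outputFilenamePrefix = "output-"
    outputFilenameSuffix = ".txt"
  }
}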

giimsland commented 4 years ago

With only zone provided (europe-west3-a), terraform plan shows that the resource will be created with the correct settings.

Still, the response is:

2020/05/27 09:59:32 [ERROR] module.app: eval: *terraform.EvalApplyPost, err: googleapi: Error 400: The template parameters are invalid., badRequest
2020/05/27 09:59:32 [ERROR] module.app: eval: *terraform.EvalSequence, err: googleapi: Error 400: The template parameters are invalid., badRequest

Error: googleapi: Error 400: The template parameters are invalid., badRequest

venkykuberan commented 4 years ago

Looks like you are getting a template-parameters error, not a region/zone issue.

The following sample config worked fine for me:

resource "google_dataflow_job" "big_data_job" {
  name              = "dataflow-job"
  template_gcs_path = "gs://dataflow-templates/latest/Word_Count"
  temp_gcs_location = "gs://xxx-bucket/tmp"
  parameters = {
    inputFile = "gs://xxx-bucket/test.txt"
    output = "gs://xxx-bucket/output"
  }
  zone = "europe-west3-a" // var.zone
  machine_type = "n1-standard-1"
  max_workers = "0"
  labels = {
    pipeline = "trackit"
  }
}

giimsland commented 4 years ago

Hi, what version are you using? I tried yours, and it's not working:

2020/05/28 13:38:53 [WARN] Provider "registry.terraform.io/-/google" produced an invalid plan for module.app.google_dataflow_job.big_data_job, but we are tolerating it because it is using the legacy plugin SDK.
    The following problems may be the cause of any confusing errors from downstream operations:
      - .on_delete: planned value cty.StringVal("drain") does not match config value cty.NullVal(cty.String)
module.app.google_dataflow_job.big_data_job: Creating...
2020/05/28 13:38:53 [ERROR] module.app: eval: *terraform.EvalApplyPost, err: project: required field is not set
2020/05/28 13:38:53 [ERROR] module.app: eval: *terraform.EvalSequence, err: project: required field is not set

Error: project: required field is not set

  on ../module/main.tf line 77, in resource "google_dataflow_job" "big_data_job":
  77: resource "google_dataflow_job" "big_data_job" {

venkykuberan commented 4 years ago

Same version, 3.22, as yours. I am setting the project through an environment variable. You can try adding project = <your_project_id> in the config.
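
A rough sketch of both options (your_project_id is a placeholder): the provider block sets a default project for all resources, while the resource-level project attribute overrides it for a single job. The GOOGLE_PROJECT environment variable works as an alternative to the provider-level setting.

provider "google" {
  # provider-wide default; setting GOOGLE_PROJECT in the environment has the same effect
  project = "your_project_id"
  region  = "europe-west3"
}

resource "google_dataflow_job" "big_data_job" {
  # ... other arguments as in the sample above ...
  project = "your_project_id"  # per-resource override, needed if no provider or environment default is set
}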

giimsland commented 4 years ago

Yes, that worked. However, my original config is still not working. There is probably a parameter issue, but I cannot figure out which parameter it is. Terraform accepts the config. Config:

resource "google_dataflow_job" "ps2b" {
  name = "dataflow-job"
  template_gcs_path = "gs://dataflow-templates/latest/Cloud_PubSub_to_Avro"
  temp_gcs_location = "gs://xxx-bucket/temp"
  on_delete = "drain"
  project = var.project_id
  zone = "europe-west3-a"
  service_account_email = google_service_account.sa.email
  parameters = {
    inputTopic = "someTopic"
    outputDirectory = "gs://xxx-bucket/output"
    outputFilenamePrefix = "output-"
    outputFilenameSuffix = ".txt"
  }
}

Output from terraform plan:

  + resource "google_dataflow_job" "ps2b" {
      + id                    = (known after apply)
      + job_id                = (known after apply)
      + name                  = "dataflow_job"
      + on_delete             = "drain"
      + parameters            = {
          + "inputTopic"           = "someTopic"
          + "outputDirectory"      = "gs://xxx_bucket/output"
          + "outputFilenamePrefix" = "output-"
          + "outputFilenameSuffix" = ".txt"
        }
      + project               = "myProject"
      + service_account_email = "someServiceUser"
      + state                 = (known after apply)
      + temp_gcs_location     = "gs://xxx_bucket/temp"
      + template_gcs_path     = "gs://dataflow-templates/latest/Cloud_PubSub_to_Avro"
      + type                  = (known after apply)
      + zone                  = "europe-west3-a"
    }

Output from terraform apply:

Error: googleapi: Error 400: The template parameters are invalid., badRequest

venkykuberan commented 4 years ago

Can you attach your debug log here?

voycey commented 2 years ago

Did this ever get solved, @giimsland? I am having the same problem and just can't see what the actual errors are; plan works fine, but I get the same 400 error.

giimsland commented 2 years ago

No, @voycey. We use the gcloud CLI for these kinds of things now.

voycey commented 2 years ago

I sorted it, for anyone else running into this problem in the future; the following worked for me. The main cause was the really bad documentation around what the parameters are supposed to be. I used the UI to understand which parameters were required and then checked against the code in the Java files for Beam to make sure it didn't require any others. Key things were that inputTopic / inputSubscription requires an ID rather than a name, the table spec needs to include the project name, and the machine/node types should be included:

resource "google_dataflow_job" "pubsub_to_bq" {
  name              = "${var.app_name}-${var.stage}"
  template_gcs_path = "gs://dataflow-templates-australia-southeast1/latest/PubSub_to_BigQuery"
  temp_gcs_location = "gs://tymlez-${var.app_name}-${var.stage}-temp"
  enable_streaming_engine = true
  project           = var.gcp_project_id

  parameters = {
      inputTopic            = google_pubsub_topic.data.id
      outputTableSpec       = "${var.gcp_project_id}:${var.app_name}_${var.stage}.data"
      outputDeadletterTable = "${var.gcp_project_id}:${var.app_name}_${var.stage}.data"
  }
  on_delete = "cancel"

  zone = "australia-southeast1-a" // var.zone
  machine_type = "n1-standard-1"
  max_workers = "0"

  depends_on = [
    google_project_service.dataflow-service,
    google_project_service.pubsub-service,
    google_pubsub_subscription.data-sub,
    google_storage_bucket.main_bucket,
    google_storage_bucket.temp_bucket,
  ]

}
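
As a side note on the ID-versus-name point above: for a google_pubsub_topic, the name attribute is just the short topic name, while id has the form projects/<project>/topics/<name>, which matches voycey's note that the template expects an ID rather than a name. Illustrative values (not taken from this issue):

# google_pubsub_topic.data.name -> "data"
# google_pubsub_topic.data.id   -> "projects/my-project/topics/data"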