hashicorp / terraform-provider-google

Terraform Provider for Google Cloud Platform
https://registry.terraform.io/providers/hashicorp/google/latest/docs
Mozilla Public License 2.0

Cannot set Autoscaling Algorithm to `THROUGHPUT_BASED` #17570

Open. Demacr opened this issue 6 months ago

Demacr commented 6 months ago

Terraform Version

Terraform v1.6.3
on darwin_amd64
+ provider registry.terraform.io/carlpett/sops v0.7.2
+ provider registry.terraform.io/cyrilgdn/postgresql v1.18.0
+ provider registry.terraform.io/hashicorp/external v2.3.3
+ provider registry.terraform.io/hashicorp/google v5.16.0
+ provider registry.terraform.io/hashicorp/google-beta v5.20.0
+ provider registry.terraform.io/hashicorp/helm v2.12.1
+ provider registry.terraform.io/hashicorp/kubernetes v2.25.2
+ provider registry.terraform.io/hashicorp/null v3.2.2
+ provider registry.terraform.io/hashicorp/random v3.6.0
+ provider registry.terraform.io/hashicorp/tls v4.0.5
+ provider registry.terraform.io/vancluever/acme v2.20.2

Your version of Terraform is out of date! The latest version
is 1.7.4. You can update by downloading from https://www.terraform.io/downloads.html

Affected Resource(s)

google_dataflow_flex_template_job

Terraform Configuration

resource "google_dataflow_flex_template_job" "main" {
  provider = google-beta

  name                    = "XXX"
  container_spec_gcs_path = "gs://dataflow-templates/latest/flex/PubSub_to_BigQuery_Flex"

  network    = "XXX"
  subnetwork = "regions/us-east1/subnetworks/XXX"
  region     = "us-east1"

  on_delete                    = "drain"
  skip_wait_on_job_termination = true

  # autoscaling_algorithm = 
  max_workers           = 3
  num_workers           = 1
  launcher_machine_type = "c2d-highmem-2"
  machine_type          = "c2d-highmem-2"
  service_account_email = "XXXXXXX"

  parameters = {
    autoscalingAlgorithm                         = "THROUGHPUT_BASED"
    diskSizeGb                                   = "30"
    inputSubscription                            = google_pubsub_subscription.source.id
    javascriptTextTransformFunctionName          = "transform"
    javascriptTextTransformGcsPath               = "gs://XXXXXX"
    javascriptTextTransformReloadIntervalMinutes = "15"
    numberOfWorkerHarnessThreads                 = "2"
    numStorageWriteApiStreams                    = "8"
    outputTableSpec                              = "XXXX"
    storageWriteApiTriggeringFrequencySec        = "15"
    useStorageWriteApi                           = "true"
    useStorageWriteApiAtLeastOnce                = "false"
  }
}

Debug Output

https://gist.github.com/Demacr/a6b30bb83f0105ba0764571d75b44ace

Expected Behavior

A new job is created with the autoscaling algorithm set to THROUGHPUT_BASED.

Actual Behavior

The apply fails with an invalid value error:

β”‚ Error: googleapi: Error 400: Invalid value at 'launch_parameter.environment.autoscaling_algorithm' (type.googleapis.com/google.dataflow.v1beta3.AutoscalingAlgorithm), "THROUGHPUT_BASED"
β”‚ Details:
β”‚ [
β”‚   {
β”‚     "@type": "type.googleapis.com/google.rpc.BadRequest",
β”‚     "fieldViolations": [
β”‚       {
β”‚         "description": "Invalid value at 'launch_parameter.environment.autoscaling_algorithm' (type.googleapis.com/google.dataflow.v1beta3.AutoscalingAlgorithm), \"THROUGHPUT_BASED\"",
β”‚         "field": "launch_parameter.environment.autoscaling_algorithm"
β”‚       }
β”‚     ]
β”‚   }
β”‚ ]

Steps to reproduce

  1. terraform apply

Important Factoids

I found that it only accepts the values listed at this link: https://cloud.google.com/dataflow/docs/reference/rest/v1b3/projects.jobs#Job.AutoscalingAlgorithm

I used the AUTOSCALING_ALGORITHM_BASIC value; it was accepted and the job was created, but no autoscalingAlgorithm record appears in the job parameters.
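
For reference, as far as I can tell that page only defines these values:

AUTOSCALING_ALGORITHM_UNKNOWN
AUTOSCALING_ALGORITHM_NONE
AUTOSCALING_ALGORITHM_BASIC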

References

No response

b/329834219

ggtisc commented 6 months ago

The issue was confirmed after replication, with the error message Error 400: Invalid value at 'launch_parameter.environment.autoscaling_algorithm' (type.googleapis.com/google.dataflow.v1beta3.AutoscalingAlgorithm), "THROUGHPUT_BASED"

damondouglas commented 6 months ago

Good day @Demacr. Thank you for raising this issue. I noticed that

autoscalingAlgorithm = "THROUGHPUT_BASED"

was placed within the parameters block of the google_dataflow_flex_template_job resource. The parameters argument should be used for template-specific parameters only. I've created #17612 to improve the documentation on this point.
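
In other words, the suggested shape is roughly the following (an untested sketch based on the config above; note that the top-level autoscaling_algorithm argument takes the enum values from the Dataflow API rather than the template's parameter values):

resource "google_dataflow_flex_template_job" "main" {
  provider = google-beta

  name                    = "XXX"
  container_spec_gcs_path = "gs://dataflow-templates/latest/flex/PubSub_to_BigQuery_Flex"
  region                  = "us-east1"

  # environment-level setting, outside of parameters;
  # one of the projects.jobs#Job.AutoscalingAlgorithm enum values
  autoscaling_algorithm = "AUTOSCALING_ALGORITHM_BASIC"

  parameters = {
    # template-specific parameters only
    inputSubscription = google_pubsub_subscription.source.id
  }
}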

We have an auto-generated example of the PubSub_to_BigQuery_Flex template referenced in the terraform code above. You may find it here: v2/googlecloud-to-googlecloud/terraform/PubSub_to_BigQuery_Flex/dataflow_job.tf.

Please let us know if you need further help.

Demacr commented 6 months ago

Hi @damondouglas. Oh, I tried it that way as well; you can see the autoscaling argument commented out outside the parameters block in my original bug report message. It shows the same error:

β”‚ Error: googleapi: Error 400: Invalid value at 'launch_parameter.environment.autoscaling_algorithm' (type.googleapis.com/google.dataflow.v1beta3.AutoscalingAlgorithm), "THROUGHPUT_BASED"
β”‚ Details:
β”‚ [
β”‚   {
β”‚     "@type": "type.googleapis.com/google.rpc.BadRequest",
β”‚     "fieldViolations": [
β”‚       {
β”‚         "description": "Invalid value at 'launch_parameter.environment.autoscaling_algorithm' (type.googleapis.com/google.dataflow.v1beta3.AutoscalingAlgorithm), \"THROUGHPUT_BASED\"",
β”‚         "field": "launch_parameter.environment.autoscaling_algorithm"
β”‚       }
β”‚     ]
β”‚   }
β”‚ ]

I've just retried this with provider version 5.20.

damondouglas commented 6 months ago

Good day, @Demacr. The Terraform resource follows projects.jobs#Job.AutoscalingAlgorithm. I believe this is the behavior of Terraform resources in the Google provider. Generally, when I run into issues I look at the API reference to troubleshoot. The other clue that this may have been an incorrect input is the 400 code of the API error, which tells me that THROUGHPUT_BASED was an "Invalid value".

melinath commented 6 months ago

@Demacr is there documentation somewhere indicating that the API should support this value? Or are you requesting that the API add this as an additional algorithm?

Demacr commented 6 months ago

@melinath From this documentation. I used these values for autoscaling when I created the job manually, and it worked.

melinath commented 6 months ago

@Demacr when you say "created manually", do you mean that you were manually able to make an API call that accepted THROUGHPUT_BASED as an argument, or that you were able to locally spin up a pipeline following that document?

Demacr commented 6 months ago

@melinath I mean I originally created the Flex Dataflow job with a gcloud CLI command and then "terraformed" that command.

Demacr commented 6 months ago

For example, here is the real command with anonymized values:

gcloud dataflow flex-template run xxx \
  --template-file-gcs-location gs://dataflow-templates-us-east1/latest/flex/PubSub_to_BigQuery_Flex \
  --region us-east1 \
  --worker-region us-east1 \
  --subnetwork regions/us-east1/subnetworks/xxx \
  --network xxx \
  --additional-user-labels "" \
  --parameters outputTableSpec=xxx,inputSubscription=xxx,useStorageWriteApiAtLeastOnce=false,javascriptTextTransformGcsPath=gs://xxx/transform_func.js,javascriptTextTransformFunctionName=transform,javascriptTextTransformReloadIntervalMinutes=15,serviceAccount=xxx,maxNumWorkers=3,numberOfWorkerHarnessThreads=2,diskSizeGb=30,workerMachineType=c2d-highmem-2,useStorageWriteApi=true,numStorageWriteApiStreams=8,storageWriteApiTriggeringFrequencySec=15,autoscalingAlgorithm=THROUGHPUT_BASED \
  --project=xxx

melinath commented 6 months ago

@Demacr gcloud should send that to the API, so if it works in gcloud it should be possible to do in Terraform as well. If you add --log-http to the gcloud command, can you see what API field it uses for THROUGHPUT_BASED in the API request?

Demacr commented 6 months ago

@melinath

=======================
==== request start ====
uri: https://dataflow.googleapis.com/v1b3/projects/xxx/locations/us-east1/flexTemplates:launch?alt=json
method: POST
== headers start ==
b'accept': b'application/json'
b'accept-encoding': b'gzip, deflate'
b'authorization': --- Token Redacted ---
b'content-length': b'1113'
b'content-type': b'application/json'
b'user-agent': b'google-cloud-sdk gcloud/465.0.0 command/gcloud.dataflow.flex-template.run invocation-id/xxx environment/None environment-version/None client-os/MACOSX client-os-ver/23.4.0 client-pltf-arch/x86_64 interactive/True from-script/False python/3.12.2 term/xterm-256color (Macintosh; Intel Mac OS X 23.4.0)'
b'x-goog-api-client': b'cred-type/u'
== headers end ==
== body start ==
{"launchParameter": {"containerSpecGcsPath": "gs://dataflow-templates-us-east1/latest/flex/PubSub_to_BigQuery_Flex", "environment": {"enableStreamingEngine": false, "network": "xxx", "subnetwork": "regions/us-east1/subnetworks/xxx", "workerRegion": "us-east1"}, "jobName": "xxx", "parameters": {"autoscalingAlgorithm": "THROUGHPUT_BASED", "diskSizeGb": "30", "inputSubscription": "xxx", "javascriptTextTransformFunctionName": "transform", "javascriptTextTransformGcsPath": "gs://xxx/transform_func.js", "javascriptTextTransformReloadIntervalMinutes": "15", "maxNumWorkers": "3", "numStorageWriteApiStreams": "8", "numberOfWorkerHarnessThreads": "2", "outputTableSpec": "xxx", "serviceAccount": "xxx", "storageWriteApiTriggeringFrequencySec": "15", "useStorageWriteApi": "true", "useStorageWriteApiAtLeastOnce": "false", "workerMachineType": "c2d-highmem-2"}}}
== body end ==
==== request end ====
---- response start ----
status: 200
-- headers start --
Alt-Svc: h3=":443"; ma=2592000,h3-29=":443"; ma=2592000
Cache-Control: private
Content-Encoding: gzip
Content-Type: application/json; charset=UTF-8
Date: Tue, 02 Apr 2024 20:47:53 GMT
Server: ESF
Transfer-Encoding: chunked
Vary: Origin, X-Origin, Referer
X-Content-Type-Options: nosniff
X-Frame-Options: SAMEORIGIN
X-XSS-Protection: 0
-- headers end --
-- body start --
{
  "job": {
    "id": "2024-04-02_13_47_52-10863458629911449595",
    "projectId": "xxx",
    "name": "xxx",
    "currentStateTime": "1970-01-01T00:00:00Z",
    "createTime": "2024-04-02T20:47:53.134279Z",
    "location": "us-east1",
    "startTime": "2024-04-02T20:47:53.134279Z"
  }
}

-- body end --
total round trip time (request+response): 1.714 secs
---- response end ----
----------------------

melinath commented 5 months ago

Thanks for the logs, that's super helpful!

This looks like a valid issue to me - you're able to use gcloud to get THROUGHPUT_BASED autoscaling, but can't do it with Terraform (even though both use the API).

Specifically, gcloud sets launchParameter.parameters.autoscalingAlgorithm, while Terraform explicitly extracts autoscalingAlgorithm from parameters and sets it at launchParameter.environment.autoscalingAlgorithm. That's not the only parameter treated this way, but perhaps it's the most impactful because it's treated differently by the API?
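
Concretely, the same value ends up at two different paths in the flexTemplates:launch request body:

gcloud:    launchParameter.parameters.autoscalingAlgorithm  (passed through as an opaque template parameter)
Terraform: launchParameter.environment.autoscalingAlgorithm (validated against the AutoscalingAlgorithm enum, which rejects THROUGHPUT_BASED)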

This behavior was introduced in 5.0.0 via https://github.com/GoogleCloudPlatform/magic-modules/pull/9031; it looks like we believed at the time that the environment and parameters fields should contain the same values?

In the long term the "fix" would probably be to introduce a separate environment field on the resource so that users can explicitly set parameters and environment separately - but I don't know what the API behavior is or what the long-term plans for the API are so I don't know if that would make sense to do at this point. Separating the fields more explicitly would need to be behind a guard if introduced in a minor version since it's a major behavioral change.