GoogleCloudPlatform / DataflowTemplates

Cloud Dataflow Google-provided templates for solving in-Cloud data tasks
https://cloud.google.com/dataflow/docs/guides/templates/provided-templates
Apache License 2.0

[Bug]: Kafka to BigQuery template not working for JSON #1638

Closed. mads-gdl closed this issue 1 month ago.

mads-gdl commented 1 month ago

Related Template(s)

Kafka to BigQuery

Template Version

2024-06-04-00_RC01

What happened?

I'm trying to create a Dataflow job from the Kafka to BigQuery template in the GCP Console. When selecting JSON as the Kafka Message Format and attempting to run the job, a pop-up appears with the following message.

Job creation failed
The template parameters are invalid. Details: schemaFormat: Missing required parameter

When using JSON as the message format, there is no option in the UI to select a schemaFormat.
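
For reference, here is roughly the equivalent launch from the gcloud CLI. This is a minimal sketch: the template GCS path and the parameter names other than schemaFormat are assumptions based on the provided-templates docs for this version, so substitute your own region, broker, topic, and table.

# Hypothetical CLI repro; all values are placeholders.
gcloud dataflow flex-template run kafka-to-bq-json \
  --region=us-central1 \
  --template-file-gcs-location=gs://dataflow-templates-us-central1/2024-06-04-00_RC01/flex/Kafka_to_BigQuery_Flex \
  --parameters="readBootstrapServerAndTopic=MY_BROKER:9092;MY_TOPIC,messageFormat=JSON,outputTableSpec=MY_PROJECT:MY_DATASET.MY_TABLE"
# Presumably rejected at submission with the same error:
# schemaFormat: Missing required parameter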

Relevant log output

No response

githubwua commented 1 month ago

I was trying to find out what the schemaFormat parameter accepts, and the error message provides the following details:

{
  "error": {
    "code": 400,
    "message": "The template parameters are invalid. Details: \nschemaFormat: Parameter didn't match regex '^(SCHEMA_REGISTRY|SINGLE_SCHEMA_FILE)$'",
    "status": "INVALID_ARGUMENT",
    "details": [
      {
        "@type": "type.googleapis.com/google.dataflow.v1beta3.InvalidTemplateParameters",
        "parameterViolations": [
          {
            "parameter": "schemaFormat",
            "description": "Parameter didn't match regex '^(SCHEMA_REGISTRY|SINGLE_SCHEMA_FILE)$'"
          }
        ]
      }
    ]
  }
}

The regex shows that the only accepted values are SCHEMA_REGISTRY and SINGLE_SCHEMA_FILE. As a quick workaround, we can specify "SINGLE_SCHEMA_FILE" for the schemaFormat parameter.
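
For example, building on the launch sketch above (the surrounding parameter names are still assumptions; only schemaFormat and its accepted values come from the error message):

# Workaround: pass schemaFormat explicitly so it satisfies the validation regex
# '^(SCHEMA_REGISTRY|SINGLE_SCHEMA_FILE)$', even though the message format is JSON.
gcloud dataflow flex-template run kafka-to-bq-json \
  --region=us-central1 \
  --template-file-gcs-location=gs://dataflow-templates-us-central1/2024-06-04-00_RC01/flex/Kafka_to_BigQuery_Flex \
  --parameters="readBootstrapServerAndTopic=MY_BROKER:9092;MY_TOPIC,messageFormat=JSON,schemaFormat=SINGLE_SCHEMA_FILE,outputTableSpec=MY_PROJECT:MY_DATASET.MY_TABLE"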

It is true that there is no way to specify this in the GCP Console UI, so this definitely needs a fix.

mads-gdl commented 1 month ago

Yes, I can confirm: after explicitly setting the parameter schemaFormat=SINGLE_SCHEMA_FILE and launching the template from Cloud Shell, the job runs as expected.

githubwua commented 1 month ago

Happy to hear the workaround works for you, too. I just noticed there is a fix for this issue, so once it is applied this workaround should no longer be needed.