Snowflake-Labs / terraform-provider-snowflake

Terraform provider for managing Snowflake accounts
https://registry.terraform.io/providers/Snowflake-Labs/snowflake/latest
MIT License
534 stars 413 forks source link

snowflake_stage file_format adds parenthesis around file format name and does not compile #1911

Open axthosarouris opened 1 year ago

axthosarouris commented 1 year ago

Provider Version

0.67.0

Terraform Version

1.4.1

Describe the bug

The generated SQL query of the following resource is not syntactically correct.

resource "snowflake_file_format" "json_gziped" {
  name        = "JSON_GZIP"
  database    = data.snowflake_database.database.name
  schema      = snowflake_schema.schema.name
  format_type = "JSON"
  compression = "GZIP"
}

resource "snowflake_stage" "staging_buffer" {
  depends_on = [
    snowflake_storage_integration.storage_integration,
    snowflake_file_format.json_gziped
  ]
  name                = "MY_SNOWPIPE_STAGE"
  url                 = "${var.S3_BUCKET}/ENTITY_UPDATE"
  database            = var.DATABASE
  schema              = snowflake_schema.schema.name
  storage_integration = snowflake_storage_integration.storage_integration.name
  file_format         = snowflake_file_format.json_gziped.name
  copy_options = "ABORT_STATEMENT"
}

The generated query is the following:

CREATE STAGE "TEST"."CROSSAD"."CROSSAD_SNOWPIPE_STAGE" 
URL = 's3://datalake-main/bucket/EntityUpdate STORAGE_INTEGRATION = "S3_INTEGRATION" 
FILE_FORMAT = (JSON_GZIP) 
COPY_OPTIONS = (ABORT_STATEMENT)

Expected behavior Do not add parenthesis around the file_format name. snowflake complains. Example:

CREATE STAGE "TEST"."CROSSAD"."CROSSAD_SNOWPIPE_STAGE" 
URL = 's3://datalake-main/bucket/EntityUpdate STORAGE_INTEGRATION = "S3_INTEGRATION" 
FILE_FORMAT = JSON_GZIP 
COPY_OPTIONS = (ABORT_STATEMENT)

Code samples and commands

See above

Additional context

Add any other context about the problem here.

sfc-gh-swinkler commented 1 year ago

Hello @axthosarouris. I believe this was fixed as part of the 0.68 release, as we have rewritten the SQL generator for file formats at that time. Can you please verify?

douglaszickuhr commented 1 year ago

Hey @axthosarouris I had the same problem and the workaround was to add the prefix FORMAT_NAME = to snowflake_file_format.json_gziped.name as you have there. Basically you need:

resource "snowflake_stage" "staging_buffer" {
  depends_on = [
    snowflake_storage_integration.storage_integration,
    snowflake_file_format.json_gziped
  ]
  name                = "MY_SNOWPIPE_STAGE"
  url                 = "${var.S3_BUCKET}/ENTITY_UPDATE"
  database            = var.DATABASE
  schema              = snowflake_schema.schema.name
  storage_integration = snowflake_storage_integration.storage_integration.name
  file_format         = "FORMAT_NAME = ${snowflake_file_format.json_gziped.name}"
  copy_options = "ABORT_STATEMENT"
}

Keep in mind that if you're using a default file format, the prefix has to be TYPE = instead. :)

axthosarouris commented 1 year ago

Hi @sfc-gh-swinkler, in version 0.70.0 it has not been fixed. the following code results in the following sql query:


resource "snowflake_stage" "staging_buffer" {
  depends_on = [
    snowflake_storage_integration.storage_integration,
    snowflake_file_format.json_gziped
  ]
  name                = local.staging_buffer_name
  url                 = local.s3_source_location
  database            = data.snowflake_database.database.name
  schema              = snowflake_schema.schema.name
  storage_integration = snowflake_storage_integration.storage_integration.name
  file_format         = snowflake_file_format.json_gziped.name
  copy_options = "ON_ERROR=ABORT_STATEMENT,MATCH_BY_COLUMN_NAME=CASE_INSENSITIVE"
}

resource snowflake_grant_privileges_to_role "staging_buffer_grant"{
  depends_on = [snowflake_stage.staging_buffer]
  privileges =["USAGE"]
  role_name = snowflake_role.role.name
  on_schema_object {
    object_type ="STAGE"
    object_name = "${data.snowflake_database.database.name}.${snowflake_schema.schema.name}.${snowflake_stage.staging_buffer.name}"
  }
}
sql-conn-exec [query CREATE STAGE "TEST"."TABLE_NAME"."ALL_ENTITIES_STAGE" URL = 's3://staging-bucket-main/' STORAGE_INTEGRATION = "INTEGRATION" FILE_FORMAT = (JSON_GZIP) COPY_OPTIONS = (ON_ERROR=ABORT_STATEMENT,MATCH_BY_COLUMN_NAME=CASE_INSENSITIVE)
sfc-gh-swinkler commented 1 year ago

@axthosarouris

The file_format attribute for the snowflake_stage resource is a bit odd. It expects the string that you would pass in as if it were raw SQL. https://docs.snowflake.com/en/sql-reference/sql/create-stage#optional-parameters

[ FILE_FORMAT = ( { FORMAT_NAME = '<file_format_name>' | TYPE = { CSV | JSON | AVRO | ORC | PARQUET | XML } [ formatTypeOptions ] } ) ]

So in this case, you should not be passing in the file_format_name directly. Instead, it should be as @douglaszickuhr has said.

resource "snowflake_stage" "staging_buffer" {
...
file_format         = "FORMAT_NAME = ${snowflake_file_format.json_gziped.name}"

We have not made any changes to this resource in 0.70, although it is on our roadmap. This may take a while, however, as any breaking changes, like using nested objects instead of raw SQL to configure file format, will not be released until 1.0 of the provider.

chriselion commented 10 months ago

This threw me off too. I think your example

file_format         = "FORMAT_NAME = ${snowflake_file_format.json_gziped.name}"

might not be quite right either, since you need the qualified name.

mganopolsky commented 6 months ago

Facing a similar situation. While file_format = "TYPE = PARQUET", attempting to use file_format = "FORMAT_NAME = ${snowflake_file_format.my_format_name.name}" wouldn't work for the life of me. I wonder if PARQUET has a specific issue?