Snowflake-Labs / terraform-provider-snowflake

Terraform provider for managing Snowflake accounts
https://registry.terraform.io/providers/Snowflake-Labs/snowflake/latest
MIT License
549 stars 420 forks source link

snowflake_file_format update-in-place doesn't apply #2385

Closed alexmaras closed 8 months ago

alexmaras commented 9 months ago

Terraform CLI and Provider Versions

Terraform v1.6.5
on linux_amd64
+ provider registry.terraform.io/snowflake-labs/snowflake v0.82.0

Terraform Configuration

resource "snowflake_file_format" "csv" {
    binary_format                  = "HEX"
    compression                    = "AUTO"
    database                       = "RAW"
    date_format                    = "AUTO"
    empty_field_as_null            = true
    encoding                       = "UTF8"
    error_on_column_count_mismatch = true
    escape                         = "NONE"
    escape_unenclosed_field        = "NONE"
    field_delimiter                = ","
    field_optionally_enclosed_by   = "\""
    format_type                    = "CSV"
    id                             = "RAW|EXAMPLE_SCHEMA|PROCESS_CSV"
    name                           = "PROCESS_CSV"
    null_if                        = [
        "NULL",
    ]
    parse_header                   = false
    record_delimiter               = <<-EOT

    EOT
    replace_invalid_characters     = false
    schema                         = "EXAMPLE_SCHEMA"
    skip_blank_lines               = false
    skip_byte_order_mark           = true
    skip_header                    = 1
    time_format                    = "AUTO"
    timestamp_format               = "AUTO"
    trim_space                     = false
}

Expected Behavior

Updating a field in the file_format resource and re-applying updates the file format in snowflake

Actual Behavior

The update does not change anything in snowflake, and terraform picks up the change as needing to be applied every time

Steps to Reproduce

  1. Create a snowflake_file_format resource
  2. terraform apply
  3. Update the resource, change the field_optionally_enclosed_by field for instance (not specific to this field) to NONE
  4. terraform apply

How much impact is this issue causing?

Medium

Logs

No response

Additional Information

Workaround at the moment is to taint the resource in terraform and allow it to be destroyed and recreated.

sfc-gh-asawicki commented 9 months ago

Hey @alexmaras. Thanks for reporting the issue.

Could you provide the debug logs (TF_LOG=DEBUG) showing that the update is not taking place?

justin-ramirez-gametime commented 9 months ago

We are having a similar issue: Terraform v1.7.0 on darwin_arm64

These are included on the plan every time: Screenshot 2024-01-22 at 1 45 46 PM

However the state does not show any change after this plan is applied: image image

To add to the complexity of this issue, this problem only persists in our staging and testing workspaces but not in Production. We speculated that this was because we imported these in production and the values were already set to true.

alexmaras commented 9 months ago

hi @sfc-gh-asawicki - thanks for the response. @justin-ramirez-gametime - the problem seems identical here.

Attached are the logs from two applies that happened directly one after the other, both reporting success, with no change in snowflake.

tf-output.txt.gz tf-output-second.txt.gz

Here is the output from terrafrom apply too.

run-1.txt run-2.txt

sfc-gh-asawicki commented 9 months ago

Thanks for the logs @alexmaras, @justin-ramirez-gametime. I will analyze this issue this week.

raulbonet commented 9 months ago

Hello,

We had the same problem; I think I found the issue; created an MR with the fix.

justin-ramirez-gametime commented 9 months ago

Any news on this?

sfc-gh-asawicki commented 9 months ago

I have just merged the change @raulbonet mentioned in his previous comment. We plan to release the new version in the middle of the next week (Wednesday or Thursday).

sfc-gh-asawicki commented 8 months ago

@alexmaras @justin-ramirez-gametime the aforementioned PR was released as part of 0.86.0. Could you please check if this solves the issue for you?

benriou commented 8 months ago

hello

I'm still facing the same issue in 0.86.0 (tested today) where escape_unenclosed_field is always planned for update-in-place with no effect.

raulbonet commented 8 months ago

Hello @benriou Can you please post the terraform configuration and the plan so I can debug?

benriou commented 8 months ago

resource "snowflake_file_format" "loadbalancer_logs_XXX_dev" {
  name                = "LOADBALANCER_LOGS"
  database            = "DEV"
  schema              = "PUBLIC"
  format_type         = "CSV"
  binary_format       = "HEX"
  date_format         = "AUTO"
  time_format         = "AUTO"
  timestamp_format    = "AUTO"
  empty_field_as_null = true

  error_on_column_count_mismatch = true

  escape                  = "NONE"
  escape_unenclosed_field = "\\\\"
  encoding                = "UTF8"

  field_optionally_enclosed_by = "\""
  record_delimiter             = "\n"
  field_delimiter              = " "
  skip_header                  = 0
  skip_blank_lines             = true
  skip_byte_order_mark         = true
  null_if                      = ["\\\\N"]

  compression = "AUTO"
}
  # snowflake_file_format.loadbalancer_logs_XXX_dev will be updated in-place
  ~ resource "snowflake_file_format" "loadbalancer_logs_XXX_dev" {
      ~ escape_unenclosed_field        = "\\" -> "\\\\"
        id                             = "DEV|PUBLIC|LOADBALANCER_LOGS"
        name                           = "LOADBALANCER_LOGS"
        # (32 unchanged attributes hidden)
    }
benriou commented 8 months ago

Please let me know if you need further details @raulbonet ; thank you

raulbonet commented 8 months ago

Thanks, I will have a look later today

raulbonet commented 8 months ago

Hello @benriou I had a look. I think that the problem is just that \\\\ is not a valid value for this field, but Snowflake instead of raising an exception, does nothing. If you read the documentation here , section ESCAPE_UNENCLOSED_FIELD:

_A singlebyte character string used as the escape character for unenclosed field values only. An escape character invokes an alternative interpretation on subsequent characters in a character sequence. You can use the ESCAPE character to interpret instances of the FIELD_DELIMITER or RECORDDELIMITER characters in the data as literals. The escape character can also be used to escape instances of itself in the data.

Accepts common escape sequences, octal values, or hex values.

But \\\\ cannot be expressed as a single byte sequence, nor is a "common escape sequence".

If you manually try to execute the statement in Snowflake GUI: ALTER FILE FORMAT "DEV"."PUBLIC"."LOADBALANCER_LOGS" SET ESCAPE_UNENCLOSED_FIELD = '\\\\'

And then query the file format: SHOW FILE FORMATS LIKE 'LOADBALANCER_LOGS' IN SCHEMA "DEV"."PUBLIC";

You will see that, indeed, the character has not changed.

I tried with other characters, like "A" or "a" and in this case, it gets replaced by \n, and still not raising an exception.

benriou commented 8 months ago

thank you @raulbonet ;

I thought I had to escape the backslashes.

This is working like a charm since I configured the field to \\ ; the change is correctly propagated to the FILE_FORMAT.

raulbonet commented 8 months ago

Glad to know it worked! :)

sfc-gh-asawicki commented 8 months ago

@alexmaras @justin-ramirez-gametime could you also confirm that it solved your issue, please?

alexmaras commented 8 months ago

thanks @sfc-gh-asawicki - I'll check it out tomorrow and report back.

alexmaras commented 8 months ago

Hey @sfc-gh-asawicki - The issue I'm hitting now is similar to the above, but I have no way of solving it.

I have some file formats defined in Snowflake that have a character as their optional enclosing value - in this case double quotes ("). I need to change those file formats to have no optional enclosing value - i.e. change it to NONE. If I use NONE, I now get an error - so this is an improvement - that says:

│ Error: CSVFieldOptionallyEnclosedBy must be one of [null ' "]
│ 
│   with module.ingest_seer_data["study_label_groups"].snowflake_file_format.csv[0],
│   on modules/snowpipe-integration/file_format.tf line 31, in resource "snowflake_file_format" "csv":
│   31: resource "snowflake_file_format" "csv" {
│ 

However, if I use null, terraform just doesn't report any changes at all. The file format does need updating, but it refuses to plan it because it's using the defaults.

raulbonet commented 8 months ago

Thanks @alexmaras , I will have a look tomorrow (EMEA timezone)

alexmaras commented 8 months ago

Thanks @raulbonet - I imagine NONE will have to be added as a special option to prevent terraform from swallowing the change. I'll also have a poke in the go code and see if I can get a handle on where it happens.

This is the error I'm hitting: https://github.com/Snowflake-Labs/terraform-provider-snowflake/blob/5544544dff8c25c83f23f7759341992a9b282fda/pkg/sdk/file_format.go#L575

So I guess at the very least, NONE would have to be added to that list, but I don't know how that'll pass through to snowflake. I'll see if I can build and test a change like that in the meantime, but I imagine I might hit an issue with it being wrapped in quotes or similar.

As far as I understand it, any single character (or the keyword NONE) should be valid, so this might just be something worth making less strict in general.

raulbonet commented 8 months ago

Hello @alexmaras I had a look, and indeed you are right: the validate() should account for the NONE use-case. I am not sure if the intention of the original developer was to use null as NONE or indeed check for null, but the latter is not needed, since the validate function already has the condition valueSet(opts.CSVFieldOptionallyEnclosedBy).

Anyway, I will create a PR with this small change.

raulbonet commented 8 months ago

@sfc-gh-asawicki , I notice a couple of issues here, just bringing it to your attention:

  1. NONE is a Snowflake reserved keyword that afaik, currently the SQLBuilder does not support. So, the statement executed by Snowflake is:

ALTER ...... = 'NONE', when it should be ALTER ...... = NONE

It seems that both seem to work in this case, but not sure if this is the case for absolutely all resources in Snowflake.

  1. I am noticing a non-idempotent operation here: if the resource is being created, Snowflake will default to whatever default value it has. However, if the resource is being altered, Snowflake will just keep the existing value. The only way to make this idempotent would be recreate the resource when a value is set to null, so it defaults to Snowflake's default value.

I guess we do not want to address this in this PR, but this probably affects all resources. I wonder if we want to account for this use-case or at least track it in the backlog.

sfc-gh-asawicki commented 8 months ago

@raulbonet Thanks, I will take a look tomorrow.

sfc-gh-asawicki commented 8 months ago

Hey @raulbonet. My answers below:

ad 1 I will note it down, we should run it with NONE, not 'NONE'.

ad 2 This is one of the topics we are aware of and that we will address during resources redesign (https://github.com/Snowflake-Labs/terraform-provider-snowflake/blob/main/ROADMAP.md#our-roadmap). One of the PRs that shows how we can handle this is #2502. We are of course limited only to objects where we can easily access the default snowflake value. For the cases where this is not possible, we will just encourage the user to fill in the value (or even force the user to do it by requiring it).

alexmaras commented 8 months ago

Thanks @raulbonet for helping!

0.87.1 has solved the issue, and I can apply file format updates successfully, including "NONE"!

image

Resolved by https://github.com/Snowflake-Labs/terraform-provider-snowflake/pull/2575 and released in 0.87.1

justin-ramirez-gametime commented 8 months ago

@sfc-gh-asawicki version 0.87.1 fixed the issue we were having