hashicorp / terraform-provider-aws

The AWS Provider enables Terraform to manage AWS resources.
https://registry.terraform.io/providers/hashicorp/aws
Mozilla Public License 2.0

[Bug]: WAFv2 ACL rule state drift #33124

Open AnrichVS opened 11 months ago

AnrichVS commented 11 months ago

Terraform Core Version

1.5.5

AWS Provider Version

5.13.1

Affected Resource(s)

aws_wafv2_web_acl

Expected Behavior

Applying a WAFv2 Web ACL without making any changes should not update certain rules.

Actual Behavior

Applying a WAFv2 Web ACL without making any changes always updates certain rules.

So far it seems that rules with a rate-based statement, as well as managed rules referencing the AWSManagedRulesKnownBadInputsRuleSet managed rule set, are affected.

Rules that reference the AWSManagedRulesBotControlRuleSet and AWSManagedRulesCommonRuleSet managed rule sets do not seem to be affected.

Relevant Error/Panic Output Snippet

No errors occur.

Terraform Configuration Files

tf-acl-state-drift-code.zip

The uploaded ZIP contains a CDKTF (TypeScript) project that can be used to reproduce the issue. Please see "Steps to Reproduce".

Steps to Reproduce

  1. Download the tf-acl-state-drift-code.zip archive
  2. Unzip the archive into a directory of your choice and navigate to the directory
  3. Run npm install to install dependencies
  4. Run npm run synth to synthesize the stack and produce the tf-acl-state-drift stack
  5. Apply the stack:
    CDKTF_AWS_REGION=eu-north-1 CDKTF_AWS_PROFILE=default cdktf apply tf-acl-state-drift

Note: you can use the CDKTF_AWS_REGION and CDKTF_AWS_PROFILE environment variables to set your desired deployment region, and which AWS credentials profile to use.
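As a rough illustration of the note above, the two environment variables could be resolved along these lines. This is only a sketch: the helper name `resolveProviderSettings` and the fallback values are assumptions, and the project in the ZIP may read the variables differently.

```typescript
// Hypothetical helper; the real project in the ZIP may do this differently.
interface ProviderSettings {
  region: string;
  profile: string;
}

// Resolve provider settings from an environment-like map.
// The fallback values below are assumptions for illustration only.
function resolveProviderSettings(
  env: Record<string, string | undefined>,
): ProviderSettings {
  return {
    region: env.CDKTF_AWS_REGION ?? 'eu-north-1',
    profile: env.CDKTF_AWS_PROFILE ?? 'default',
  };
}
```

In a Node.js project this would typically be called as `resolveProviderSettings(process.env)`, with the result passed to the AWS provider configuration.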

You will see the stack is created:

tf-acl-state-drift    # aws_wafv2_web_acl.waf-acl-state-drift-reproduce (waf-acl-state-drift-reproduce) will be created
                      + resource "aws_wafv2_web_acl" "waf-acl-state-drift-reproduce" {
                          + arn         = (known after apply)
                          + capacity    = (known after apply)
                          + description = "WAF ACL to reproduce state drift"
                          + id          = (known after apply)
                          + lock_token  = (known after apply)
                          + name        = "waf-acl-state-drift-reproduce"
                          + scope       = "REGIONAL"
                          + tags_all    = (known after apply)

                          + default_action {
                              + allow {
                                }
                            }

                          + rule {
                              + name     = "WebAclRateLimitRule"
                              + priority = 0

                              + action {
                                  + block {
                                    }
                                }

                              + statement {
                                  + rate_based_statement {
                                      + aggregate_key_type = "IP"
                                      + limit              = 1000

                                      + forwarded_ip_config {
                                          + fallback_behavior = "MATCH"
                                          + header_name       = "X-Forwarded-For"
                                        }
                                    }
                                }

                              + visibility_config {
                                  + cloudwatch_metrics_enabled = true
                                  + metric_name                = "WebAclRateLimit"
                                  + sampled_requests_enabled   = true
                                }
                            }

                          + visibility_config {
                              + cloudwatch_metrics_enabled = true
                              + metric_name                = "WebAcl"
                              + sampled_requests_enabled   = true
                            }
                        }

                    Plan: 1 to add, 0 to change, 0 to destroy.
  6. Once the ACL has been created successfully, apply the stack again (directly after, making no changes):
    CDKTF_AWS_REGION=eu-north-1 CDKTF_AWS_PROFILE=default cdktf apply tf-acl-state-drift

You will see that Terraform updates the resource in place, even though no actual changes were made. Note the -> null "changes":

tf-acl-state-drift    # aws_wafv2_web_acl.waf-acl-state-drift-reproduce (waf-acl-state-drift-reproduce) will be updated in-place
                      ~ resource "aws_wafv2_web_acl" "waf-acl-state-drift-reproduce" {
                            id            = "3e545e0f-2224-417f-86f0-847d3f59ba68"
                            name          = "waf-acl-state-drift-reproduce"
                            tags          = {}
                            # (7 unchanged attributes hidden)

                          - rule {
                              - name     = "WebAclRateLimitRule" -> null
                              - priority = 0 -> null

                              - action {
                                  - block {
                                    }
                                }

                              - statement {
                                  - rate_based_statement {
                                      - aggregate_key_type = "IP" -> null
                                      - limit              = 1000 -> null
                                    }
                                }

                              - visibility_config {
                                  - cloudwatch_metrics_enabled = true -> null
                                  - metric_name                = "WebAclRateLimit" -> null
                                  - sampled_requests_enabled   = true -> null
                                }
                            }
                          + rule {
                              + name     = "WebAclRateLimitRule"
                              + priority = 0

                              + action {
                                  + block {
                                    }
                                }

                              + statement {
                                  + rate_based_statement {
                                      + aggregate_key_type = "IP"
                                      + limit              = 1000

                                      + forwarded_ip_config {
                                          + fallback_behavior = "MATCH"
                                          + header_name       = "X-Forwarded-For"
                                        }
                                    }
                                }

                              + visibility_config {
                                  + cloudwatch_metrics_enabled = true
                                  + metric_name                = "WebAclRateLimit"
                                  + sampled_requests_enabled   = true
                                }
                            }

                            # (2 unchanged blocks hidden)
                        }

                    Plan: 0 to add, 1 to change, 0 to destroy.
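
Comparing the removed and re-added rule blocks in the plan above, the only visible difference is the forwarded_ip_config block: it is in the configuration but missing from the refreshed rule. The sketch below makes that comparison explicit with rule objects transcribed by hand from the plan. The `diffPaths` helper is hypothetical and is not part of Terraform or the provider. One possible explanation, not confirmed anywhere in this thread, is that the API does not return forwarded_ip_config when aggregate_key_type is "IP".

```typescript
type Json = string | number | boolean | null | Json[] | { [key: string]: Json };

function isObject(v: Json | undefined): v is { [key: string]: Json } {
  return typeof v === 'object' && v !== null && !Array.isArray(v);
}

// Return dotted paths whose values differ between the refreshed state and the config.
function diffPaths(state: Json | undefined, config: Json | undefined, prefix: string = ''): string[] {
  if (isObject(state) && isObject(config)) {
    const keys = Object.keys({ ...state, ...config }).sort();
    const out: string[] = [];
    for (const k of keys) {
      out.push(...diffPaths(state[k], config[k], prefix ? `${prefix}.${k}` : k));
    }
    return out;
  }
  // Leaves, arrays, and missing values are compared by their JSON text.
  return JSON.stringify(state) === JSON.stringify(config) ? [] : [prefix];
}

// Rule as shown on the "-" side of the plan (what the refresh returned).
const stateRule: Json = {
  name: 'WebAclRateLimitRule',
  priority: 0,
  statement: { rate_based_statement: { aggregate_key_type: 'IP', limit: 1000 } },
};

// Rule as shown on the "+" side of the plan (what the configuration declares).
const configRule: Json = {
  name: 'WebAclRateLimitRule',
  priority: 0,
  statement: {
    rate_based_statement: {
      aggregate_key_type: 'IP',
      limit: 1000,
      forwarded_ip_config: { fallback_behavior: 'MATCH', header_name: 'X-Forwarded-For' },
    },
  },
};

const driftPaths = diffPaths(stateRule, configRule);
```

For these two objects, `driftPaths` contains the single path `statement.rate_based_statement.forwarded_ip_config`.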

Debug Output

tf-apply-logs.zip

The ZIP archive contains two debug log files:

  1. tf-apply.log - logs for the first time the stack is applied (i.e. the ACL is created for the first time)
  2. tf-apply-again.log - logs for the second time the stack is applied (i.e. here you can see the state drift)

Note specifically in tf-apply-again.log (line 6376):

2023-08-22T07:44:46.917+0200 [WARN]  Provider "registry.terraform.io/hashicorp/aws" produced an unexpected new value for aws_wafv2_web_acl.waf-acl-state-drift-reproduce during refresh.

See also lines 6377-6403.
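
To locate such warnings without relying on fixed line numbers, a debug log could be scanned roughly as follows. This is a minimal sketch; `findRefreshWarnings` is a hypothetical helper, and it matches only the warning text quoted above.

```typescript
// Return 1-based line numbers of "unexpected new value" refresh warnings
// that mention the given resource address.
function findRefreshWarnings(logText: string, address: string): number[] {
  const hits: number[] = [];
  const lines = logText.split(/\r?\n/);
  lines.forEach((line, i) => {
    if (
      line.indexOf('[WARN]') !== -1 &&
      line.indexOf('produced an unexpected new value for ' + address) !== -1
    ) {
      hits.push(i + 1);
    }
  });
  return hits;
}
```

In Node.js, the log text could be loaded with `fs.readFileSync('tf-apply-again.log', 'utf8')` and passed in together with the address `aws_wafv2_web_acl.waf-acl-state-drift-reproduce`.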

Panic Output

No response

Important Factoids

No response

References

No response

Would you like to implement a fix?

No

cnocula-peg commented 10 months ago

I can confirm the observation. It is very annoying, since the WAF rules API seems to have inconsistent performance. Sometimes this drift costs us 10 seconds, sometimes up to 5 minutes, slowing our builds down for no reason.

AnrichVS commented 10 months ago

> I can confirm the observation. It is very annoying, since the WAF rules API seems to have inconsistent performance. Sometimes this drift costs us 10 seconds, sometimes up to 5 minutes, slowing our builds down for no reason.

For now I'm ignoring changes to the ACL rules. I'm considering adding a flag (via environment variable) to not ignore the ACL rules, and then this flag can be passed if ACL rule changes actually need to be made.

Very irritating, but maybe worth considering to avoid unnecessarily long builds. Disclaimer: if someone makes changes via the AWS console or otherwise, those changes will of course not be reverted, since they are ignored.

new Wafv2WebAcl(this, 'web-waf-acl', {
  name: 'Web WAF ACL',
  description: 'Web WAF ACL',
  scope: 'REGIONAL',
  defaultAction: {
    allow: {},
  },
  visibilityConfig: {
    cloudwatchMetricsEnabled: true,
    metricName: 'WebAcl',
    sampledRequestsEnabled: true,
  },
  rule: rules,
  // TODO: this is a workaround to avoid stack drift, it should be removed once this is fixed: https://github.com/hashicorp/terraform-provider-aws/issues/33124
  lifecycle: {
    ignoreChanges: ['rule'],
  },
});
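
The environment-variable flag mentioned above could look something like this. It is only a sketch: the variable name `WAF_APPLY_RULE_CHANGES` and the helper `ruleLifecycle` are made up for illustration, and neither CDKTF nor the provider defines them.

```typescript
// Hypothetical variable name; nothing in CDKTF or the provider defines it.
const APPLY_RULES_FLAG = 'WAF_APPLY_RULE_CHANGES';

// Build the lifecycle block: ignore changes to `rule` unless the flag is 'true'.
function ruleLifecycle(
  env: Record<string, string | undefined>,
): { ignoreChanges: string[] } {
  const applyRules = env[APPLY_RULES_FLAG] === 'true';
  return { ignoreChanges: applyRules ? [] : ['rule'] };
}
```

The ACL definition would then use `lifecycle: ruleLifecycle(process.env)` instead of the hard-coded block, and a run that really needs to change rules would be started with `WAF_APPLY_RULE_CHANGES=true` set.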

bleachbyte commented 10 months ago

This is happening for us, as well.

Nearly every time we run a plan that includes an aws_wafv2_web_acl resource, Terraform wants to remove some of the rules, and then re-add them, exactly the same as they were before. The removals show the values in the rules changing to null, and the added rules are exactly the same -- name, priority, override_action, etc. are all identical.

Example plan is below (with IDs/AWS account numbers redacted):

  ~ resource "aws_wafv2_web_acl" "alb" {
        id            = "<REDACTED>"
        name          = "dev-waf"
        tags          = {
            "environment" = "dev"
        }
        # (7 unchanged attributes hidden)

      - rule {
          - name     = "AWSManagedRulesKnownBadInputsRuleSet" -> null
          - priority = 400 -> null

          - override_action {
              - none {}
            }

          - statement {
              - managed_rule_group_statement {
                  - name        = "AWSManagedRulesKnownBadInputsRuleSet" -> null
                  - vendor_name = "AWS" -> null
                }
            }

          - visibility_config {
              - cloudwatch_metrics_enabled = true -> null
              - metric_name                = "dev-waf-AWSManagedRulesKnownBadInputsRuleSet-metric" -> null
              - sampled_requests_enabled   = true -> null
            }
        }
      - rule {
          - name     = "AWSManagedRulesLinuxRuleSet" -> null
          - priority = 500 -> null

          - override_action {
              - none {}
            }

          - statement {
              - managed_rule_group_statement {
                  - name        = "AWSManagedRulesLinuxRuleSet" -> null
                  - vendor_name = "AWS" -> null
                }
            }

          - visibility_config {
              - cloudwatch_metrics_enabled = true -> null
              - metric_name                = "dev-waf-AWSManagedRulesLinuxRuleSet-metric" -> null
              - sampled_requests_enabled   = true -> null
            }
        }
      - rule {
          - name     = "AWSManagedRulesSQLiRuleSet" -> null
          - priority = 600 -> null

          - override_action {
              - none {}
            }

          - statement {
              - managed_rule_group_statement {
                  - name        = "AWSManagedRulesSQLiRuleSet" -> null
                  - vendor_name = "AWS" -> null

                  - scope_down_statement {
                      - not_statement {
                          - statement {
                              - regex_pattern_set_reference_statement {
                                  - arn = "arn:aws:wafv2:us-west-2:<REDACTED>:regional/regexpatternset/sandbox-relaxed-uri-paths/<REDACTED>" -> null

                                  - field_to_match {
                                      - uri_path {}
                                    }

                                  - text_transformation {
                                      - priority = 0 -> null
                                      - type     = "LOWERCASE" -> null
                                    }
                                }
                            }
                        }
                    }
                }
            }

          - visibility_config {
              - cloudwatch_metrics_enabled = true -> null
              - metric_name                = "dev-waf-AWSManagedRulesSQLiRuleSet-metric" -> null
              - sampled_requests_enabled   = true -> null
            }
        }
      + rule {
          + name     = "AWSManagedRulesKnownBadInputsRuleSet"
          + priority = 400

          + override_action {
              + none {}
            }

          + statement {
              + managed_rule_group_statement {
                  + name        = "AWSManagedRulesKnownBadInputsRuleSet"
                  + vendor_name = "AWS"
                }
            }

          + visibility_config {
              + cloudwatch_metrics_enabled = true
              + metric_name                = "dev-waf-AWSManagedRulesKnownBadInputsRuleSet-metric"
              + sampled_requests_enabled   = true
            }
        }
      + rule {
          + name     = "AWSManagedRulesLinuxRuleSet"
          + priority = 500

          + override_action {
              + none {}
            }

          + statement {
              + managed_rule_group_statement {
                  + name        = "AWSManagedRulesLinuxRuleSet"
                  + vendor_name = "AWS"
                }
            }

          + visibility_config {
              + cloudwatch_metrics_enabled = true
              + metric_name                = "dev-waf-AWSManagedRulesLinuxRuleSet-metric"
              + sampled_requests_enabled   = true
            }
        }
      + rule {
          + name     = "AWSManagedRulesSQLiRuleSet"
          + priority = 600

          + override_action {
              + none {}
            }

          + statement {
              + managed_rule_group_statement {
                  + name        = "AWSManagedRulesSQLiRuleSet"
                  + vendor_name = "AWS"

                  + scope_down_statement {
                      + not_statement {
                          + statement {
                              + regex_pattern_set_reference_statement {
                                  + arn = "arn:aws:wafv2:us-west-2:<REDACTED>:regional/regexpatternset/sandbox-relaxed-uri-paths/<REDACTED>"

                                  + field_to_match {
                                      + uri_path {}
                                    }

                                  + text_transformation {
                                      + priority = 0
                                      + type     = "LOWERCASE"
                                    }
                                }
                            }
                        }
                    }
                }
            }

          + visibility_config {
              + cloudwatch_metrics_enabled = true
              + metric_name                = "dev-waf-AWSManagedRulesSQLiRuleSet-metric"
              + sampled_requests_enabled   = true
            }
        }

        # (10 unchanged blocks hidden)
    }

This has been plaguing us for several months now, and has not changed, even with the new AWS provider version of 5.19.0.
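
One way to check whether such a plan is a pure no-op is to compare the rule lists before and after the change once the plan has been exported as JSON (for example with terraform show -json, which exposes change.before and change.after for each entry in resource_changes). The sketch below is hypothetical. In particular, it assumes that nulls and empty lists are equivalent to absent values, which may not match how the provider actually compares rules.

```typescript
type JsonValue = string | number | boolean | null | JsonValue[] | { [key: string]: JsonValue };

// Serialize with sorted keys, dropping nulls and empty arrays so that
// "absent" and "empty" compare equal (an assumption, see the text above).
function canonical(value: JsonValue): string {
  if (Array.isArray(value)) {
    return '[' + value.map((v) => canonical(v)).join(',') + ']';
  }
  if (typeof value === 'object' && value !== null) {
    const parts: string[] = [];
    for (const key of Object.keys(value).sort()) {
      const v = value[key];
      const empty = v === null || (Array.isArray(v) && v.length === 0);
      if (!empty) parts.push(JSON.stringify(key) + ':' + canonical(v));
    }
    return '{' + parts.join(',') + '}';
  }
  return JSON.stringify(value);
}

// True when both rule lists contain the same rules, ignoring order,
// key order, and null or empty values.
function sameRuleSet(before: JsonValue[], after: JsonValue[]): boolean {
  const a = before.map((r) => canonical(r)).sort();
  const b = after.map((r) => canonical(r)).sort();
  return a.length === b.length && a.every((s, i) => s === b[i]);
}
```

If `sameRuleSet` returns true for a resource that the plan still wants to update in place, the update changes nothing in the rules themselves.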

lobeto99 commented 3 months ago

Just wanted to add a fresh comment here, as we've been experiencing this issue for a while now too.

fdarif commented 1 month ago

Could you please prioritize this issue? I'm experiencing the same problem.

ST-TaylorAnn commented 3 weeks ago

I am currently using the hashicorp/aws provider with the version constraint "~> 5.25"; the installed version is v5.57.0. I am experiencing the same issue, which did not occur a few weeks ago. Recently, however, the API has been periodically unstable.