department-of-veterans-affairs / va.gov-team

Public resources for building on and in support of VA.gov. Visit complete Knowledge Hub:
https://depo-platform-documentation.scrollhelp.site/index.html

Configuration drift in devops Redis Terraforms (dupe 1) #93476

Closed: jennb33 closed 3 weeks ago

jennb33 commented 4 weeks ago

This ticket continues the work that started in Sprint 10 (93265).

User Story

As the team responsible for the Redis POAM, we need to clean up the configuration drift of our Redis clusters, so that we can implement the POAM.

Issue Description

While working on the ticket for security enhancements in our Redis clusters, we discovered that the Terraforms in our devops repo have experienced configuration drift. This blocks the implementation of the Redis POAM and needs to be resolved.

Tasks

Success Metrics

Acceptance Criteria

Validation

Assignee to add steps to this section. List the actions that need to be taken to confirm this issue is complete. Include any necessary links or context. State the expected outcome.

jennb33 commented 3 weeks ago

9/30/2024: @flooose brought up the config drift at DevOps. The request was to create an Epic covering all Platform teams first; work on this ticket will then continue. We will not include DataDog TF drift in that Epic.

flooose commented 3 weeks ago

The following changes in dsva-vagov-dev and dsva-vagov-staging environments were detected by Terraform.

Dev:

$ tf plan -out tfplan.out \
   -target 'module.vets-api.aws_elasticache_replication_group.vets-api-redis' \
   -target 'module.vets-api.aws_elasticache_replication_group.vets-api-redis-sidekiq' \
   -target 'module.vets-api.aws_elasticache_replication_group.vets-api-rails-cache'
...
  # aws_security_group.va-db-scan-sg must be replaced
-/+ resource "aws_security_group" "va-db-scan-sg" {
      ~ arn                    = "arn:aws-us-gov:ec2:us-gov-west-1:008577686731:security-group/sg-0e9175ae037960020" -> (known after apply)
      ~ description            = "open ports required to perform va database scan" -> "DEV - open ports required to perform VA database scan" # forces replacement
        egress                 = [
            {
                cidr_blocks      = [
                    "0.0.0.0/0",
                ]
                description      = ""
                from_port        = 0
                ipv6_cidr_blocks = []
                prefix_list_ids  = []
                protocol         = "-1"
                security_groups  = []
                self             = false
                to_port          = 0
            },
        ]
      ~ id                     = "sg-0e9175ae037960020" -> (known after apply)
      ~ ingress                = [
          + {
              + cidr_blocks      = [
                  + "10.247.2.35/32",
                  + "10.247.2.59/32",
                ]
              + description      = ""
              + from_port        = 5432
              + ipv6_cidr_blocks = []
              + prefix_list_ids  = []
              + protocol         = "tcp"
              + security_groups  = []
              + self             = false
              + to_port          = 5432
            },
          - {
              - cidr_blocks      = [
                  - "10.247.2.59/32",
                  - "10.247.2.35/32",
                ]
              - description      = "DEV - open ports required to perform VA database scan"
              - from_port        = 5432
              - ipv6_cidr_blocks = []
              - prefix_list_ids  = []
              - protocol         = "tcp"
              - security_groups  = []
              - self             = false
              - to_port          = 5432
            },
        ]
        name                   = "va-db-scan-sg"
      + name_prefix            = (known after apply)
      ~ owner_id               = "008577686731" -> (known after apply)
        revoke_rules_on_delete = false
        tags                   = {
            "Name" = "va-db-scan-sg"
        }
        tags_all               = {
            "Name" = "va-db-scan-sg"
        }
        vpc_id                 = "vpc-b50046d0"
    }

  # module.vets-api.aws_elasticache_subnet_group.redis-subnet-group[0] will be updated in-place
  ~ resource "aws_elasticache_subnet_group" "redis-subnet-group" {
        arn         = "arn:aws-us-gov:elasticache:us-gov-west-1:008577686731:subnetgroup:dsva-vagov-dev-vets-api-redis-sng"
        description = "Managed by Terraform"
        id          = "dsva-vagov-dev-vets-api-redis-sng"
        name        = "dsva-vagov-dev-vets-api-redis-sng"
      ~ subnet_ids  = [
          + "subnet-018a96715e751a021",
          + "subnet-06d96cfe8418908b8",
          + "subnet-0839618fca92a980d",
          + "subnet-0a9cc6a418f2632b6",
          + "subnet-0e72767aaa58042c3",
          + "subnet-0ea5f71c0bf1102d5",
            "subnet-2c4e176a",
            "subnet-8d3dd6e9",
            "subnet-d6f512a0",
        ]
        tags        = {}
        tags_all    = {}
    }

  # module.vets-api.aws_security_group.redis-sg[0] will be updated in-place
  ~ resource "aws_security_group" "redis-sg" {
        arn                    = "arn:aws-us-gov:ec2:us-gov-west-1:008577686731:security-group/sg-ea09888c"
        description            = "dsva-vagov-dev Redis group"
        egress                 = [
            {
                cidr_blocks      = [
                    "0.0.0.0/0",
                ]
                description      = ""
                from_port        = 0
                ipv6_cidr_blocks = []
                prefix_list_ids  = []
                protocol         = "-1"
                security_groups  = []
                self             = false
                to_port          = 0
            },
        ]
        id                     = "sg-ea09888c"
      ~ ingress                = [
            {
                cidr_blocks      = []
                description      = ""
                from_port        = 6379
                ipv6_cidr_blocks = []
                prefix_list_ids  = []
                protocol         = "tcp"
                security_groups  = (known after apply)
                self             = false
                to_port          = 6379
            },
          - {
              - cidr_blocks      = []
              - description      = ""
              - from_port        = 6379
              - ipv6_cidr_blocks = []
              - prefix_list_ids  = []
              - protocol         = "tcp"
              - security_groups  = [
                  - "sg-08cc6de9add4a032c",
                  - "sg-0a313736ce4b5605b",
                  - "sg-0aba386d",
                  - "sg-0e9175ae037960020",
                  - "sg-f3bc3e94",
                ]
              - self             = false
              - to_port          = 6379
            },
        ]
        name                   = "dsva-vagov-dev-redis-sg"
        owner_id               = "008577686731"
        revoke_rules_on_delete = false
        tags                   = {
            "Adminstration" = "TBD"
            "Environment"   = "DEV"
            "Name"          = "dsva-vagov-dev-redis-sg"
            "Office"        = "tbd"
            "ProjectName"   = "VA Vets.gov"
            "ProjectShort"  = "Vets.gov"
            "ResTag"        = "shared"
            "Role"          = "tbd"
            "VAECID"        = "AWG20180517003"
            "dsva_version"  = "v2"
            "environment"   = "dev"
            "group"         = "dsva"
            "managed"       = "terraform"
            "office"        = "osva"
            "project"       = "vagov"
            "provider"      = "aws"
            "region"        = "govcloud"
            "suboffice"     = "cto"
        }
        tags_all               = {
            "Adminstration" = "TBD"
            "Environment"   = "DEV"
            "Name"          = "dsva-vagov-dev-redis-sg"
            "Office"        = "tbd"
            "ProjectName"   = "VA Vets.gov"
            "ProjectShort"  = "Vets.gov"
            "ResTag"        = "shared"
            "Role"          = "tbd"
            "VAECID"        = "AWG20180517003"
            "dsva_version"  = "v2"
            "environment"   = "dev"
            "group"         = "dsva"
            "managed"       = "terraform"
            "office"        = "osva"
            "project"       = "vagov"
            "provider"      = "aws"
            "region"        = "govcloud"
            "suboffice"     = "cto"
        }
        vpc_id                 = "vpc-b50046d0"
    }

Plan: 1 to add, 2 to change, 1 to destroy.

Staging:

$ tf plan -out tfplan.out \
   -target 'module.vets-api.aws_elasticache_replication_group.vets-api-redis' \
   -target 'module.vets-api.aws_elasticache_replication_group.vets-api-redis-sidekiq' \
   -target 'module.vets-api.aws_elasticache_replication_group.vets-api-rails-cache'

...
  # aws_security_group.va-db-scan-sg must be replaced
-/+ resource "aws_security_group" "va-db-scan-sg" {
      ~ arn                    = "arn:aws-us-gov:ec2:us-gov-west-1:008577686731:security-group/sg-0076ffd52cf89dd91" -> (known after apply)
      ~ description            = "open ports required to perform va database scan" -> "STAGING - open ports required to perform VA database scan" # forces replacement
        egress                 = [
            {
                cidr_blocks      = [
                    "0.0.0.0/0",
                ]
                description      = ""
                from_port        = 0
                ipv6_cidr_blocks = []
                prefix_list_ids  = []
                protocol         = "-1"
                security_groups  = []
                self             = false
                to_port          = 0
            },
        ]
      ~ id                     = "sg-0076ffd52cf89dd91" -> (known after apply)
      ~ ingress                = [
          - {
              - cidr_blocks      = [
                  - "10.245.129.107/32",
                ]
              - description      = ""
              - from_port        = 5432
              - ipv6_cidr_blocks = []
              - prefix_list_ids  = []
              - protocol         = "tcp"
              - security_groups  = []
              - self             = false
              - to_port          = 5432
            },
          + {
              + cidr_blocks      = [
                  + "10.247.2.35/32",
                  + "10.247.2.59/32",
                ]
              + description      = ""
              + from_port        = 5432
              + ipv6_cidr_blocks = []
              + prefix_list_ids  = []
              + protocol         = "tcp"
              + security_groups  = []
              + self             = false
              + to_port          = 5432
            },
          - {
              - cidr_blocks      = [
                  - "10.247.2.59/32",
                  - "10.247.2.35/32",
                ]
              - description      = "STAGING - open ports required to perform VA database scan"
              - from_port        = 5432
              - ipv6_cidr_blocks = []
              - prefix_list_ids  = []
              - protocol         = "tcp"
              - security_groups  = []
              - self             = false
              - to_port          = 5432
            },
        ]
        name                   = "va-db-scan-sg"
      + name_prefix            = (known after apply)
      ~ owner_id               = "008577686731" -> (known after apply)
        revoke_rules_on_delete = false
        tags                   = {
            "Name" = "va-db-scan-sg"
        }
        tags_all               = {
            "Name" = "va-db-scan-sg"
        }
        vpc_id                 = "vpc-840147e1"
    }

  # module.vets-api.aws_security_group.redis-sg[0] will be updated in-place
  ~ resource "aws_security_group" "redis-sg" {
        arn                    = "arn:aws-us-gov:ec2:us-gov-west-1:008577686731:security-group/sg-d971f0bf"
        description            = "dsva-vagov-staging Redis group"
        egress                 = [
            {
                cidr_blocks      = [
                    "0.0.0.0/0",
                ]
                description      = ""
                from_port        = 0
                ipv6_cidr_blocks = []
                prefix_list_ids  = []
                protocol         = "-1"
                security_groups  = []
                self             = false
                to_port          = 0
            },
        ]
        id                     = "sg-d971f0bf"
      ~ ingress                = [
            {
                cidr_blocks      = []
                description      = ""
                from_port        = 6379
                ipv6_cidr_blocks = []
                prefix_list_ids  = []
                protocol         = "tcp"
                security_groups  = (known after apply)
                self             = false
                to_port          = 6379
            },
          - {
              - cidr_blocks      = []
              - description      = ""
              - from_port        = 6379
              - ipv6_cidr_blocks = []
              - prefix_list_ids  = []
              - protocol         = "tcp"
              - security_groups  = [
                  - "sg-0076ffd52cf89dd91",
                  - "sg-0445bbedb795a237d",
                  - "sg-4c5bd12b",
                  - "sg-c648c2a1",
                ]
              - self             = false
              - to_port          = 6379
            },
        ]
        name                   = "dsva-vagov-staging-redis-sg"
        owner_id               = "008577686731"
        revoke_rules_on_delete = false
        tags                   = {
            "Adminstration" = "VBA"
            "Environment"   = "STAGING"
            "Name"          = "dsva-vagov-staging-redis-sg"
            "Office"        = "tbd"
            "ProjectName"   = "vagov"
            "ProjectShort"  = "Vets.gov"
            "ResTag"        = "shared"
            "Role"          = "tbd"
            "VAECID"        = "AWG20180517003"
            "dsva_version"  = "v2"
            "environment"   = "staging"
            "group"         = "dsva"
            "managed"       = "terraform"
            "office"        = "osva"
            "project"       = "vagov"
            "provider"      = "aws"
            "region"        = "govcloud"
            "suboffice"     = "cto"
        }
        tags_all               = {
            "Adminstration" = "VBA"
            "Environment"   = "STAGING"
            "Name"          = "dsva-vagov-staging-redis-sg"
            "Office"        = "tbd"
            "ProjectName"   = "vagov"
            "ProjectShort"  = "Vets.gov"
            "ResTag"        = "shared"
            "Role"          = "tbd"
            "VAECID"        = "AWG20180517003"
            "dsva_version"  = "v2"
            "environment"   = "staging"
            "group"         = "dsva"
            "managed"       = "terraform"
            "office"        = "osva"
            "project"       = "vagov"
            "provider"      = "aws"
            "region"        = "govcloud"
            "suboffice"     = "cto"
        }
        vpc_id                 = "vpc-840147e1"
    }

Plan: 1 to add, 1 to change, 1 to destroy.
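
One detail visible in the dev plan above: the added and removed ingress rules for va-db-scan-sg list the same two CIDR blocks, only in a different order (their description fields differ as well). The following is a minimal sketch of an order-insensitive comparison that can help tell a reordering apart from a real difference. The helper name same_set is hypothetical and not part of the devops repo, and the sketch does not claim that ordering alone causes the reported drift.

```shell
#!/bin/sh
# same_set A B: succeed when two comma-separated lists contain the
# same items, ignoring order and duplicates. Hypothetical helper.
same_set() {
  set_a=$(printf '%s\n' "$1" | tr ',' '\n' | sort -u)
  set_b=$(printf '%s\n' "$2" | tr ',' '\n' | sort -u)
  [ "$set_a" = "$set_b" ]
}

# CIDR lists copied from the dev plan's "+" and "-" ingress blocks.
if same_set "10.247.2.35/32,10.247.2.59/32" "10.247.2.59/32,10.247.2.35/32"; then
  echo "same CIDR set"
else
  echo "CIDR sets differ"
fi
```

For the dev lists this prints "same CIDR set". The staging plan additionally removes a rule for 10.245.129.107/32, which is not a reordering.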

Running tf apply in the dev environment showed that Terraform only perceives these as changes; nothing was actually modified:

$ tf apply tfplan.out 
...
Apply complete! Resources: 0 added, 0 changed, 0 destroyed.
Releasing state lock. This may take a few moments...

This is a limitation of some Terraform versions, and its developers treat it as a bug. Research into upgrading to a current Terraform version is documented in the Confluence document "Terraform 0.13.x -> 1.8.5" and indicates that the perceived configuration drift does go away with a more current version.
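
The no-op result described above could be checked mechanically when repeating this exercise in other environments. Below is a minimal sketch that inspects the apply summary line. The helper name apply_was_noop is hypothetical, and the summary format is assumed to match the output quoted in this comment; other Terraform versions may word it differently.

```shell
#!/bin/sh
# apply_was_noop LINE: succeed when the apply summary reports that
# nothing was added, changed, or destroyed. Hypothetical helper; the
# expected wording is taken from the apply output quoted above.
apply_was_noop() {
  printf '%s\n' "$1" | grep -Eq 'Resources: 0 added, 0 changed, 0 destroyed\.'
}

summary='Apply complete! Resources: 0 added, 0 changed, 0 destroyed.'
if apply_was_noop "$summary"; then
  echo "perceived drift only: no real changes applied"
else
  echo "real changes were applied"
fi
```

In practice the summary line would come from the captured output of the apply command rather than a hard-coded string.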

jennb33 commented 3 weeks ago

10/2/2024 update: This ticket has been added to the Terraform Drift epic, and the work is complete per @flooose, so closing the ticket.