hashicorp / terraform-provider-aws

The AWS Provider enables Terraform to manage AWS resources.
https://registry.terraform.io/providers/hashicorp/aws
Mozilla Public License 2.0

[Bug]: EC2 root volume creation fails when TagPolicies are present #35639

Open ETisREAL opened 7 months ago

ETisREAL commented 7 months ago

Terraform Core Version

1.7.2

AWS Provider Version

5.34.0

Affected Resource(s)

aws_instance

Expected Behavior

I have this SCP Policy:

{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Sid": "EnforceMandatoryTagsPresence",
      "Effect": "Deny",
      "Action": [
        "ec2:RunInstances"
      ],
      "Resource": [
        "arn:aws:ec2:*:*:instance/*",
        "arn:aws:ec2:*:*:volume/*"
      ],
      "Condition": {
        "Null": {
          "aws:RequestTag/infrastructure": "true",
          "aws:RequestTag/stage": "true"
        }
      }
    },
    {
      "Sid": "DenyMandatoryTagsRemoval",
      "Effect": "Deny",
      "Action": [
        "ec2:DeleteTags"
      ],
      "Resource": [
        "arn:aws:ec2:*:*:instance/*",
        "arn:aws:ec2:*:*:volume/*"
      ],
      "Condition": {
        "Null": {
          "aws:RequestTag/infrastructure": "false",
          "aws:RequestTag/stage": "false"
        }
      }
    }
  ]
}

This is how I am creating the EC2 instance:

resource "aws_instance" "ec2_instance" {
  ami           = data.aws_ami.al2023_ami.id
  instance_type = var.instance_type

  root_block_device {
    volume_size = 10
    tags        = merge({
      Name = "EC2_root_block_device"
    }, var.tags)
  }

  tags = merge({
    Name = "EC2_instance"
  }, var.tags)
}

And finally, these are the tags I am setting:

module "backups" {
  source        = "./modules/ec2-backups"
  instance_type = "t2.micro"
  tags          = {
    infrastructure = "tf"
    stage          = "STG"
  }
}

The required tags are set on both resources, the EC2 instance and the root EBS volume, and the plan output shows them.

Actual Behavior

The request fails with an explicit deny.

This only happens if the policy applies to volumes as well. If I remove the volume restriction, it creates the resources as it should.

This should apparently have been fixed by PR #6396. Unfortunately, the issue is alive and well, at least for the root volume.

Relevant Error/Panic Output Snippet

╷
│ Error: creating EC2 Instance: UnauthorizedOperation: You are not authorized to perform this operation. User: arn:aws:sts::851725412353:assumed-role/AWSReservedSSO_Staging-AdministratorAccess_2760ed54e811856f/staging-user is not authorized to perform: ec2:RunInstances on resource: arn:aws:ec2:eu-central-1:851725412353:volume/* with an explicit deny in a service control policy. Encoded authorization failure message: 8P057QR9qwAzU6KiP-hTV9qfcnwr2oZ82zTkctGjDHwmxfxHeklP0xMbjLrWvgSaHdcy6cizTBN52bb7KTXevRa5IPZS3_QMcA7S7ULX2YI-f6pV6mvZ52cLjl0XQ-hZya7dgWE3fJzA4q-6im2Evpor09rFNlIhHd77c3FLZUBA31fmL501ButYEdgKCORGWjVjBgKCdgMx1ixVRPn1SVUcLv3TLAMXY623_H7r8-FpHzD3DYUEohi4-ghlxyIaQIc-F1gJf06qqKbT0zycUpU_1yW-_16wYvz-_dsGCuXr7Prmc2MrF__oLI6-ceEAmYX32f7XOR35l1Mj_NE0rUM8WCuGxpOPrQfVIltBVerQwB9PAVtdcV5jUx6Y_Du9S7pRlneKiXEeCN11FbGVlOEX0o3vmueuDVUNdHxpBEuyHWjDxsbBd5iiKDGoLN1h5vX4-XPFBe8iOuoHovgCQ6rz75bMuIy7xFUf_iiV6Gg2jrPyxFrMOU0jR5BSLoMJQD31kgNrBalDGeeS0JEa-6YiaT2KXyqDFxeOLtyeCaQdALK0jWa8BZLWJmMaZsnU_1zctBaRhovTWTUGAAxCLEnECZFKKYpcwVLK_GlVAEqUsoEB5QZvfn4v0kut78b7ukJhi-e0fwOieiscjiM3BeRF1SEzHmxOsFLFEnXkLlZCB2Y_EybBW0SH5L0aohsm2Hs6NV75ibXFcF7m0Z_V0PwWmFQseGiwsVFcrxbJN8pbw5ce5nHiBCSTgZ-cPaqf_37Gn1kfpSpXEYCoXa57T1XCV2UjKWdRN2gf-R68HzRrTitjgMWZaZ1NYC05p6Id9rExgeg8EVPIqayL2syOIfwLOuO-XN5Bpa323hGFhPj2yfRvv88
│       status code: 403, request id: 7c570852-fd34-497b-8d15-2a2b50d4d6c6

Terraform Configuration Files

terraform {
  backend "s3" {
    bucket  = "janji-terraform-bucket"
    key     = "tfstate"
    region  = "eu-central-1"
    profile = "janji"
  }

  required_providers {
    aws = {
      source  = "hashicorp/aws"
      version = "~> 5.30"
    }
  }
}

provider "aws" {
  region  = var.aws_region
  profile = "janji"

  default_tags {
    tags = {
      Environment = "Test"
      Owner       = "TFProviders"
      Project     = "Test"
    }
  }
}

Steps to Reproduce

  1. Create an SCP policy with the snippet I provided
  2. Create an EC2 instance tagging both the instance itself and the root volume

Debug Output

http.response.body= | | | arn:aws:sts::851725412353:assumed-role/AWSReservedSSO_Staging-AdministratorAccess_2760ed54e811856f/staging-user | AROA*****TW5G:staging-user | 851725412353 | | | ebc47c36-37e0-48e5-9393-b0bf87e68b7a | | rpc.service=STS tf_mux_provider="schema.GRPCProviderServer" http.duration=307 http.response.header.content_type=text/xml @caller=github.com/hashicorp/aws-sdk-go-base/v2@v2.0.0-beta.46/logging/tf_logger.go:47 timestamp="2024-02-05T09:26:18.544+0100"
2024-02-05T09:26:18.544+0100 [INFO] provider.terraform-provider-aws_v5.34.0_x5: Retrieved caller identity from STS: @caller=github.com/hashicorp/aws-sdk-go-base/v2@v2.0.0-beta.46/logging/tf_logger.go:39 @module=aws.aws-base tf_mux_provider="schema.GRPCProviderServer" tf_provider_addr=registry.terraform.io/hashicorp/aws tf_req_id=d4b1447d-6d7d-0b22-b84a-df76746761ef tf_rpc=ConfigureProvider timestamp="2024-02-05T09:26:18.544+0100"
2024-02-05T09:26:18.547+0100 [DEBUG] Resource instance state not found for node "module.backups.data.aws_ami.al2023_ami", instance module.backups.data.aws_ami.al2023_ami
2024-02-05T09:26:18.547+0100 [DEBUG] ReferenceTransformer: "module.backups.data.aws_ami.al2023_ami" references: []
2024-02-05T09:26:18.552+0100 [DEBUG] provider.terraform-provider-aws_v5.34.0_x5: HTTP Request Sent: http.request.header.x_amz_security_token="*" http.request_content_length=185 rpc.service=EC2 http.url=https://ec2.eu-central-1.amazonaws.com/ http.request.header.x_amz_date=20240205T082618Z @caller=github.com/hashicorp/aws-sdk-go-base/v2/awsv1shim/v2@v2.0.0-beta.47/logger.go:109 http.request.header.authorization="AWS4-HMAC-SHA256 Credential=ASIA****WLPY/20240205/eu-central-1/ec2/aws4_request, SignedHeaders=content-length;content-type;host;x-amz-date;x-amz-security-token, Signature=****" tf_mux_provider="schema.GRPCProviderServer" @module=aws http.method=POST http.request.body= | Action=DescribeImages&Filter.1.Name=architecture&Filter.1.Value.1=x86_64&Filter.2.Name=name&Filter.2.Value.1=al2023-ami-2023%2A&IncludeDeprecated=false&Owner.1=amazon&Version=2016-11-15 http.request.header.content_type="application/x-www-form-urlencoded; charset=utf-8" rpc.system=aws-api tf_aws.sdk=aws-sdk-go http.flavor=1.1 http.user_agent="APN/1.0 HashiCorp/1.0 Terraform/1.7.2 (+https://www.terraform.io) terraform-provider-aws/5.34.0 (+https://registry.terraform.io/providers/hashicorp/aws) aws-sdk-go/1.49.24 (go1.20.12; linux; amd64)" tf_req_id=a73ca108-6064-e174-5626-61238fe9e338 tf_rpc=ReadDataSource aws.region=eu-central-1 net.peer.name=ec2.eu-central-1.amazonaws.com rpc.method=DescribeImages tf_data_source_type=aws_ami tf_provider_addr=registry.terraform.io/hashicorp/aws timestamp="2024-02-05T09:26:18.552+0100"
2024-02-05T09:26:19.158+0100 [DEBUG] provider.terraform-provider-aws_v5.34.0_x5: HTTP Response Received: http.status_code=200 tf_mux_provider="*schema.GRPCProviderServer" @caller=github.com/hashicorp/aws-sdk-go-base/v2/awsv1shim/v2@v2.0.0-beta.47/logger.go:157 http.response.header.cache_control="no-cache, no-store" tf_provider_addr=registry.terraform.io/hashicorp/aws http.response.header.content_type=text/xml;charset=UTF-8 http.response.header.strict_transport_security="max-age=31536000; includeSubDomains" rpc.system=aws-api tf_req_id=a73ca108-6064-e174-5626-61238fe9e338 http.response.header.server=AmazonEC2 http.response.header.x_amzn_requestid=78f976a7-57ea-4bd2-9d57-61fc27608bb9 tf_data_source_type=aws_ami @module=aws aws.region=eu-central-1 http.response.header.vary=accept-encoding rpc.method=DescribeImages rpc.service=EC2 http.response.body= | <?xml version="1.0" encoding="UTF-8"?> | | 78f976a7-57ea-4bd2-9d57-61fc27608bb9 | | | ami-024f768332f080c5e | amazon/al2023-ami-2023.3.20231218.0-kernel-6.1-x86_64 | available | 137112412989 | 2023-12-15T02:10:49.000Z | true | x86_64 | machine | simple | amazon | al2023-ami-2023.3.20231218.0-kernel-6.1-x86_64 | Amazon Linux 2023 AMI 2023.3.20231218.0 x86_64 HVM kernel-6.1 | ebs | /dev/xvda | | | /dev/xvda | | snap-00099e2296a49daa3 | 8 | true | gp3 | 3000 | false | 125 | | | | hvm | xen | true | Linux/UNIX | RunInstances | uefi-preferred | v2.0 | 2024-03-14T02:11:00.000Z | | | ami-09024b009ae9e7adf | amazon/al2023-ami-2023.3.20240122.0-kernel-6.1-x86_64 | available | 137112412989 | 2024-01-20T00:06:46.000Z | true | x86_64 | machine | simple | amazon | al2023-ami-2023.3.20240122.0-kernel-6.1-x86_64 | Amazon Linux 2023 AMI 2023.3.20240122.0 x86_64 HVM kernel-6.1 | ebs | /dev/xvda | | | /dev/xvda | | snap-0213b3d8f06c22f65 | 8 | true | gp3 | 3000 | false | 125 | | | | hvm | xen | true | Linux/UNIX | RunInstances | uefi-preferred | v2.0 | 2024-04-19T00:07:00.000Z | | | ami-0292a7daeeda6b851 | amazon/al2023-ami-2023.3.20240117.0-kernel-6.1-x86_64 | [truncated...] tf_rpc=ReadDataSource http.duration=605 http.response.header.date="Mon, 05 Feb 2024 08:26:18 GMT" tf_aws.sdk=aws-sdk-go timestamp="2024-02-05T09:26:19.158+0100"
2024-02-05T09:26:19.159+0100 [DEBUG] provider.terraform-provider-aws_v5.34.0_x5: [DEBUG] aws_ami - adding block device mapping: map[device_name:/dev/xvda ebs:map[delete_on_termination:true encrypted:false iops:3000 snapshot_id:snap-0f2cb3b54923f557a throughput:125 volume_size:8 volume_type:gp3] virtual_name:]
2024-02-05T09:26:19.162+0100 [DEBUG] Resource instance state not found for node "module.backups.aws_instance.ec2_instance", instance module.backups.aws_instance.ec2_instance
2024-02-05T09:26:19.162+0100 [DEBUG] ReferenceTransformer: "module.backups.aws_instance.ec2_instance" references: []
2024-02-05T09:26:19.162+0100 [DEBUG] refresh: module.backups.aws_instance.ec2_instance: no state, so not refreshing
2024-02-05T09:26:19.171+0100 [WARN] Provider "registry.terraform.io/hashicorp/aws" produced an invalid plan for module.backups.aws_instance.ec2_instance, but we are tolerating it because it is using the legacy plugin SDK. The following problems may be the cause of any confusing errors from downstream operations:

Panic Output

No response

Important Factoids

No response

References

PRs that supposedly fixed the behaviour: #6396, #17412

Would you like to implement a fix?

None

Pavankumar66 commented 7 months ago

I'm also facing a similar issue with such an SCP. Is there a timeline for when the related fix will be implemented in Terraform?

Note: In CloudFormation this is already being handled with the property PropagateTagsToVolumeOnCreation.

nantiferov commented 2 weeks ago

If you use a provider version > v5.39 and the tags are generic enough to put in the provider's default_tags, that might fix the issue with SCP tag enforcement.

However, keep in mind that if you change tags in default_tags, Terraform will try to remove them 😞 (see issue https://github.com/hashicorp/terraform-provider-aws/issues/38301).
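
For illustration, the default_tags workaround described above might look like the following sketch. It assumes the two mandatory tags from the SCP (infrastructure and stage) can be shared by every resource the provider manages. The tag values, region variable and profile are taken from the original report, the version constraint simply mirrors the "> v5.39" remark, and none of this has been verified against the SCP here.

```hcl
terraform {
  required_providers {
    aws = {
      source = "hashicorp/aws"
      # Assumption: mirrors the "> v5.39" remark in this comment;
      # the original report pinned "~> 5.30" and ran 5.34.0.
      version = "> 5.39"
    }
  }
}

provider "aws" {
  region  = var.aws_region
  profile = "janji"

  # Assumption: the SCP-mandated tags are moved from the module's
  # var.tags into default_tags, so that the provider can include them
  # in the volume TagSpecification of the RunInstances request.
  default_tags {
    tags = {
      infrastructure = "tf"
      stage          = "STG"
    }
  }
}
```

With this in place, the tags blocks on aws_instance and root_block_device would only need to carry resource-specific keys such as Name. The caveat about changing default_tags still applies.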

P.S. I dug into this a little: with tags defined in the provider's default_tags, they are set for the volume in the RunInstances API call itself. In the generic case they are also set properly, but only after RunInstances succeeds. Example:

# without tags in default_tags:
http.request.body=
  Action=RunInstances
  ...
  TagSpecification.1.ResourceType=instance
  TagSpecification.1.Tag.1.Key=foo
  TagSpecification.1.Tag.1.Value=bar

# with tags in provider default_tags
http.request.body=
  Action=RunInstances
  ...
  TagSpecification.1.ResourceType=instance
  TagSpecification.1.Tag.1.Key=foo
  TagSpecification.1.Tag.1.Value=bar
  ...
  TagSpecification.2.ResourceType=volume
  TagSpecification.2.Tag.1.Key=foo
  TagSpecification.2.Tag.1.Value=bar

ETisREAL commented 1 week ago

@nantiferov Thanks, will keep it in mind :) I much appreciate the feedback.