hashicorp / terraform-provider-aws

The AWS Provider enables Terraform to manage AWS resources.
https://registry.terraform.io/providers/hashicorp/aws
Mozilla Public License 2.0

Defining AWS provider inside module causes AccessDenied if module is removed #369

Closed · hashibot closed this issue 4 years ago

hashibot commented 7 years ago

This issue was originally opened by @jonatanblue as hashicorp/terraform#10097. It was migrated here as part of the provider split. The original body of the issue is below.


Terraform Version

0.7.10

Affected Resource(s)

aws_route53_record. This may affect other resources as well, since the issue has to do with defining a provider inside a module.

Terraform Configuration Files

provider "aws" {
  access_key = "${var.DNS_AWS_ACCESS_KEY_ID}"
  secret_key = "${var.DNS_AWS_SECRET_ACCESS_KEY}"
  region = "eu-west-1"
}

resource "aws_route53_record" "record" {
    zone_id = "redacted"
    name = "${var.sub_domain}.example.com"
    type = "${var.dns_type}"
    ttl = "${var.ttl}"
    records = ["${var.records}"]
}
variable "sub_domain" {}
variable "dns_type" {}  # e.g "CNAME" or "A"
variable "ttl" {default="300"}
variable "records" {default=[]}
variable "DNS_AWS_ACCESS_KEY_ID" {}
variable "DNS_AWS_SECRET_ACCESS_KEY" {}
provider "aws" {
  region = "${var.aws_region}"
}
...
module "my_dns_record" {
  source = "./modules/route53"
  sub_domain = "thing-${var.env}"
  records = ["some-other-dns.example.com"]
  dns_type = "CNAME"
  DNS_AWS_ACCESS_KEY_ID = "${var.DNS_AWS_ACCESS_KEY_ID}"
  DNS_AWS_SECRET_ACCESS_KEY = "${var.DNS_AWS_SECRET_ACCESS_KEY}"
}
...
env = "dev"
aws_account_id = "redacted"
vpc_id = "redacted"
...

The following environment variables are exported before running terraform:

AWS_ACCESS_KEY_ID=redacted
AWS_SECRET_ACCESS_KEY=redacted
TF_VAR_DNS_AWS_ACCESS_KEY_ID=redacted
TF_VAR_DNS_AWS_SECRET_ACCESS_KEY=redacted

This setup uses two separate AWS accounts.

The first keypair above gives access to the first account, where all resources except DNS records are created. The second keypair (passed in as TF_VARs and used by the aws_route53_record module) belongs to a different AWS account that is used to manage only DNS records.
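A quick way to confirm which account each keypair resolves to, assuming the AWS CLI is available, is to ask STS for the caller identity under each set of credentials:

# Identity behind the first keypair (the non-DNS account)
AWS_ACCESS_KEY_ID=redacted AWS_SECRET_ACCESS_KEY=redacted \
  aws sts get-caller-identity

# Identity behind the DNS-only keypair passed in via TF_VARs
AWS_ACCESS_KEY_ID=redacted AWS_SECRET_ACCESS_KEY=redacted \
  aws sts get-caller-identity

The Account field in each response shows which account the credentials belong to; the first one is the account ID that later shows up in the AccessDenied ARN.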

Expected Behavior

After having successfully run terraform plan and terraform apply with the configuration above, I remove the my_dns_record module entry from root.tf.

When I run terraform plan again I expect the plan to say 1 resource to remove.

Actual Behavior

terraform plan returns the following error:

Error refreshing state: 1 error(s) occurred:

* aws_route53_record.record: AccessDenied: User: arn:aws:iam::000011110000:user/myuser is not authorized to access this resource
    status code: 403, request id: abcd1234-aa4d-11e6-9809-d7a9253cdcf5

Where 000011110000 is the ID of the first AWS account, which here (correctly) is denied access to the resource in the second account.

I believe this is because when the module is removed, terraform no longer has access to the provider inside the module. This may be intended, so if there's a workaround for this, or a better way to handle this use case, please let me know.

This means that there is no way to delete the resource when a different provider is specified inside the module definition.
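Later Terraform versions (0.11 and up) added a providers argument on module blocks, and the commonly recommended pattern is to keep every provider configuration in the root module and pass the aliased one into the module, so the configuration outlives the module block. A minimal sketch, using 0.12-style syntax and an alias name (dns) chosen purely for illustration:

provider "aws" {
  region = "${var.aws_region}"
}

# Credentials for the DNS-only account live in the root module now.
provider "aws" {
  alias      = "dns"
  access_key = "${var.DNS_AWS_ACCESS_KEY_ID}"
  secret_key = "${var.DNS_AWS_SECRET_ACCESS_KEY}"
  region     = "eu-west-1"
}

# The module itself no longer defines a provider block; it receives
# the aliased configuration from the root module instead.
module "my_dns_record" {
  source     = "./modules/route53"
  sub_domain = "thing-${var.env}"
  records    = ["some-other-dns.example.com"]
  dns_type   = "CNAME"

  providers = {
    aws = aws.dns
  }
}

Because the DNS provider configuration now lives in the root module, it survives removal of the module block, and Terraform can still reach the second account to destroy the orphaned record.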

Steps to Reproduce

  1. terraform plan and terraform apply with configuration above creating some resources in one AWS account, and DNS resources in another.
  2. Remove the DNS resources defined above
  3. Run terraform plan again
  4. Receive 403.
ghost commented 6 years ago

I have run into the same error, with slightly different circumstances. In my case, my module has no provider defined, but the template that uses the module has two AWS providers: one for the account that holds the resources, and a separate provider that uses the same credentials but adds an assume_role block specifying a role in the DNS hosting account. I gave the DNS provider an alias of "old_account" and defined my resource as follows:

provider "aws" {
  region = "${var.aws_region}"
}

provider "aws" {
  alias = "old_account"
  region = "${var.aws_old_account_region}"
  assume_role {
    role_arn = "arn:aws:iam::redacted_account_id:role/RedactedRole"
  }
}

resource "aws_route53_zone" "main" {
  provider = "aws.old_account"
  name = "example.com"
}

Then I tried to

terraform import aws_route53_zone.main <redacted_zone_id>

The results are a mix of success and failure: it seems to import the zone correctly, but it then fails when refreshing state for the template after the import has completed. It reports AccessDenied, but it doesn't log which AWS call is actually failing. I imagine I just need to grant extra permissions to the role in question, since it only has Route 53 access at the moment, but I need more of a clue as to which permissions the provider is trying to use.

[terragrunt] 2017/11/14 13:04:04 Running command: terraform import -var-file=/Users/sgendler/src/stem/stem-envs/aws2/us-east-1/_global/vpc_mgmt/../../../account.tfvars -var-file=/Users/sgendler/src/stem/stem-envs/aws2/us-east-1/_global/vpc_mgmt/../../region.tfvars -var-file=/Users/sgendler/src/stem/stem-envs/aws2/us-east-1/_global/vpc_mgmt/../env.tfvars -var-file=/Users/sgendler/src/stem/stem-envs/aws2/us-east-1/_global/vpc_mgmt/terraform.tfvars -lock-timeout=20m aws_route53_zone.main redacted
aws_route53_zone.main: Importing from ID "redacted"...
aws_route53_zone.main: Import complete!
  Imported aws_route53_zone (ID: redacted)
aws_route53_zone.main: Refreshing state... (ID: <redacted>)
Error importing: 1 error(s) occurred:

* aws_route53_zone.main (import id: <redacted>): 1 error(s) occurred:

* import aws_route53_zone.stemops result: <redacted zone id>: aws_route53_zone.main: AccessDenied: User: arn:aws:iam::redacted:user/redacted is not authorized to access this resource
    status code: 403, request id: <redacted UUID>
[terragrunt] 2017/11/14 13:04:47 exit status 1
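One possibility worth checking here: older Terraform releases had a -provider flag on terraform import for selecting a non-default provider configuration (it was later deprecated and removed). Assuming that flag is available in the version in use, something along these lines would make the import itself run under the assume_role configuration; whether it also helps the post-import refresh depends on the Terraform version:

terraform import -provider=aws.old_account aws_route53_zone.main <redacted_zone_id>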
jaydeland commented 6 years ago

For me, this is failing with a Tectonic deployment.

ghost commented 6 years ago

It turns out that I was able to apply the template with no difficulty. It actually created a second hosted zone with the exact same domain name. I have no clue how Amazon will actually handle that, but since I'm not working with a domain that is in production yet, that's good enough for me. I'll just migrate the records from our original hosted zone into the Terraform-managed zone and then delete the old one. But importing sure would be a big benefit. My guess is that it correctly uses the assumed-role credentials for whatever happens in the first pass, but then loses those credentials when it attempts to refresh the state, so it gets denied because it is no longer running as the assumed role. Fortunately, plan and apply seem to work correctly, so it is just an import problem. The fact that planning and applying work cross-account also implies that my credentials are set up correctly, so it must be a problem in Terraform (or Terragrunt, in my case, but I suspect Terraform here).

jaydeland commented 6 years ago

Thanks @sgendler-stem.

I also found that in my last failed apply the zone was removed, so subsequent runs with tectonic_aws_external_private_zone set to the old ID failed with the permissions error above. Once I removed that setting from my var file, it created a new zone.

ghost commented 6 years ago

Just to confirm - I've done lots of updates and created a variety of subdomains now, and everything works just fine, despite the fact that my TLD is hosted in one AWS account and the subdomains are hosted in another. Import doesn't work, but plan and apply do, as does taint. I haven't tried destroy.

bflad commented 4 years ago

Hi folks 👋 Reading through this older issue, it appears related to functionality that is handled in Terraform core where that upstream logic is responsible for removing old provider configurations from the Terraform state when removing modules. Since the original issue was filed against Terraform 0.7 there have been many major releases to this logic and it is likely that any root causes would have changed in the meantime. If there are still lingering issues relating to this on more recent versions of Terraform (0.12.28 is the latest as of this writing), please file a new issue and we can take a fresh look. Thanks.

ghost commented 4 years ago

I'm going to lock this issue because it has been closed for 30 days ⏳. This helps our maintainers find and focus on the active issues.

If you feel this issue should be reopened, we encourage creating a new issue linking back to this one for added context. Thanks!