gruntwork-io / terragrunt

Terragrunt is a flexible orchestration tool that allows Infrastructure as Code written in OpenTofu/Terraform to scale.
https://terragrunt.gruntwork.io/
MIT License
7.99k stars 970 forks source link

Double assume role in EKS #2494

Open Onlinehead opened 1 year ago

Onlinehead commented 1 year ago

I have a problem with double role assumption in EKS.

Environment description

The installation:

  1. We have a set of multi-env terraforms we run with Terragrunts.
  2. For AWS provider in each env we have a separate TF role with a set of permissions and use assume-role there.
  3. We have a EKS cluster where Terragrunt deployed by the Helm chart in the account 222222222.
  4. IAM roles association done based on the AWS recommendations - https://docs.aws.amazon.com/eks/latest/userguide/iam-roles-for-service-accounts.html

Roles:

Provider block:

provider "aws" {
  region  = var.aws_region
  version = "~> 3.34"
  profile = "metabase_mfa"
  assume_role {
    role_arn     = "arn:aws:iam::11111111111:role/terraform-dev"
    session_name = "${get_env("USER", "terragrunt-dev")}"
  }
}

Terragrunt version - terragrunt version v0.24.4

Expected behaviour

Terragrunt should assume the role arn:aws:iam::222222222:role/eks-terragrunt based on the AWS_ROLE_ARN and AWS_WEB_IDENTITY_TOKEN_FILE and then using that role it should assume roles from the Terraform code which are set for providers.

Actual behaviour

Terragrunt ignores AWS_ROLE_ARN and AWS_WEB_IDENTITY_TOKEN_FILE environment variables and stops with: Error finding AWS credentials (did you set the AWS_ACCESS_KEY_ID and AWS_SECRET_ACCESS_KEY environment variables?): AccessDenied: User: arn:aws:sts::222222222:assumed-role/eks-cluster-service-nodes/i-0b90ac49749e1df22 is not authorized to perform: sts:AssumeRole on resource: arn:aws:iam::11111111111:role/terraform-dev.

At the same time, from the same pod, command aws sts assume-role --role-arn arn:aws:iam::11111111111:role/terraform-dev --role-session-name testing works well and return a valid JSON.

Also, I tried to call it with a --terragrunt-iam-role CLI arg.

Onlinehead commented 1 year ago

I want to share a work-around for the issue (in case someone else will meet that problem). Because Terragrunt works well with a standard AWS credentials files, it is possible to create a standard AWS $HOME\credentials file with the following content:

[default]
  role_arn = arn:aws:iam::2222222222:role/eks-terragrunt
  role_session_name = terragrunt
  web_identity_token_file=/var/run/secrets/eks.amazonaws.com/serviceaccount/token
  region = us-east-1
[terragrunt_dev]
  role_arn = arn:aws:iam::11111111111:role/terraform-dev
  source_profile = default
  region = us-west-2

Together with the profile in the provider Terragrunt block:

provider "aws" {
  region  = var.aws_region
  profile = "terragrunt_dev"
}

where you set the profile you need it will work as expected and correctly assume role using dynamic credentials provided by EKS.

jmreicha commented 1 year ago

Just ran into this one. Was really hoping to avoid having to maintain a credentials file and provider profile due to the large and changing number of accounts I am managing. My thought was that maybe it is possible to create the credentials on the fly but so far haven’t had any luck.

eWilliams35 commented 1 year ago

This is brutal... I've got dozens of pipelines we're switching over from IAM users / keys to assumed roles. I'm set up the same way with a service account role on my Jenkins pod and using a withENV block to set AWS_PROFILE. aws sts get-caller-identity works, all aws cli commands work, but terragrunt refuses to honor the profile.

https://terragrunt.gruntwork.io/docs/features/work-with-multiple-aws-accounts/

There is an environment variable the bottom of that article references, essentially specifying the full role arn you want to assume in the pod. I'm stuck looking up the account numbers from the profiles and adding that env var anywhere I run terragrunt.