gruntwork-io / terragrunt

Terragrunt is a flexible orchestration tool that allows Infrastructure as Code written in OpenTofu/Terraform to scale.
https://terragrunt.gruntwork.io/
MIT License

Receiving S3 backend credential error when using an AWS config assumed role profile #616

Open ryno75 opened 5 years ago

ryno75 commented 5 years ago

I receive the following error when attempting to run terragrunt using an AWS config assumed role profile: Error configuring the backend "s3": No valid credential sources found for AWS Provider. The exact same terragrunt command works perfectly fine using an AWS config profile that uses either an IAM user (Access Key + Secret Key) or assumed credentials (Access Key + Secret Key + Token).

Here is a rundown...

> export AWS_PROFILE=aws.premium.staging
> export AWS_DEFAULT_PROFILE=aws.premium.staging
> aws sts get-caller-identity
{
    "UserId": "AROAIYKNPLZYTEIM2JL46:botocore-session-1543851525",
    "Account": "##REDACTED##",
    "Arn": "arn:aws:sts::##REDACTED##:assumed-role/OrganizationAccountAccessRole/botocore-session-1543851525"
}

> TF_LOG=debug TERRAGRUNT_DEBUG=true terragrunt plan
[terragrunt] [/Users/<REDACTED>/code/smar/terraform_premium/live/staging/IAM] 2018/12/03 07:40:13 Running command: terraform --version
[terragrunt] 2018/12/03 07:40:13 Reading Terragrunt config file at /Users/<REDACTED>/code/smar/terraform_premium/live/staging/IAM/terraform.tfvars
[terragrunt] 2018/12/03 07:40:13 WARNING: no double-slash (//) found in source URL /CloudOps/tfm_smar_iam.git. Relative paths in downloaded Terraform code may not work.
[terragrunt] 2018/12/03 07:40:13 Terraform files in /Users/<REDACTED>/code/smar/terraform_premium/live/staging/IAM/.terragrunt-cache/35z1Uw8yNAtpa2jho5yiyAsrb_M/KUZWpCLP92GDGc5x4p0Yt0vOivM are up to date. Will not download again.
[terragrunt] 2018/12/03 07:40:13 Copying files from /Users/<REDACTED>/code/smar/terraform_premium/live/staging/IAM into /Users/<REDACTED>/code/smar/terraform_premium/live/staging/IAM/.terragrunt-cache/35z1Uw8yNAtpa2jho5yiyAsrb_M/KUZWpCLP92GDGc5x4p0Yt0vOivM
[terragrunt] 2018/12/03 07:40:13 Setting working directory to /Users/<REDACTED>/code/smar/terraform_premium/live/staging/IAM/.terragrunt-cache/35z1Uw8yNAtpa2jho5yiyAsrb_M/KUZWpCLP92GDGc5x4p0Yt0vOivM
[terragrunt] 2018/12/03 07:40:15 Running command: terraform plan -var-file=/Users/<REDACTED>/code/smar/terraform_premium/live/staging/terraform.tfvars -var-file=/Users/<REDACTED>/code/smar/terraform_premium/live/staging/../common.tfvars
2018/12/03 07:40:15 [INFO] Terraform version: 0.11.10
2018/12/03 07:40:15 [INFO] Go runtime version: go1.11.1
2018/12/03 07:40:15 [INFO] CLI args: []string{"/usr/local/Cellar/terraform/0.11.10/bin/terraform", "plan", "-var-file=/Users/<REDACTED>/code/smar/terraform_premium/live/staging/terraform.tfvars", "-var-file=/Users/<REDACTED>/code/smar/terraform_premium/live/staging/../common.tfvars"}
2018/12/03 07:40:15 [DEBUG] Attempting to open CLI config file: /Users/<REDACTED>/.terraformrc
2018/12/03 07:40:15 [DEBUG] File doesn't exist, but doesn't need to. Ignoring.
2018/12/03 07:40:15 [INFO] CLI command args: []string{"plan", "-var-file=/Users/<REDACTED>/code/smar/terraform_premium/live/staging/terraform.tfvars", "-var-file=/Users/<REDACTED>/code/smar/terraform_premium/live/staging/../common.tfvars"}
2018/12/03 07:40:15 [INFO] Building AWS region structure
2018/12/03 07:40:15 [INFO] Building AWS auth structure
2018/12/03 07:40:15 [INFO] Setting AWS metadata API timeout to 100ms
2018/12/03 07:40:15 [INFO] Ignoring AWS metadata API endpoint at default location as it doesn't return any instance-id
2018/12/03 07:40:21 [DEBUG] plugin: waiting for all plugin processes to complete...
Failed to load backend:
Error configuring the backend "s3": No valid credential sources found for AWS Provider.
    Please see https://terraform.io/docs/providers/aws/index.html for more information on
    providing credentials for the AWS Provider

Please update the configuration in your Terraform files to fix this error.
If you'd like to update the configuration interactively without storing
the values in your configuration, run "terraform init".

However... I can use the same assumed role profile with terraform directly (or with the aws cli, as shown above) without issue.

ryno75 commented 5 years ago

Worth noting, this is what my AWS config profile looks like:

[profile aws.premium.staging]
region = us-west-2
role_arn = arn:aws:iam::##REDACTED##:role/OrganizationAccountAccessRole
source_profile = master_billing

Any profile that has credentials defined (in ~/.aws/credentials) works just fine.

ryno75 commented 5 years ago

I have discovered a workaround that may shed some light on the root cause. If I set the AWS_SDK_LOAD_CONFIG env var to a truthy value, terragrunt picks up AWS_DEFAULT_PROFILE and the standard (i.e. non-golang) AWS config properties, and works as expected using the assumed role profile.

Here's the interesting piece of the aws-sdk-go project that utilizes that var: https://github.com/aws/aws-sdk-go/blob/master/aws/session/env_config.go#L162

Not sure why this is a problem with terragrunt when I can execute terraform code directly just fine. Perhaps the version of aws-sdk-go in terragrunt is out of date?
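For reference, the workaround described above might look like this in a shell session. It is only a minimal sketch: the profile name is the one from the original report, and the `command -v` guard is an addition so the snippet is harmless on a machine without Terragrunt installed.

```shell
# Profile name taken from the report above; substitute your own.
export AWS_PROFILE=aws.premium.staging
export AWS_DEFAULT_PROFILE=aws.premium.staging

# Ask aws-sdk-go to also read ~/.aws/config (role_arn, source_profile),
# not only ~/.aws/credentials. The report says any truthy value works.
export AWS_SDK_LOAD_CONFIG=1

# Run the plan only if terragrunt is on the PATH.
if command -v terragrunt >/dev/null 2>&1; then
  terragrunt plan
fi
```

Exporting the variable in a shell profile or CI environment would apply it to every run, which may or may not be desirable depending on how other Go-based tools on the machine resolve credentials.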

ryno75 commented 5 years ago

Just forked terragrunt, updated aws-sdk-go to the latest version, did a go build, and I am still seeing this error when running with that custom-built binary. So this is likely a problem with how the terragrunt remote package handles assumed roles.

Another workaround is to use the --terragrunt-iam-role CLI option, TERRAGRUNT_IAM_ROLE env var, or the iam_role= option in the terragrunt{} section of the terragrunt.tfvars file (as outlined here: https://github.com/gruntwork-io/terragrunt#work-with-multiple-aws-accounts). Using any of those methods to assume the desired role instead of using an AWS config assumed role profile works as desired.
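A sketch of the environment variable and CLI flag variants of this workaround follows. The account ID is a made-up placeholder, and starting from the `master_billing` profile is an assumption based on the `source_profile` shown earlier in the thread; the `command -v` guard is only there so the snippet is safe to paste where Terragrunt is absent.

```shell
# Hypothetical account ID; replace with the target account.
ACCOUNT_ID="123456789012"
ROLE_ARN="arn:aws:iam::${ACCOUNT_ID}:role/OrganizationAccountAccessRole"

# Assumption: start from a profile that has real credentials in
# ~/.aws/credentials (the source_profile from the config shown earlier).
export AWS_PROFILE=master_billing

# Variant 1: let Terragrunt assume the role via the environment variable.
export TERRAGRUNT_IAM_ROLE="$ROLE_ARN"

if command -v terragrunt >/dev/null 2>&1; then
  terragrunt plan

  # Variant 2: pass the role explicitly on the command line instead.
  terragrunt plan --terragrunt-iam-role "$ROLE_ARN"
fi
```

Either variant keeps the role assumption inside Terragrunt rather than relying on the SDK to resolve `role_arn` and `source_profile` from the AWS config file.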

brikis98 commented 5 years ago

Might have to do with how we configure the profile to use here, which is read from the backend config, but does not seem to take env vars into account.

JoshiiSinfield commented 5 years ago

Hi, we're also seeing this issue, or one very similar.

We've done lots of trial and error, along with trying to understand the source code and what terragrunt is doing with regards to assuming roles.

We see the issue when we configure the remote state backend to use a role in a different account from the iam_role we configure within terragrunt {}.

state configuration

terragrunt = {
  remote_state {
    backend = "s3"
    config {
      bucket  = "account1-bucket-name"
      key     = "josh-test-terragrunt-assume-issues"
      region  = "eu-west-1"
      encrypt = true
      role_arn = "arn:aws:iam::12345678901:role/account1Role"
    }
  }
}

terragrunt block

terragrunt = {
  terraform {
    source = "..."
  }

  iam_role = "arn:aws:iam::12345678902:role/account2Role"
}

I've split the above configuration for visual purposes. The same issue occurs whether including from parent or all in one file.

What we think is happening is that during assumeRoleIfNecessary (here / here), the assumed role's credentials are set as environment variables.

Due to the order of precedence in how the aws-sdk resolves credentials, those environment credentials are then used as the starting identity when the remote_state role is assumed further on within runTerragruntWithConfig, when it prepares the init command (here / here).

Please note, we are Golang novices and haven't performed any code debugging, we've come to this conclusion from simply reading the code so may well be very wrong.
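One rough way to check this hypothesis from the outside is to look at which credential source would win in a given environment. The helper below is a hypothetical, deliberately simplified model of the AWS SDK lookup order (environment variables before profiles from the shared config and credentials files); `credential_source` is not part of Terragrunt or the AWS CLI.

```shell
# Hypothetical helper: report which credential source a simplified model of
# the AWS SDK default chain would pick first in the current environment.
credential_source() {
  if [ -n "${AWS_ACCESS_KEY_ID:-}" ] && [ -n "${AWS_SECRET_ACCESS_KEY:-}" ]; then
    # Static or temporary keys in the environment take precedence.
    echo "environment"
  elif [ -n "${AWS_PROFILE:-}" ]; then
    # Otherwise a named profile from the shared files is used.
    echo "profile:${AWS_PROFILE}"
  else
    # Otherwise the SDK falls back to the "default" profile.
    echo "profile:default"
  fi
}

# Print the source that would be used right now.
credential_source
```

If the hypothesis is right, anything that runs after Terragrunt has exported the iam_role credentials would report "environment", so the backend's role_arn would be assumed from account2Role rather than from the original profile.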

This issue does not occur when using provider blocks (static configuration in HCL) due to Terraform's order of precedence.

Cheers, Josh

jonathanhle commented 5 years ago

@JoshiiSinfield I'm running into the same exact issue. Thanks for your post/comment. Did you ever find a workaround?

JoshiiSinfield commented 5 years ago

Hi @jonathanhle,

Not for getting terragrunt to perform the assume_role, no. I was hoping to dive deep into the code but just haven't had the chance.

Instead we use a providers.tf with the Terraform provider blocks, as was the pattern before terragrunt was able to assume a role.

I do plan on having a go at working it out. I need to up my go skills first though!

Cheers, Josh

rtizzy commented 5 years ago

@JoshiiSinfield

I'm seeing the same failure mode.

@jonathanhle

Here is what I've done to work around this.

I was previously using a static provider.tf file with aliases to manage this.

To make this more dynamic, I added the following to a provider.tf file in each of my modules:

provider "aws" {
  alias  = "alias1"
  region = "${var.a_region}"

  assume_role {
    role_arn = "${var.assume_role_arn}:role/OrganizationAccountAccessRole"
  }
}

And to each vars.tf:

variable "assume_role_arn" {
  description = "ARN prefix (arn:aws:iam::<account-id>) of the account to assume into; the provider block appends the role name."
}

variable "a_region" {
  description = "Region the resources should be placed in"
}

I can then pass in the correct Role ARN to utilize via any of the methods Terraform provides (.tfvars, CLI, Env Variables).
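As an illustration of the environment variable route, the values could be supplied like this. The account ID and region are placeholders; note that the provider block above appends ":role/OrganizationAccountAccessRole", so the variable holds only the ARN prefix. The `command -v` guard is an addition so the snippet does nothing where Terraform is not installed.

```shell
# Hypothetical values; replace with the target account and region.
# Only the ARN prefix goes here, because the provider block appends
# ":role/OrganizationAccountAccessRole" to it.
export TF_VAR_assume_role_arn="arn:aws:iam::123456789012"
export TF_VAR_a_region="us-west-2"

# Terraform reads TF_VAR_<name> from the environment for each declared variable.
if command -v terraform >/dev/null 2>&1; then
  terraform plan
  # Equivalent, using CLI flags instead of the environment:
  #   terraform plan -var "a_region=us-west-2" \
  #     -var "assume_role_arn=arn:aws:iam::123456789012"
fi
```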

JoshiiSinfield commented 5 years ago

Hi @brikis98 Did you get a chance to review this at all?

Unfortunately I've not had a chance to dive in and try to fix it yet. I'm keen to hear any ideas/comments you have before I do.

Cheers, Josh

brikis98 commented 5 years ago

@JoshiiSinfield Other than what I wrote in https://github.com/gruntwork-io/terragrunt/issues/616#issuecomment-444185765, no, I haven't had a chance to dig in deeper.