gruntwork-io / terragrunt

Terragrunt is a flexible orchestration tool that allows Infrastructure as Code written in OpenTofu/Terraform to scale.
https://terragrunt.gruntwork.io/
MIT License
7.95k stars 966 forks source link

Error finding AWS credentials for S3 remote state #671

Open VladimirShushkov opened 5 years ago

VladimirShushkov commented 5 years ago

Hi guys,

Kindly asking for your advice. I see that that the issue was previously discussed several times from different angles. I can't make our TG code working with my creds in AWS multiaccount environments without creating dedicated IAM user in all these AWS accounts. I am successfully using assume-role consumption and aws cli is perfectly working so I can pull data from all accounts just by setting appropriate AWS_PROFILE and AWS_DEFAULT_REGION where profiles are configured in ~/.aws/config this way

[profile main]
output = json
region = eu-central-1

[profile dev]
output = json
region = eu-central-1
role_arn = arn:aws:iam::xxxxxxxxxx:role/AdminRole
source_profile = main

What I need to do is to create resources in AWS dev account where we have s3 remote state stored as well using IAM user from main account via assume-role approach.

Terragrunt code I am trying to push looks like this, for example, which has variables specifications only and is pulling TF module config from git repo

terragrunt = {
  terraform {
    source = "git::path"
  }

  # Include all settings from the root terraform.tfvars file
  include = {
    path = "${find_in_parent_folders()}"
  }
}

solution_owner                = "Devops"
vpc_cidr_block                = "10.0.1.0/20"
...

It also takes TG configs from parent folders where I have remote state config

terragrunt = {
  # Configure Terragrunt to automatically store tfstate files in an S3 bucket
  remote_state {
    backend = "s3"
    config {
      encrypt        = true
      bucket         = "dev-terraform-state"
      key            = "${path_relative_to_include()}/terraform.tfstate"
      region         = "eu-central-1"
      dynamodb_table = "terraform-locks"
      profile        = "dev"
    }
  }
  # Configure root level variables that all resources can inherit
  terraform {
    extra_arguments "bucket" {
      commands = ["${get_terraform_commands_that_need_vars()}"]
      optional_var_files = [
          "${get_tfvars_dir()}/${find_in_parent_folders("account.tfvars", "ignore")}"
      ]
    }
  }
}

During execution of terragrunt plan I am getting the error pointing to absent creds for reaching remote state s3


[terragrunt] [...] 2019/03/06 12:01:40 Running command: terraform --version
[terragrunt] 2019/03/06 12:01:40 Reading Terragrunt config file at .../core/vpc/terraform.tfvars
[terragrunt] 2019/03/06 12:01:40 WARNING: no double-slash (//) found in source URL /tf/module-aws-vpc.git. Relative paths in downloaded Terraform code may not work.
[terragrunt] 2019/03/06 12:01:40 Cleaning up existing *.tf files in .../core/vpc/.terragrunt-cache/VSGljja7WSjrKw1hwRMSENKXBAA/yAyfAK-S9z7ucSeeqfftLQhM-MA
[terragrunt] 2019/03/06 12:01:40 Downloading Terraform configurations from git::ssh://asdasd.git?ref=pr/12 into .../core/vpc/.terragrunt-cache/VSGljja7WSjrKw1hwRMSENKXBAA/yAyfAK-S9z7ucSeeqfftLQhM-MA using terraform init
[terragrunt] [.../core/vpc] 2019/03/06 12:01:40 Initializing remote state for the s3 backend
[terragrunt] 2019/03/06 12:02:00 Error finding AWS credentials (did you set the AWS_ACCESS_KEY_ID and AWS_SECRET_ACCESS_KEY environment variables?): NoCredentialProviders: no valid providers in chain. Deprecated.
        For verbose messaging see aws.Config.CredentialsChainVerboseErrors
[terragrunt] 2019/03/06 12:02:00 Unable to determine underlying exit code, so Terragrunt will exit with error code 1

I tried to use all 3 approaches described in section "Work with multiple AWS accounts" of main TG repo page with setting AWS_PROFILE env. variable, using sts assume-role etc, but I am stuck on the same stage of init remote s3.

As I see TG supported assume-role approach for quite long time. Is there any caveats or limitations with that? What am I doing wrong?

Please help.

Thank you very much! #

brikis98 commented 5 years ago
  remote_state {
    backend = "s3"
    config {
      encrypt        = true
      bucket         = "dev-terraform-state"
      key            = "${path_relative_to_include()}/terraform.tfstate"
      region         = "eu-central-1"
      dynamodb_table = "terraform-locks"
      profile        = "dev"
    }
  }

Is the profile = "dev" intentional?

jvanwagner commented 5 years ago

I'm getting a similar issue myself except I get this error:

Error configuring the backend "s3": No valid credential sources found for AWS Provider. Please see https://terraform.io/docs/providers/aws/index.html for more information on providing credentials for the AWS Provider

Please update the configuration in your Terraform files to fix this error then run this command again.

JoshiiSinfield commented 5 years ago

@VladimirShushkov @jvanwagner have you seen #616 ? It sounds like you could be experiencing similar.

Cheers, Josh

rafaeloening-barigui commented 5 years ago

+1

brianpham commented 4 years ago

Any updates on this? I am seeing the same issue on my end

khdevel commented 4 years ago

I think I figured it out... I will test it deeply but for now it works! My scenario is as follow:

So I think that is something like above problem plus MFA which increases the complexity... but the solution is quite simple - I had to have a [default] profile in my ~/.aws/credentials which seems to be mandatory for Terragrunt! Of course other profiles are valid but this one is something like a must-have

My ~/.aws/credentials file was as follow:

[default]
aws_access_key_id = AKIA...
aws_secret_access_key = ry5tgFree...

[assumeHelper]
role_arn = arn:aws:iam::123456789012:role/FooRole
mfa_serial = arn:aws:iam::098765432109:mfa/foo.bar
region = eu-central-1
source_profile = default

[mfaAssume]
aws_access_key_id = ASIA...
aws_secret_access_key = wi47gPZR...
aws_session_token = FwoGZXIv...

[default] - contains the data for the IAM User at Management AWS Account [assumeHelper] - it is a "notepad" for the aws sts command, not used in any Terraform's configuration [mfaAssume]- contains the data which was generated by the aws sts command

Then my terragrunt.hcl looked like below which the most important variable is profile = "mfaAssume"

remote_state {
  backend = "s3"
  config = {
    bucket         = "foo-terraform-state"
    dynamodb_table = "foo-terraform-state"
    encrypt        = true
    key            = "terraform.tfstate"
    profile        = "mfaAssume"
    key            = "${path_relative_to_include()}/terraform.tfstate"
    region         = "eu-central-1"
  }
}

This configuration worked properly! I did not have any messages from Terragrunt like

Error finding AWS credentials in file '~/.aws/credentials' (did you set the correct file name and/or profile?): NoCredentialProviders: no valid providers in chain. Deprecated

or

Error finding AWS credentials (did you set the AWS_ACCESS_KEY_ID and AWS_SECRET_ACCESS_KEY environment variables?): NoCredentialProviders: no valid providers in chain. Deprecated.

TEST

To be sure that it is all about the default profile I testes it by:

  1. renaming [default] to some other e.g. [myProfile] - DID NOT WORK
  2. leave the [default] as an empty profile and put its previous credentials to other e.g. [myProfile] - DID NOT WORK

During the tests my variable profile had the same value mfaAssume.

CONCLUSION It seems that for Terragrunt somehow the profile [default] must be present in case we want to use some other. I do not know whether it is because of some relation between [mfaAssume] and [default] or some other reason...

brikis98 commented 4 years ago

Thx for doing the research @khdevel. Under the hood, Terragrunt (and Terraform) both use the AWS Go SDK. Perhaps we're somehow using it wrong, but I'm not sure why it would be have special or different... Note that the AWS Go SDK does have some env vars you may need to set, such as setting AWS_SDK_LOAD_CONFIG to true. Not sure if those make any difference?

estebanz01 commented 3 years ago

👋 I'm having this problem too.

Terragrunt is not picking up the specified AWS profile that lives in ~/.aws/credentials. The problem i'm seeing is that terragrunt is using the IAM Role assigned to my EC2 dev instance and I need to run terragrunt in another AWS Account with the access key/secrets specified in another profile, not default (which is empty). Any ideas?

domenjesenovec commented 3 years ago

I can confirm adding [default] profile in ~/.aws/credentials solved the problem for me, thanks @khdevel

eliz-ol commented 3 years ago

I can confirm the confirmation from @domenjesenovec. It appears that the S3 backend requires the default profile in the plain text (ini style) %USERPROFILE%.aws\credentials file. I tried it using a default profile in the the encrypted (SDK style) credentials file at %USERPROFILE%\AppData\Local\AWSToolkit\RegisteredAccounts.json and it failed with the same error mode.

The AWS Go SDK documentation seems to indicate that they expect the plain text file (about halfway down from the link that @brikis98 provided, under the header "Shared Credentials File"). It seems to me like the Go SDK makes a bunch of assumptions that aren't necessarily reasonable.

artemkozlenkov commented 3 years ago

similar issue, easily being resolved by doing exactly what the msg asks to do, set glob ENVs via export like export AWS_ACCESS_KEY_ID=xxx export AWS_SECRET_ACCESS_KEY=xxx works :+1:

Kiran01bm commented 3 years ago

So my use-case was similar to the one's mentioned above:

  1. Remote state (bucket and table) in a central-type AWS account (more like mgmt/build/deploy account).
  2. Resources to be provisioned in another AWS account.

per @khdevel's update and aws sdk doco link from @brikis98 - The following worked for me.

  1. $HOME/.aws/credentials to have the [default] section with creds required to operate on Remote State Bucket and Table and profile = "default" in the s3 remote_state block.
  2. Pointing to AWS account where the resources needs to be created via AWS_PROFILE
    export AWS_PROFILE=XXX-CREATE-AWS-RESOURCES-HERE-XXXX 

The following does NOT work:

  1. $HOME/.aws/credentials to have the [default] section with creds required to operate on Remote State Bucket and Table and profile = "default" in the s3 remote_state block.
  2. Pointing to AWS account where the resources needs to be created via glob env vars
    export AWS_ACCESS_KEY_ID=XXXX 
    export AWS_SECRET_ACCESS_KEY=XXXX
    export AWS_SESSION_TOKEN=XXXX

    Note: Ends-up with a Error refreshing state: AccessDenied: Access Denied (because: The SDK ends-up using creds from Step 2 during session creation/signing requests to access Remote State Bucket and Table)

jav-12 commented 2 years ago

Is there a solution to not use the default profile name?

The issue is present when you use the assume_role feature with Terragrunt. Using with single account, there is no problem...

DEBU[0000] Assuming IAM role arn:aws:iam::0000000000000:role/pipeline-users with a session duration of 3600 seconds. ERRO[0006] Error finding AWS credentials (did you set the AWS_ACCESS_KEY_ID and AWS_SECRET_ACCESS_KEY environment variables?): NoCredentialProviders: no valid providers in chain. Deprecated. For verbose messaging see aws.Config.CredentialsChainVerboseErrors ERRO[0006] Unable to determine underlying exit code, so Terragrunt will exit with error code 1

lorengordon commented 1 year ago

I've sometimes gotten this error in run-all scenarios with high parallelization, where it seems the backend initialization hits some kind of API rate limit that unfortunately does not return a 429 (depending on how you setup the credential, I'm using an instance profile so I think the error ultimately is coming from the ec2 metadata service). I've worked around it by customizing the terragrunt retries:

retryable_errors = [
  "(?s).*Failed to load state.*tcp.*timeout.*",
  "(?s).*Failed to load backend.*TLS handshake timeout.*",
  "(?s).*Creating metric alarm failed.*request to update this alarm is in progress.*",
  "(?s).*Error installing provider.*TLS handshake timeout.*",
  "(?s).*Error configuring the backend.*TLS handshake timeout.*",
  "(?s).*Error installing provider.*tcp.*timeout.*",
  "(?s).*Error installing provider.*tcp.*connection reset by peer.*",
  "NoSuchBucket: The specified bucket does not exist",
  "(?s).*Error creating SSM parameter: TooManyUpdates:.*",
  "(?s).*app.terraform.io.*: 429 Too Many Requests.*",
  "(?s).*ssh_exchange_identification.*Connection closed by remote host.*",
  "(?s).*Client\\.Timeout exceeded while awaiting headers.*",
  "(?s).*Could not download module.*The requested URL returned error: 429.*",
  "(?s).*Error: NoCredentialProviders: no valid providers in chain.*",
]

The last one appears to catch this error. The rest are the default retryable errors that terragrunt comes with.

tamsky commented 1 year ago

Is this the same error?

ERRO[0001] Error finding AWS credentials (did you set the AWS_ACCESS_KEY_ID and AWS_SECRET_ACCESS_KEY environment variables?): MissingEndpoint: 'Endpoint' configuration is required for this service
ERRO[0001] Unable to determine underlying exit code, so Terragrunt will exit with error code 1

If so, it was caused (for me) by a lack of a setting for AWS_REGION (or AWS_DEFAULT_REGION)

d4n13lbc commented 10 months ago

In our case commenting/removing the setup of the default profile from here did the trick:

terragrunt.hcl in the root of the repository

# Configure Terragrunt to automatically store tfstate files in an S3 bucket
remote_state {
  backend = "s3"
  config = {
    # profile        = local.aws_profile  <- Here!
    ...
    }
  }
  generate = {
    path      = "backend.tf"
    if_exists = "overwrite_terragrunt"
  }
}
AdamDomagalsky commented 4 months ago

Same as @d4n13lbc Even though profile exists, it should not be mentioned in the config