hashicorp / terraform-provider-boundary

Manage Boundary's identity-based access controls for resources provisioned with Terraform. This provider is maintained internally by the HashiCorp Boundary team.
https://registry.terraform.io/providers/hashicorp/boundary/latest
Mozilla Public License 2.0

Boundary provider needs assume_role #62

Open jorhett opened 3 years ago

jorhett commented 3 years ago

Terraform Version

Any

Affected Resource(s)

None

Terraform Configuration Files

provider "boundary" {
  addr             = "https://${data.terraform_remote_state.boundary.outputs.api_url}:9200/"
  recovery_kms_hcl = <<EOT
kms "awskms" {
    purpose    = "recovery"
    key_id     = "global_root"
    region     = "${var.region}"
    kms_key_id = "${data.terraform_remote_state.boundary.outputs.kms_recovery_key_id}"
}
EOT
}

Expected Behavior

What should have happened?

The provider needs to be able to assume a different role to access the KMS resource.

Actual Behavior

What actually happened?

Error: error reading wrappers from "recovery_kms_hcl": Error configuring kms: error fetching AWS KMS wrapping key information: NotFoundException: Key 'arn:aws:kms:us-east-2:0000000000:key/abcd1234-12ba-34dc-56fe-98765fedcba' does not exist

Steps to Reproduce

Please list the steps required to reproduce the issue, for example:

  1. Have a configuration that utilizes role assumption to access specific resources, with an aws provider configuration as documented at https://registry.terraform.io/providers/hashicorp/aws/latest/docs#assume-role
  2. terraform plan

Important Factoids

When using the Terraform provider you are running from a workstation or CI system, not from the instance node. While it would be possible to create a CI node in EC2 which has a god-like instance profile, this is a lot less secure than specific role assumption rights given to specific jobs.

It can be made to work by setting environment variables, but those variables override the AWS provider's role assumption configuration, as documented at https://registry.terraform.io/providers/hashicorp/aws/latest/docs#authentication, which breaks all AWS resources in the same plan.

The AWS provider is based on the same SDK, so it would have the same abilities if you added the same attributes to the Boundary provider schema.

kms "awskms" {
  purpose = "recovery"
  region  = "us-east-1"
  assume_role {
    role_arn = "arn:aws:iam::0000000000:role/iam-identity-foobar"
  }
}

References

Too many to list in hashicorp/terraform-provider-aws

malnick commented 3 years ago

@jorhett I want to make sure you're not conflating the Boundary configuration HCL with Terraform HCL. In your comment below, it sounds like you're comparing the AWS Terraform provider with the Boundary configuration HCL:

The AWS provider is based on the same SDK, so it would have the same abilities if you added the same attributes to the Boundary provider schema.

kms "awskms" {
  purpose = "recovery"
  region  = "us-east-1"
  assume_role {
    role_arn = "arn:aws:iam::0000000000:role/iam-identity-foobar"
  }
}

Though the Boundary configuration uses HCL, it's not Terraform.

Secondly, I'm not following the logic here:

When using the Terraform provider you are running from a workstation or CI system, not from the instance node. While it would be possible to create a CI node in EC2 which has a god-like instance profile, this is a lot less secure than specific role assumption rights given to specific jobs.

An instance profile is actually assuming a role under the hood, the trust policy associated with the instance profile allows this behavior. Is there some reason you can't reduce the privileges of the role associated with the instance profile?

jorhett commented 3 years ago

@malnick Everything I wrote is about your Terraform provider for Boundary. Nothing in here is anything other than Terraform.

An instance profile is actually assuming a role under the hood, the trust policy associated with the instance profile allows this behavior. Is there some reason you can't reduce the privileges of the role associated with the instance profile?

Dozens of reasons, all of which are covered extensively in the discussions that led to the role assumption being added to Terraform's AWS provider. TL;DR this would require running one node just to do deployments for each and every Boundary cluster, which would be 16x nodes ATM and probably double that soon. It would also require sequencing the CI job across dozens of nodes since different parts of the job use different profiles.

Even just writing the CI logic to ensure the right node with the right AWS profile is selected for each job would be seriously non-trivial and would require CI node deployment management that we don't have today... because the Terraform AWS role assumption code was designed exactly to solve this problem.

jefferai commented 3 years ago

Note that the provider doesn't actually construct any of the KMS stuff on its own. It just passes the HCL to a Boundary helper which itself just formats it and passes it to https://github.com/hashicorp/go-kms-wrapping. So this issue will not result in any actionable work within this repo (other than pulling in updated deps).

andybritton commented 3 years ago

I'm facing the same issue as Jo. With multi-account AWS and SSO, you assume a role so that you don't have to create user accounts in X number of AWS accounts. When you supply the HCL for awskms, it looks for that key using the AWS_PROFILE you have set, so in my case it's looking for a KMS ARN in a completely different account from the one where Boundary is running, which does its KMS lookup via an instance profile. I believe adding assume_role to the provider would solve this. I've also hit this using the CLI, so with either method, per the docs, you won't be able to authenticate unless you're a user in the same account where Boundary lives.

jorhett commented 3 years ago

So this issue will not result in any actionable work within this repo (other than pulling in updated deps).

It's pretty hard to understand what you mean by this response. I want to interpret it to mean that you grasp we just need to pass the assume_role option to the underlying providers, which yes that is my thought as well. But someone else who read it thinks it was a rejection statement...? Can you clarify what you mean @jefferai ?

Is it a request that I open a related issue for parsing the additional block in another repo?

jefferai commented 3 years ago

What I meant is that the HCL is parsed by a different library that ultimately seeds that into the configuration at https://www.github.com/hashicorp/go-kms-wrapping/tree/master/wrappers/awskms/awskms.go

So the kms configuration is opaque to the provider. The only thing the provider could do is update deps when it's fixed elsewhere.

I don't know much about AWS authentication, however if someone that does wants to tackle this, the file I linked above is where support would need to be added.

samuelarogbonlo commented 3 years ago

Hello @jefferai @jorhett, I have a similar error. How do I handle it? Mine happens when I run terraform apply, and the error shown is:

Error: error reading wrappers from "recovery_kms_hcl": Error configuring kms: error fetching AWS KMS wrapping key information: NotFoundException: Key 'arn:aws:kms:us-west-2:977076650775:key/e44f0d8f-ba91-4d3a-b8a2-c8f6cc8b515b' does not exist

I have checked my AWS account and there is no such key, so why the error?

malnick commented 3 years ago

It's been a while since I've done AWS administration, and maybe I'm not reading this correctly, but isn't it possible to assume the role of another account through the instance profile itself? I believe you can do this by configuring the trust policy and IAM instance profile per this guide: https://docs.aws.amazon.com/IAM/latest/UserGuide/id_roles_use_switch-role-ec2.html

Once you do that, the provider should get an access key ID, secret key, and session token through the instance metadata service on the host where Terraform is being run.

On the other hand, if you're running the provider from a host outside of EC2, the best workaround for this is using the AWS CLI to set the correct env vars using sts assume role before running TF. However, I'm pretty sure this should "just work" if you can run TF from inside EC2 using an instance profile as outlined in that guide.

samuelarogbonlo commented 3 years ago

For everyone having this issue: just confirm that your IAM roles are correct and you are good to go.

jorhett commented 3 years ago

@jefferai Note that the provider doesn't actually construct any of the KMS stuff on its own. It just passes the HCL to a Boundary helper which itself just formats it and passes it to https://github.com/hashicorp/go-kms-wrapping

Yes, and go-kms-wrapping supports aliased AWS providers. But your code doesn't pass that along. No upstream change is going to fix a limitation in the syntax your provider accepts.

@malnick but isn't it possible to assume the role of another account through the instance profile itself? ...snip... per this guide: https://docs.aws.amazon.com/IAM/latest/UserGuide/id_roles_use_switch-role-ec2.html

The way to assume a role in Terraform is to provide an aliased provider with the assumed role declared. Here's Hashicorp's documentation for doing this:
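For reference, a minimal sketch of an aliased AWS provider with an assumed role, following the AWS provider's assume-role documentation linked earlier in this issue. The alias name, region, role ARN, and key alias are placeholders rather than values from this thread.

```hcl
# Default provider: uses whatever credentials the workstation or CI job has.
provider "aws" {
  region = "us-east-2"
}

# Aliased provider: same base credentials, but assumes a role in another account.
provider "aws" {
  alias  = "boundary_account" # hypothetical alias name
  region = "us-east-2"

  assume_role {
    role_arn = "arn:aws:iam::000000000000:role/iam-identity-foobar" # placeholder ARN
  }
}

# A resource or data source opts into the assumed role by selecting the alias.
data "aws_kms_key" "recovery" {
  provider = aws.boundary_account
  key_id   = "alias/boundary-recovery" # placeholder key alias
}
```

Each AWS resource picks its role through the provider argument, so one plan can touch several accounts without changing environment variables.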

But the Boundary provider doesn't accept this: it requires the use of AWS_PROFILE environment variable, which overrides the assumed role definition, or the placement of the role to be used in the main aws provider configuration. This makes it impossible to create clusters of workers in different accounts, since the same role must be used for every Boundary resource.

This is exactly what I'm asking you to support: start with Hashicorp's docs on the AWS provider setup for role assumption, and then make it possible to use that configuration in your provider.

ToROxI commented 3 years ago

Any updates on this issue? CC: @malnick

oboukili commented 2 years ago

The same issue applies to GCP serviceaccount impersonation with gcpckms.

jorhett commented 2 years ago

Seems like this was added https://www.boundaryproject.io/docs/configuration/kms/awskms#role_arn

...but there's no explanation of what value should be in web_identity_token_file and it doesn't appear to work. I've posted details in this thread https://discuss.hashicorp.com/t/how-to-assume-role-to-read-kms-keys/19506/9
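As a rough sketch, this is how the two documented parameters would presumably be placed inside recovery_kms_hcl, assuming the provider passes the block through unchanged. The address, key ID, role ARN, and token path are placeholders, and the correct value for web_identity_token_file is exactly the open question here.

```hcl
provider "boundary" {
  addr             = "https://boundary.example.com:9200/" # placeholder address
  recovery_kms_hcl = <<EOT
kms "awskms" {
  purpose    = "recovery"
  region     = "us-east-2"
  kms_key_id = "00000000-0000-0000-0000-000000000000" # placeholder key ID

  # Parameters named in the awskms docs linked above; values are placeholders.
  role_arn                = "arn:aws:iam::000000000000:role/iam-identity-foobar"
  web_identity_token_file = "/path/to/token" # unclear what belongs here
}
EOT
}
```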

jorhett commented 2 years ago

mwalling poked into the code and figured out that the role assumption assumes a SAML web assignment https://github.com/hashicorp/go-secure-stdlib/pull/33

The SDK has a role assumption provider (which the web provider is actually a child class of), but there was no code to implement it. I created hashicorp/go-secure-stdlib#33 to show how easy the implementation is, and hopefully to convince @jefferai, @malnick, etc. to implement it.

mwalling commented 2 years ago

@jorhett thanks for running with that! (@cf-mwalling is my sock puppet work account)

yongzhang commented 9 months ago

Any updates on this? I think most people run Terraform outside of EC2 instances.

jefferai commented 9 months ago

There was a significant update to the underlying awsutil lib used; I'm not sure if that got integrated into the provider yet. @psekar do you have any further state on this?

aiqueneldar commented 9 months ago

I've come across this as well. We have one AWS management account where you log in, but we store all resources in separate per-environment accounts, which we access by assuming roles in those accounts.

I'm having the same problem: the Boundary Terraform provider won't correctly assume a role to access the awskms key stored in a separate account, so I can't provision Boundary.

I've also tried keeping the KMS key in the same account as the Boundary controller I'm trying to provision, but the provider still isn't able to retrieve the KMS recovery key, so I can't configure the first Boundary auth method.

We also run Terraform from our workstations rather than on EC2 machines. I've seen mentions in comments and in code that this should now work through EC2, but that is not how our environment is set up.

So I am also very curious what the timescale and roadmap look like for getting this fixed, so that the Boundary provider has functionality similar to the AWS provider's assume_role, which works really well there.