hashicorp / vault

A tool for secrets management, encryption as a service, and privileged access management
https://www.vaultproject.io/

Vault STS region not respected for IAM auth login #15088

Open rcousens opened 2 years ago

rcousens commented 2 years ago

Describe the bug No matter what configuration I provide, I cannot get the Vault server to use an STS client for a region other than us-east-1 when doing an IAM auth login through the Vault CLI.

To Reproduce Steps to reproduce the behavior:

  1. Set up a vault server in AWS EC2 that has an appropriate instance profile with permissions as specified in the IAM AWS auth documentation, turn on debug logging, init, unseal etc. Set environment variables for server with AWS_REGION and AWS_DEFAULT_REGION as ap-southeast-2
  2. Use vault CLI to connect to said server and execute following commands in local CLI environment
  3. Run vault auth enable aws
  4. Run vault write auth/aws/config/client sts_endpoint=https://sts.ap-southeast-2.amazonaws.com sts_region=ap-southeast-2
  5. Run vault write auth/aws/role/test auth_type=iam bound_iam_principal_arn="arn:aws:sts::xxxxxxxx:assumed-role/SSO_Admin_Role/*"
  6. Get creds (AWS_SECRET_ACCESS_KEY etc) for the above role in your local environment where the CLI is being used, set AWS_REGION and AWS_DEFAULT_REGION to same as the sts_region
  7. Run vault login -token-only -method=aws role=test region=ap-southeast-2; the CLI client hangs until the context deadline is exceeded
  8. Check server logs, see: vault[5370]: 2022-04-19T04:43:28.014Z [DEBUG] auth.aws.auth_aws_0305af05: no cached client for region us-east-1 and stsRole

Expected behavior The region in the server debug logs for the STS client should be ap-southeast-2; us-east-1 is not valid in my environment (no network access).

Environment:

  * Server version: 1.10.0
  * CLI version: 1.10.0
  * Server OS: Ubuntu AMD64

Vault server configuration file(s):

ui = true
seal "awskms" {
  kms_key_id = "xxxxxx"
  region     = "ap-southeast-2"

}

listener "tcp" {
  address         = "0.0.0.0:8200"
  cluster_address = "0.0.0.0:8201"
  tls_cert_file   = "/opt/vault/tls/vault.crt"
  tls_key_file    = "/opt/vault/tls/vault.key"
}

storage "dynamodb" {
  ha_enabled = "true"
  region = "ap-southeast-2"
  table  = "vault-cluster-data"
}

cluster_addr  = "https://172.19.64.133:8201"

log_level = "Debug"
region = "ap-southeast-2" # pretty sure this isn't considered

Additional context Strangely enough, when I specify a different region on the CLI, I get an error: vault login -token-only -method=aws role=test region="us-east-1"

Error authenticating: Error making API request.

URL: PUT https://127.0.0.1:8200/v1/auth/aws/login
Code: 400. Errors:

* error making upstream request: received error code 403 from STS: <ErrorResponse xmlns="https://sts.amazonaws.com/doc/2011-06-15/">
  <Error>
    <Type>Sender</Type>
    <Code>SignatureDoesNotMatch</Code>
    <Message>Credential should be scoped to a valid region. </Message>
  </Error>
  <RequestId>d5d82af0-e575-4b78-af77-fe2b971c9484</RequestId>
</ErrorResponse>
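A likely reading of this 403: AWS Signature Version 4 embeds a credential scope of the form date/region/service/aws4_request in the signed request, and a regional STS endpoint rejects a request whose scope names a different region. The Go sketch below only illustrates that scope format. The function names are hypothetical and nothing here is taken from Vault or the AWS SDK.

```go
package main

import (
	"fmt"
	"strings"
)

// credentialScope builds a SigV4 credential scope string in the form
// date/region/service/aws4_request. The date is expected as YYYYMMDD.
func credentialScope(date, region, service string) string {
	return strings.Join([]string{date, region, service, "aws4_request"}, "/")
}

// scopeRegion extracts the region component from a credential scope.
func scopeRegion(scope string) (string, error) {
	parts := strings.Split(scope, "/")
	if len(parts) != 4 {
		return "", fmt.Errorf("malformed credential scope %q", scope)
	}
	return parts[1], nil
}

func main() {
	// The CLI signed for us-east-1, while the server is configured to
	// forward the request to the ap-southeast-2 STS endpoint.
	scope := credentialScope("20220419", "us-east-1", "sts")
	endpointRegion := "ap-southeast-2"

	signedRegion, err := scopeRegion(scope)
	if err != nil {
		panic(err)
	}
	fmt.Println("credential scope:", scope)
	fmt.Println("scope region matches endpoint region:", signedRegion == endpointRegion)
}
```

Under this reading, the region passed to vault login has to match the region of the configured sts_endpoint, because the signature is produced on the client side before the request ever reaches the server.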

rcousens commented 2 years ago

I've done some further investigation by recreating the problem in a less restricted environment. The log entry appears to be misleading: it seems to log the region before the client is actually used, and the region is subsequently set from the sts_endpoint and related parameters. I still think the log message is wrong, but it is probably not related to my problem.

I think my issue is more likely caused by the fact that I can't access other AWS APIs, such as iam.amazonaws.com, from within my environment (there's only an STS endpoint currently).

rcousens commented 2 years ago

I've also found out through my testing that Vault can't use the sts format of an ARN with assumed-role in it. Is this true? I have to specify the principal ARN of the actual role that is assumed?

The downside of this is I lose user information in an SSO federated login environment.

rcousens commented 2 years ago

So, I've answered my own questions here by doing a bit of investigation with tcpdump.

  1. Vault using AWS auth login also needs access to query iam.amazonaws.com to both add a role and then complete a login. IAM currently has no endpoint service in AWS, so Vault needs outbound internet network access, at least to iam.amazonaws.com
  2. The log message is indeed incorrect. While Vault doesn't have a cached client for us-east-1, it ultimately sets the client to sts_region and uses the sts_endpoint
  3. You can't use an assumed-role ARN for the principal with AWS auth login; you must use the ARN of the originating role. However, setting identity aliases to full_arn and then adding token metadata so that the client_arn is logged reveals the originating SSO user in the audit logs, for example:
vault write auth/aws/config/identity iam_alias=full_arn iam_metadata=account_id,auth_type,canonical_arn,client_arn

So I think there is a bug here regarding the logs: Vault should read the sts_region first and try to locate a cached client for the specified region, instead of using the default us-east-1 and then overriding the region. At least that's my naive take?
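The ordering suggested above could look roughly like the Go sketch below: resolve the region from the configured sts_region first, then use that value for both the cache lookup and the log line. This is a minimal illustration under assumptions. The types clientConfig, stsClient and clientCache are hypothetical stand-ins and do not reflect Vault's actual code.

```go
package main

import "fmt"

// defaultRegion mirrors the fallback region seen in the debug log.
const defaultRegion = "us-east-1"

// clientConfig is a hypothetical stand-in for the values written to
// auth/aws/config/client; it is not Vault's real type.
type clientConfig struct {
	STSRegion   string
	STSEndpoint string
}

// stsClient is a placeholder for an AWS STS client.
type stsClient struct {
	Region   string
	Endpoint string
}

// clientCache keys cached clients by region and then by STS role.
type clientCache map[string]map[string]*stsClient

// resolveRegion prefers the configured sts_region and only falls back to
// the default when nothing is configured.
func resolveRegion(cfg clientConfig) string {
	if cfg.STSRegion != "" {
		return cfg.STSRegion
	}
	return defaultRegion
}

// lookup resolves the region before touching the cache, so the cache key
// and the log line both name the region the client will actually use.
func (c clientCache) lookup(cfg clientConfig, stsRole string) (*stsClient, string) {
	region := resolveRegion(cfg)
	if client, ok := c[region][stsRole]; ok {
		return client, region
	}
	fmt.Printf("no cached client for region %s and stsRole %q\n", region, stsRole)
	client := &stsClient{Region: region, Endpoint: cfg.STSEndpoint}
	if c[region] == nil {
		c[region] = make(map[string]*stsClient)
	}
	c[region][stsRole] = client
	return client, region
}

func main() {
	cache := clientCache{}
	cfg := clientConfig{
		STSRegion:   "ap-southeast-2",
		STSEndpoint: "https://sts.ap-southeast-2.amazonaws.com",
	}
	client, region := cache.lookup(cfg, "")
	fmt.Println(region, client.Endpoint)
}
```

With this ordering a second lookup for the same configuration hits the cache under ap-southeast-2, and us-east-1 only appears when no sts_region is configured.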

aeekayy commented 2 years ago

@rcousens It looks like we need to add support for regional STS endpoints to Vault. While debugging, I found this comment, which states that STS endpoints are global. That comment is outdated, since regional STS endpoints were released by AWS in 2019.

I can take a look at addressing this. Because the code assumes we're using global STS, the region argument doesn't make it to the method that makes the call to AWS.
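One way to picture the fix is a Go sketch in which the region from the login arguments is carried all the way to the code that picks the STS endpoint. Everything here is an assumption made for illustration: loginArgs, stsCall and planSTSCall are hypothetical names, and the endpoint pattern covers only the standard aws partition (other partitions use different domain suffixes).

```go
package main

import "fmt"

// globalSTSEndpoint is the legacy global STS endpoint.
const globalSTSEndpoint = "https://sts.amazonaws.com"

// loginArgs is a hypothetical stand-in for the arguments of an IAM login;
// it is not a type from Vault.
type loginArgs struct {
	Role   string
	Region string
}

// stsCall describes where the identity request is sent and which region
// is used to sign it.
type stsCall struct {
	Endpoint      string
	SigningRegion string
}

// regionalSTSEndpoint returns the regional STS endpoint for the standard
// "aws" partition.
func regionalSTSEndpoint(region string) string {
	return fmt.Sprintf("https://sts.%s.amazonaws.com", region)
}

// planSTSCall carries the region from the login arguments through to the
// call. With no region it falls back to the global endpoint, which is
// signed for us-east-1.
func planSTSCall(args loginArgs) stsCall {
	if args.Region == "" {
		return stsCall{Endpoint: globalSTSEndpoint, SigningRegion: "us-east-1"}
	}
	return stsCall{
		Endpoint:      regionalSTSEndpoint(args.Region),
		SigningRegion: args.Region,
	}
}

func main() {
	fmt.Printf("%+v\n", planSTSCall(loginArgs{Role: "test", Region: "ap-southeast-2"}))
	fmt.Printf("%+v\n", planSTSCall(loginArgs{Role: "test"}))
}
```

Keeping the endpoint and the signing region in one value makes it harder for the two to drift apart, which is the mismatch that the earlier SignatureDoesNotMatch error points to.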