hashicorp / terraform-provider-vault

Terraform Vault provider
https://www.terraform.io/docs/providers/vault/
Mozilla Public License 2.0

[Bug]: vault_azure_access_credentials with validate_creds provides bad credentials intermittently #2079

Open F21 opened 10 months ago

F21 commented 10 months ago

Terraform Core Version

1.6.3

Terraform Vault Provider Version

3.21.0

Vault Server Version

1.15.0

Affected Resource(s)

vault_azure_access_credentials

Expected Behavior

The credentials returned by vault_azure_access_credentials should be guaranteed to work if validate_creds is set to `true`.

Actual Behavior

The credentials fail intermittently, and we often need to retry the plan or apply to get it to work. The failures occur with both the azuread and azurerm providers.

Relevant Error/Panic Output Snippet

╷
│ Error: building client: unable to obtain access token: clientCredentialsToken: received HTTP status 401 with response: {"error":"invalid_client","error_description":"AADSTS7000215: Invalid client secret provided. Ensure the secret being sent in the request is the client secret value, not the client secret ID, for a secret added to app 'REDACTED'. Trace ID: REDACTED Correlation ID: REDACTED Timestamp: 2023-11-06 07:47:49Z","error_codes":[7000215],"timestamp":"2023-11-06 07:47:49Z","trace_id":"REDACTED","correlation_id":"REDACTED","error_uri":"https://login.microsoftonline.com/error?code=7000215"}
│
│   with provider["registry.terraform.io/hashicorp/azuread"].prod1,
│   on main.tf line 84, in provider "azuread":
│   84: provider "azuread" {

Terraform Configuration Files

data "vault_azure_access_credentials" "prod1" {
  role           = "prod1"
  backend        = "azure-prod1"
  validate_creds = true
}

provider "azuread" {
  alias         = "prod1"
  client_id     = data.vault_azure_access_credentials.prod1.client_id
  client_secret = data.vault_azure_access_credentials.prod1.client_secret
  tenant_id     = "REDACTED"
}

Steps to Reproduce

  1. Run terraform apply multiple times until it fails due to bad credentials.
  2. Run terraform apply until it succeeds.

Debug Output

No response

Panic Output

No response

Important Factoids

No response

References

No response

Would you like to implement a fix?

None

Nathan8575 commented 10 months ago

I'm also getting this issue, but with different versions. With Terraform 1.5.7 and provider version 3.21, it works fine. However, when I switch the provider version to 3.22, the returned creds fail intermittently.

Edit: Actually maybe my issue is slightly different but related.

I had the following set:

data "vault_azure_access_credentials" "aad" {
  backend                   = var.vault_backend
  role                      = var.vault_role
  validate_creds            = true
  num_sequential_successes  = 10
  num_seconds_between_tests = 10
}

Prior to provider version 3.22, it would allow time for consistency as follows: [image]

But in version 3.22 it returns them immediately without any pause, so they fail because not enough time has passed. Validation is not working as it should.
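
As a possible stopgap for the regression described above, the provider could be pinned to the release where the validation delay was still observed. This is only a sketch: the `hashicorp/vault` registry source and the exact patch version `3.21.0` are assumptions (the comment only mentions 3.21), and the original report shows intermittent failures on 3.21.0 as well, so pinning may reduce failures rather than remove them.

```hcl
terraform {
  required_providers {
    vault = {
      source = "hashicorp/vault"
      # Assumed patch version; the comment above only says "3.21".
      version = "3.21.0"
    }
  }
}
```

After changing the constraint, `terraform init -upgrade` is needed so that Terraform re-selects the provider version.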

fairclothjm commented 10 months ago

Unfortunately, this is a known issue with Azure since it is eventually consistent. We are looking into ways of solving this but for now we have no way of ensuring the credentials are propagated across all Azure data centers.

fairclothjm commented 10 months ago

We have observed that service principal credentials propagate throughout the Azure data centers faster than application credentials, which leads to fewer delays and consistency issues. If possible, a workaround would be to always use dynamic service principals, i.e. don't provide application_object_id but instead use azure_roles when creating the role in Vault.
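
For anyone managing the Vault role through Terraform, the workaround above might look roughly like the following. This is a sketch rather than a verified configuration: it assumes the role is defined with the provider's `vault_azure_secret_backend_role` resource, and the role name, Azure role name, and subscription scope are placeholders. Only the `azure-prod1` backend name comes from the report.

```hcl
resource "vault_azure_secret_backend_role" "dynamic_sp" {
  backend = "azure-prod1"   # mount path taken from the report
  role    = "prod1-dynamic" # hypothetical role name

  # No application_object_id is set, so Vault creates a dynamic
  # service principal for each credential request.
  azure_roles {
    role_name = "Reader" # placeholder Azure role
    scope     = "/subscriptions/00000000-0000-0000-0000-000000000000" # placeholder scope
  }
}
```

The data source would then reference this role via `role = "prod1-dynamic"`, keeping `validate_creds = true`.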

F21 commented 10 months ago

We're currently using pre-created service principals, because dynamic service principals don't work with API calls for Azure AD (only Azure RM).
