hashicorp / terraform-provider-google

Terraform Provider for Google Cloud Platform
https://registry.terraform.io/providers/hashicorp/google/latest/docs
Mozilla Public License 2.0

google_client_config data-source fails to issue token : `oauth2/google: invalid response from Secure Token Server: Post "https://sts.googleapis.com/v1/token": context canceled` #18774

Open alexsomesan opened 1 month ago

alexsomesan commented 1 month ago

Terraform Version & Provider Version(s)

Terraform v1.9.2 on arm64

Affected Resource(s)

google_client_config data-source

Terraform Configuration

provider "google" {
#    credentials = "adc.json" # uncomment to trigger error
}

data "google_client_config" "current" {}

resource "local_file" "token" {
    filename = "google_token"
    content = data.google_client_config.current.access_token
}

Debug Output

No response

Expected Behavior

Apply should succeed and a token should be written to the google_token file in the working directory.

Actual Behavior

Planning fails with the following errors:

data.google_client_config.current: Reading...
data.google_client_config.current: Read complete after 0s
local_file.token: Refreshing state... [id=83053c275a52dbd8f9b4a73c3d9e529e18523939]

Planning failed. Terraform encountered an error while generating this plan.

╷
│ Error: Invalid Attribute Combination
│
│   with local_file.token,
│   on main.tf line 7, in resource "local_file" "token":
│    7: resource "local_file" "token" {
│
│ No attribute specified when one (and only one) of [content,sensitive_content,content_base64] is required
╵
╷
│ Error: Invalid Attribute Combination
│
│   with local_file.token,
│   on main.tf line 7, in resource "local_file" "token":
│    7: resource "local_file" "token" {
│
│ No attribute specified when one (and only one) of [content,content_base64,source] is required
╵
╷
│ Error: Invalid Attribute Combination
│
│   with local_file.token,
│   on main.tf line 7, in resource "local_file" "token":
│    7: resource "local_file" "token" {
│
│ No attribute specified when one (and only one) of [content,sensitive_content,source] is required
╵
╷
│ Error: Invalid Attribute Combination
│
│   with local_file.token,
│   on main.tf line 9, in resource "local_file" "token":
│    9:     content = data.google_client_config.current.access_token
│
│ No attribute specified when one (and only one) of [sensitive_content,content_base64,source] is required

Steps to reproduce

  1. Authenticate to GCP for ADC with gcloud auth application-default login
  2. terraform apply should succeed and produce a token
  3. Copy ADC creds to local file: cp ~/.config/gcloud/application_default_credentials.json adc.json
  4. Uncomment the credentials attribute on the provider block
  5. terraform apply should fail with the error shown above

Important Factoids

This doesn't seem to be specific to ADC credentials. I was able to reproduce with workload identity credentials as well.

References

No response

BBBmau commented 1 month ago

> Copy ADC creds to local file: cp ~/.config/gcloud/application_default_credentials.json adc.json

The access_token from google_client_config is empty because the credentials path points at application_default_credentials.json. Ideally you would use a service_account_key.json file; this is mentioned in the provider docs.

It would be nice to include an error output informing the user that a service_account_key is not being used, which results in data.google_client_config.current.access_token being returned empty.

alexsomesan commented 1 month ago

OK, I guess ADC isn't the optimal reproduction approach. However, this issue initially occurred when using workload identity federation, which is indeed based on service accounts.

The value of credentials in that case looks like this:

    credentials = jsonencode(
      {
        "type": "external_account",
        "audience": var.gcp_audience,
        "subject_token_type": "urn:ietf:params:oauth:token-type:jwt",
        "token_url": "https://sts.googleapis.com/v1/token",
        "credential_source": {
          "file": var.identity_token_gcp
        },
        "service_account_impersonation_url": format("https://iamcredentials.googleapis.com/v1/projects/-/serviceAccounts/%s:generateAccessToken", var.gcp_service_account_email)
      }
    )

These credentials do work correctly for creating resources such as a GKE cluster, but not for issuing the access_token via the mentioned datasource.

melinath commented 1 month ago

I've looked into this some more.

Setting credentials to the application default credentials file (or a copy) works just fine for me. Using a service_account_key also works fine.

However, I can reproduce the error if I have set up impersonation of a service account AND the credentials I'm using don't actually have permissions to impersonate that service account. That results in an error like:

{
  "error": {
    "code": 403,
    "message": "Permission 'iam.serviceAccounts.getAccessToken' denied on resource (or it may not exist).",
    "status": "PERMISSION_DENIED",
    "details": [
      {
        "@type": "type.googleapis.com/google.rpc.ErrorInfo",
        "reason": "IAM_PERMISSION_DENIED",
        "domain": "iam.googleapis.com",
        "metadata": {
          "permission": "iam.serviceAccounts.getAccessToken"
        }
      }
    ]
  }
}

This error will get triggered as part of google_client_config setting its access token but for some reason that is not actually being surfaced to users - instead, the access token is just left empty (resulting in the reported error.)
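When checking debug logs for this failure, the useful field in the error body is the denied permission under `details[].metadata`. A minimal Go sketch of pulling it out follows; `apiError` and `deniedPermission` are hypothetical names chosen for illustration and are not part of the provider, and the struct covers only the fields visible in the response quoted above.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// apiError mirrors only the fields that appear in the error body quoted above.
type apiError struct {
	Error struct {
		Code    int    `json:"code"`
		Message string `json:"message"`
		Status  string `json:"status"`
		Details []struct {
			Type     string            `json:"@type"`
			Reason   string            `json:"reason"`
			Domain   string            `json:"domain"`
			Metadata map[string]string `json:"metadata"`
		} `json:"details"`
	} `json:"error"`
}

// deniedPermission returns the permission named in the first
// IAM_PERMISSION_DENIED detail, or "" if the body has no such detail.
func deniedPermission(body []byte) (string, error) {
	var e apiError
	if err := json.Unmarshal(body, &e); err != nil {
		return "", fmt.Errorf("decoding error body: %w", err)
	}
	for _, d := range e.Error.Details {
		if d.Reason == "IAM_PERMISSION_DENIED" {
			return d.Metadata["permission"], nil
		}
	}
	return "", nil
}

// sampleBody is the response quoted in the comment above.
const sampleBody = `{
  "error": {
    "code": 403,
    "message": "Permission 'iam.serviceAccounts.getAccessToken' denied on resource (or it may not exist).",
    "status": "PERMISSION_DENIED",
    "details": [
      {
        "@type": "type.googleapis.com/google.rpc.ErrorInfo",
        "reason": "IAM_PERMISSION_DENIED",
        "domain": "iam.googleapis.com",
        "metadata": {
          "permission": "iam.serviceAccounts.getAccessToken"
        }
      }
    ]
  }
}`

func main() {
	perm, err := deniedPermission([]byte(sampleBody))
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Println("denied permission:", perm)
}
```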

@alexsomesan can you double-check that the primary account the creds are for has permissions to impersonate the relevant service account? And/or test without impersonation? @SarahFrench this is a plugin-framework resource - any chance you happen to know offhand what might be going on here in terms of the error not being surfaced?

alexsomesan commented 1 month ago

I can double check the permissions of my impersonated service account tomorrow and report back.

SarahFrench commented 3 weeks ago

> @SarahFrench this is a plugin-framework resource - any chance you happen to know offhand what might be going on here in terms of the error not being surfaced?

I just took a look into this:

Here's the relevant code in the data source:

    token, err := d.providerConfig.TokenSource.Token()
    if err != nil {
        diags.AddError("Error setting access_token", err.Error())
        return
    }
    data.AccessToken = types.StringValue(token.AccessToken)

The data source accesses the TokenSource (oauth2.TokenSource type) in the provider config. This token source is obtained, at the time of configuring the provider, from the oauth2/google.Credentials value that's returned from the function that processes all the auth-related provider config arguments.

So in the snippet above we don't control the implementation of TokenSource or Token(), and so the only opportunity for error handling is in the error handling shown.

It could be that an error is being swallowed while creating the oauth2/google.Credentials value that the token source is derived from. The waters are muddied by the fact that, when the provider was muxed, there was a parallel implementation of that logic (the original SDK implementation of GetCredentials and the plugin-framework implementation of GetCredentials). We'd get some clarity by checking whether the original problem persists with v4.59.0 of the Google provider: that version includes the muxing, but the google_client_config data source is still implemented with the SDK, so if the problem is still present there we'll know the plugin framework isn't relevant.

See https://github.com/hashicorp/terraform-provider-google/issues/18774#issuecomment-2293519230


Separate thought: if service account impersonation is desired but is only defined in the JSON of the credentials file, I'm not sure the provider can handle that. Maybe the impersonate_service_account field also needs to be set? I don't fully understand it, but when we get the oauth2/google.Credentials object we set options that reflect the desire to impersonate a service account. I don't know whether defining this in the credentials JSON is functionally equivalent to setting this provider-level field.

SarahFrench commented 3 weeks ago

I started playing with a debugger and found that an error is swallowed: it looks like it's being handled in the snippet above, but it's not actually returned to TF core. See: https://github.com/GoogleCloudPlatform/magic-modules/pull/11470/commits/28434ad45845cdd02d92b4c8cf968f53f95076a2

This PR https://github.com/GoogleCloudPlatform/magic-modules/pull/11470 might close this issue
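As a general illustration of how an error can look handled and still never reach the caller, the sketch below shows one plausible shape of the problem: the diagnostic is appended to a collection that is not the one returned in the response. This is an assumption made for illustration, not a description of the linked commit. `diagnostics`, `readResponse`, `fetchToken`, `readSwallowed`, and `readSurfaced` are hypothetical stand-ins, not the plugin framework's types.

```go
package main

import (
	"errors"
	"fmt"
)

// diagnostics and readResponse are hypothetical stand-ins for a diagnostics
// collection and a data source read response.
type diagnostics []string

func (d *diagnostics) AddError(summary, detail string) {
	*d = append(*d, summary+": "+detail)
}

type readResponse struct {
	Diagnostics diagnostics
	AccessToken string
}

// fetchToken always fails, standing in for a failing Token() call.
func fetchToken() (string, error) {
	return "", errors.New("context canceled")
}

// readSwallowed records the error on a local collection that the caller
// never sees, so the response carries no error and an empty token.
func readSwallowed(resp *readResponse) {
	var diags diagnostics
	tok, err := fetchToken()
	if err != nil {
		diags.AddError("Error setting access_token", err.Error())
		return
	}
	resp.AccessToken = tok
}

// readSurfaced records the error on the response itself.
func readSurfaced(resp *readResponse) {
	tok, err := fetchToken()
	if err != nil {
		resp.Diagnostics.AddError("Error setting access_token", err.Error())
		return
	}
	resp.AccessToken = tok
}

func main() {
	var a, b readResponse
	readSwallowed(&a)
	readSurfaced(&b)
	fmt.Printf("swallowed: %d diagnostics, token=%q\n", len(a.Diagnostics), a.AccessToken)
	fmt.Printf("surfaced:  %d diagnostics, token=%q\n", len(b.Diagnostics), b.AccessToken)
}
```

In the swallowed variant the caller sees a successful read with an empty token, which matches the symptom reported at the top of this issue: the read completes and the downstream `content` attribute is treated as unset.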

SarahFrench commented 2 weeks ago

After testing the provider with the fix in the PR above we found that the data source returned an error like:

╷
│ Error: Error setting access_token
│
│   with data.google_client_config.current,
│   on gke.tf line 33, in data "google_client_config" "current":
│   33: data "google_client_config" "current" {
│
│ oauth2/google: unable to generate access token: Post
│ "https://iamcredentials.googleapis.com/v1/projects/-/serviceAccounts/my-project.iam.gserviceaccount.com:generateAccessToken": oauth2/google: invalid
│ response from Secure Token Server: Post "https://sts.googleapis.com/v1/token": context canceled
╵

This is reminiscent of `data.google_dns_managed_zone always fails with "Post "https://oauth2.googleapis.com/token": context canceled"` (Issue #16832 · hashicorp/terraform-provider-google), which was solved by reverting the data source to the SDK again.

This issue appears to be another instance of the bad muxing affecting some methods of auth.

I'll create a meta-issue seeing as I'm working on this currently.