vectordotdev / vector

A high-performance observability data pipeline.
https://vector.dev
Mozilla Public License 2.0

Include support for GCP authentication via workload identity federation #16387

Open cahartma opened 1 year ago

cahartma commented 1 year ago


Use Cases

Traditionally, applications running outside Google Cloud use static service account keys to access Google Cloud resources. In Vector, most GCP-related sinks (for us, 'gcp_stackdriver_logs' in particular) take a 'credentials_path' option that points to the 'service_account' JSON file containing the static keys. However, service account keys are powerful credentials and can present a security risk if they are not managed correctly.

Workload identity lets you assign a distinct, fine-grained identity and authorization to each workload. It is the recommended way for applications to access AWS and Google Cloud services.

Customers want to be able to authenticate GCP sinks using service account impersonation (aka 'external_account' credentials) rather than long-lived credentials, in the same way that AWS sinks can authenticate using a 'role_arn' and 'web_identity_token'.

https://cloud.google.com/iam/docs/workload-identity-federation

Attempted Solutions

The current blocker is a generic error, "Invalid GCP Credentials", which Vector reports when the credentials file uses type: "external_account".

Proposal

Without Workload Identity, the type of the credentials file is service_account. These credentials include a private RSA key, in the private_key field, which is used to authenticate to GCP. This private key must be kept secure and, in practice, is often not rotated.

{
   "type": "service_account",
   "project_id": "test-project",
   "private_key_id": "123absc5678993dabd942adf0ff0812c789f",
   "private_key": "<private_key>",
   "client_email": "test-email@test-project.iam.gserviceaccount.com",
   "client_id": "1006844567890123456789",
   "auth_uri": "https://accounts.google.com/o/oauth2/auth",
   "token_uri": "https://oauth2.googleapis.com/token",
   "auth_provider_x509_cert_url": "https://www.googleapis.com/oauth2/v1/certs",
   "client_x509_cert_url": "https://www.googleapis.com/robot/v1/metadata/x509/test-email@test-project.iam.gserviceaccount.com"
}

With Workload Identity, the type of the credentials is external_account, and audience identifies the target audience, which is the workload identity provider. The service_account_impersonation_url key contains the resource URL of the service account that can be impersonated with these credentials. credential_source.file is the path to the OIDC token, which is exchanged for a Google access token. The OIDC token is rotated every hour, so the credentials are short-lived.

{
   "type": "external_account",
   "audience": "//iam.googleapis.com/projects/123456789/locations/global/workloadIdentityPools/test-pool/providers/test-provider",
   "subject_token_type": "urn:ietf:params:oauth:token-type:jwt",
   "token_url": "https://sts.googleapis.com/v1/token",
   "service_account_impersonation_url": "https://iamcredentials.googleapis.com/v1/projects/-/serviceAccounts/test-service-account@test-project.iam.gserviceaccount.com:generateAccessToken",
   "credential_source": {
      "file": "/path/to/oidc/token",
      "format": {
         "type": "text"
      }
   }
}
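
To make the exchange concrete, here is a rough Python sketch of the two requests a client would build from an external_account file like the one above: first the token exchange against token_url, then the impersonation call against service_account_impersonation_url. It only constructs the requests and sends nothing. The function names are made up for illustration, the cloud-platform scope is an assumed default, and this is not how Vector or any particular auth library implements it.

```python
from pathlib import Path

# Standard identifiers for an OAuth 2.0 token exchange (RFC 8693).
STS_GRANT_TYPE = "urn:ietf:params:oauth:grant-type:token-exchange"
ACCESS_TOKEN_TYPE = "urn:ietf:params:oauth:token-type:access_token"
# Assumption: a broad default scope; a real client would make this configurable.
CLOUD_PLATFORM_SCOPE = "https://www.googleapis.com/auth/cloud-platform"


def build_sts_request(config):
    """Return (url, form_fields) for exchanging the OIDC token for a federated token.

    Hypothetical helper: only handles credential_source.file with a text format.
    """
    if config.get("type") != "external_account":
        raise ValueError("expected credentials of type external_account")
    source = config["credential_source"]
    if source.get("format", {}).get("type", "text") != "text":
        raise ValueError("this sketch only handles text-format token files")
    # The token file is re-read on every call, so a rotated token is picked up.
    subject_token = Path(source["file"]).read_text().strip()
    fields = {
        "grant_type": STS_GRANT_TYPE,
        "audience": config["audience"],
        "scope": CLOUD_PLATFORM_SCOPE,
        "requested_token_type": ACCESS_TOKEN_TYPE,
        "subject_token": subject_token,
        "subject_token_type": config["subject_token_type"],
    }
    return config["token_url"], fields


def build_impersonation_request(config, federated_token):
    """Return (url, headers, json_body) for the generateAccessToken call.

    federated_token is the access_token field from the token exchange response.
    """
    headers = {
        "Authorization": f"Bearer {federated_token}",
        "Content-Type": "application/json",
    }
    body = {"scope": [CLOUD_PLATFORM_SCOPE]}
    return config["service_account_impersonation_url"], headers, body
```

The first request would be sent as a form-encoded POST and the second as a JSON POST. The access token returned by the second call is what a sink would then attach to its API requests, refreshing it before it expires.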

References

Recent GCP work for another related type of auth "GKE workload identity":

Version

0.26.0

just1900 commented 3 months ago

Any Updates?

dronenb commented 2 months ago

I added an issue to the upstream library that is being used. However, that particular library does not seem particularly well maintained. Perhaps it would be best to switch to a Google auth library that is more actively maintained, such as google-cloud-auth, which already has support for type: external_account (behind a feature gate, presently).

jszwedko commented 2 months ago

Thanks @dronenb. I do see that goauth seems to be less maintained these days. I wish GCP would just create one that they maintain 😅 I think we'd be open to swapping the library, but would want a somewhat more thorough proposal comparing them (and any other options). This could be in the form of a GitHub issue or an RFC.

jszwedko commented 2 months ago

https://github.com/mozilla-services/google-cloud-rust seems like it has some more organizational support behind it (Mozilla, Google Cloud, Ferrous Systems, and IGNW).