google-github-actions / get-gke-credentials

A GitHub Action that configures authentication to a GKE cluster.
https://cloud.google.com/gke
Apache License 2.0

Failed with: required "container.clusters.get" permission(s) #309

Closed thardy closed 1 month ago

thardy commented 1 month ago

TL;DR

The action is failing with the following error... "google-github-actions/get-gke-credentials failed with: required "container.clusters.get" permission(s)"

Expected behavior

I expected the action to successfully get credentials and then allow me to run a kubectl command against my gke cluster.

Observed behavior

I receive the error "google-github-actions/get-gke-credentials failed with: required "container.clusters.get" permission(s)"

Action YAML

name: deploy-k8s-manifests

on:
  push:
    branches:
      - dev
    paths:
      - 'k8s/**'

jobs:
  deploy:
    runs-on: ubuntu-latest

    # Add "id-token" with the intended permissions.
    permissions:
      contents: 'read'
      id-token: 'write'

    steps:
      - name: Get code
        uses: actions/checkout@v4

      - name: Authenticate with GCP
        id: 'auth'
        uses: google-github-actions/auth@v2
        with:
          project_id: 'my-project'
          workload_identity_provider: 'projects/297600345299/locations/global/workloadIdentityPools/github/providers/my-provider'

      - name: Get GKE credentials
        id: 'get-credentials'
        uses: google-github-actions/get-gke-credentials@v2
        with:
          cluster_name: 'preprod'
          location: 'us-central1'

      - name: Apply k8s manifests in GCP
        run: kubectl apply -f k8s

Log output

Authenticate with GCP
Run google-github-actions/auth@v2
Created credentials file at "/home/runner/work/my-project/my-project/gha-creds-c9c4d62169250d9a.json"

Get GKE credentials
Run google-github-actions/get-gke-credentials@v2
Error: google-github-actions/get-gke-credentials failed with: required "container.clusters.get" permission(s) for "projects/my-project/locations/us-central1/clusters/preprod".

Additional information

I'm using the “Direct Workload Identity Federation” option as described by the google-github-actions/auth action. I also created my Workload Identity Pool and Provider according to their instructions. All of the help I'm reading talks about service accounts, but the auth action is clear that the "Direct Workload Identity Federation" option does not require a service account.

Any help will be greatly appreciated.

sethvargo commented 1 month ago

You need to grant the WIF pool roles/container.clusterViewer:

gcloud projects add-iam-policy-binding my-project \
  --member "principalSet://..." \
  --role "roles/container.clusterViewer"

You can use conditions to limit access to a particular cluster.

thardy commented 1 month ago

Tried to run the following...

gcloud projects add-iam-policy-binding my-project --member "principalSet://..." --role "roles/container.clusterViewer"

and received the following error...

ERROR: Policy modification failed. For a binding with condition, run "gcloud alpha iam policies lint-condition" to identify issues in condition.
ERROR: (gcloud.projects.add-iam-policy-binding) INVALID_ARGUMENT: The member principalSet://... is of an unknown type. Please set a valid type prefix for the member.

sethvargo commented 1 month ago

You need to use the actual value, "..." was a placeholder for your WIF provider: https://cloud.google.com/iam/docs/workload-identity-federation

thardy commented 1 month ago

Ok, I finally got this to work. The following link proved to be the most helpful, filling in the gaps I was missing about assigning IAM roles to WIF stuff - https://cloud.google.com/iam/docs/workload-identity-federation-with-deployment-pipelines#github-actions

For future googlers... In order to even run "kubectl get pods" as shown in the readme, I needed to add both "roles/container.clusterViewer" and more... I added "roles/container.admin", (which was probably overkill) via the following two commands...

gcloud projects add-iam-policy-binding my-project \
  --role "roles/container.clusterViewer" \
  --member "principalSet://iam.googleapis.com/projects/PROJECT_NUMBER/locations/global/workloadIdentityPools/POOL_ID/attribute.repository/my-repo"

gcloud projects add-iam-policy-binding my-project \
  --role "roles/container.admin" \
  --member "principalSet://iam.googleapis.com/projects/PROJECT_NUMBER/locations/global/workloadIdentityPools/POOL_ID/attribute.repository/my-repo"

This is what I found the most confusing. To help future readers: member can be a user's email (e.g. --member "user:EMAIL"), and assigning roles to that is super easy to understand. In the specific case here, however, member is what I'd call a "WIF scenario". Here's how I think of the above gcloud commands, and not understanding this was my biggest mind-block - "When a call comes through via the specified Workload Identity Federation Pool, AND the token GitHub Actions sends along with it contains a "repository" property with a value of 'my-repo', give that call the following role." The fact that the repo name is associated with "attribute.repository" comes from the "Attribute Mapping" on the WIF Provider.
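A minimal sketch of how the member string relates to the workload_identity_provider value in the workflow above. The helper name wif_repo_member is hypothetical (it is not part of gcloud or of either action), and the sketch assumes the provider maps the token's repository claim to attribute.repository. It only prints the binding command for review; it does not run it.

```shell
#!/bin/sh
# Hypothetical helper: derive a repository-scoped principalSet member from
# the provider resource name passed to google-github-actions/auth.
# Usage: wif_repo_member PROVIDER_RESOURCE_NAME REPOSITORY
wif_repo_member() {
  # Drop the trailing "/providers/<id>" to get the pool's resource name.
  pool="${1%/providers/*}"
  printf 'principalSet://iam.googleapis.com/%s/attribute.repository/%s\n' "$pool" "$2"
}

# Values copied from the workflow and commands in this thread.
provider='projects/297600345299/locations/global/workloadIdentityPools/github/providers/my-provider'
member="$(wif_repo_member "$provider" 'my-repo')"

# Print the command instead of executing it, so it can be checked first.
printf 'gcloud projects add-iam-policy-binding my-project --role "roles/container.clusterViewer" --member "%s"\n' "$member"
```

Note that the member is built from the pool, not the provider: the provider only decides which token claims become attributes. GitHub's repository claim normally has the form OWNER/REPO, so the value after attribute.repository/ probably needs that full form in a real setup.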

The documentation is severely lacking. Here are some areas that need to be addressed...

Authorization

There are a few ways to authenticate this action. A service account will be needed with at least the following roles:

Kubernetes Engine Cluster Viewer (roles/container.clusterViewer)

NOPE. You definitely do NOT need a service account if you are using "Direct Workload Identity Federation". The way this project's docs (and the auth project's too) keep flipping between needing and not needing a service account was INCREDIBLY confusing. For someone who was brand new to all of this, with zero foundation, I was completely lost on whether I did or did not need a service account.

The next section...

Via google-github-actions/auth

Use google-github-actions/auth to authenticate the action. You can use Workload Identity Federation or traditional Service Account Key JSON authentication. by specifying the credentials input. This Action supports both the recommended Workload Identity Federation based authentication and the traditional Service Account Key JSON based auth.

See usage for more details.

At a minimum, I would have loved to see the minimum requirements to follow the example given at the top of this very readme - the minimum required to run "kubectl get pods". roles/container.clusterViewer is NOT enough. I also added roles/container.admin, which was probably way overkill, but it worked. I would love to know the minimum, or perhaps a best practice, or even a link to some guidance on the minimum roles needed to run certain kubectl commands.

Also, the docs over on google-github-actions/auth were not clear enough on how to actually add these IAM roles to a WIF pool. The entire concept is confusing because you're not actually adding IAM roles to the WIF Pool or even the WIF provider; you're adding IAM roles to (I don't know what to call this) a "WIF scenario?" that comes through the WIF pool. It's the whole SUBJECT, GROUP, or ATTRIBUTE thing <-- super confusing. I spent half of my time wondering what part of this was a service account.

Now I like to think of the IAM roles assigned to the members above as, "when the pool with the specified id is used, and the token from GitHub Actions has the following attribute name/value, respond with credentials containing the following role(s)." That concept was very difficult for me to grasp until I completely read through - https://cloud.google.com/iam/docs/workload-identity-federation-with-deployment-pipelines#github-actions. I didn't see that link referenced anywhere in either readme, and I now consider it a mandatory foundation for all of this stuff. The IAM role assignment is not explained clearly enough in the auth readme, and nowhere is there an end-to-end example of ALL of the steps you need to run even the simple "kubectl get pods".

I really appreciate the work you guys have done, but for a noob like myself, your docs need a lot of improvements.

thardy commented 1 month ago

I spent some time on an answer to my own stackoverflow question here - https://stackoverflow.com/questions/78670276/google-github-actions-get-gke-credentials-failed-with-required-container-clust/78674956#78674956

I'm sure someone can come up with a better way to describe it, but my biggest mind block was "what EXACTLY am I assigning a role to with this WIF stuff"? When someone suggests, "you need to grant the WIF pool...", know that I had no clue how to use the SUBJECT, GROUP, ATTRIBUTE stuff to declare a member. Also, there is no way to assign a role to a WIF pool or a provider, and until you get past that, you are completely stuck. You have to grant access to a "resource" "By subject", "By group", or "By attribute" as described here - https://cloud.google.com/iam/docs/workload-identity-federation-with-deployment-pipelines#authenticate
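To make the "By subject", "By group", and "By attribute" options concrete, here is a rough sketch of the three member formats as they appear in the IAM documentation. The function name wif_member is made up for illustration, and PROJECT_NUMBER, POOL_ID, SUBJECT, and GROUP are placeholders to fill in, not real values.

```shell
#!/bin/sh
# Hypothetical helper: build an IAM member string for a Workload Identity Pool.
# Usage: wif_member POOL_RESOURCE_NAME KIND VALUE
#   KIND is subject, group, or attribute.NAME (for example attribute.repository).
wif_member() {
  base="iam.googleapis.com/$1"
  case "$2" in
    # A single identity uses the principal:// prefix.
    subject)     printf 'principal://%s/subject/%s\n' "$base" "$3" ;;
    # Groups and attribute matches use the principalSet:// prefix.
    group)       printf 'principalSet://%s/group/%s\n' "$base" "$3" ;;
    attribute.*) printf 'principalSet://%s/%s/%s\n' "$base" "$2" "$3" ;;
    *)           echo "unknown kind: $2" >&2; return 1 ;;
  esac
}

# Placeholders, to be replaced with real values.
pool='projects/PROJECT_NUMBER/locations/global/workloadIdentityPools/POOL_ID'
wif_member "$pool" subject 'SUBJECT'
wif_member "$pool" group 'GROUP'
wif_member "$pool" attribute.repository 'my-repo'
```

Whichever line fits is the value that goes into --member in the add-iam-policy-binding commands. The pool and provider themselves never appear as a member; only the identities that come through the pool do.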
