weaveworks / launcher

Weave Cloud Launcher
Apache License 2.0

RBAC error: "weave-agent" is forbidden: attempt to grant extra privileges #117

Closed dlespiau closed 6 years ago

dlespiau commented 6 years ago

This error happens with GKE, when users try to run the command and, for some reason, the authenticated user doesn't have a cluster-admin role bound.

Users are supposed to call the command with --gke so that such a clusterrolebinding is created. If we are unable to create it, we error out. Current hypothesis: users either don't select the Kubernetes/GKE environment or don't copy the --gke option.

I could reproduce this on sock-shop, deleting my admin binding:

kubectl -n weave delete clusterrolebinding cluster-admin-damien
kubectl apply -f agent.yaml
Error from server (Forbidden): clusterroles.rbac.authorization.k8s.io "weave-agent" is forbidden: attempt to grant extra privileges: [PolicyRule{Resources:["*"], APIGroups:["*"], Verbs:["*"]} PolicyRule{NonResourceURLs:["*"], Verbs:["*"]}] user=&{damien@weave.works  [system:authenticated] map[]} ownerrules=[PolicyRule{Resources:["selfsubjectaccessreviews"], APIGroups:["authorization.k8s.io"], Verbs:["create"]} PolicyRule{Resources:["selfsubjectrulesreviews"], APIGroups:["authorization.k8s.io"], Verbs:["create"]} PolicyRule{NonResourceURLs:["/api" "/api/*" "/apis" "/apis/*" "/healthz" "/swaggerapi" "/swaggerapi/*" "/version"], Verbs:["get"]} PolicyRule{NonResourceURLs:["/swagger-2.0.0.pb-v1"], Verbs:["get"]} PolicyRule{NonResourceURLs:["/swagger.json"], Verbs:["get"]}] ruleResolutionErrors=[]

dlespiau commented 6 years ago

Ah! Aaron pointed out that we don't throw a hard error when we fail to create a cluster-admin binding:

    if opts.GKE {
        err := createGKEClusterRoleBinding(kubectlClient)
        if err != nil {
            fmt.Fprintln(os.Stderr, "WARNING: For GKE installations, a cluster-admin clusterrolebinding is required.")
            fmt.Fprintf(os.Stderr, "Could not create clusterrolebinding: %s", err)
        }
    }

So we then proceed with the bootstrapping, which errors out. The real reason may have been that we don't find gcloud, don't create the cluster-admin binding, and so aren't able to create the weave-agent cluster role.
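For illustration, here is a minimal sketch of what a hard error could look like in place of the warning. It is not the launcher's actual code: `options`, `bindingCreator`, `ensureGKEBinding` and `failingCreate` are hypothetical names, and the creator function stands in for `createGKEClusterRoleBinding(kubectlClient)` from the snippet above.

```go
package main

import (
	"errors"
	"fmt"
	"os"
)

// options is a hypothetical stand-in for the opts value in the snippet above;
// only the GKE flag is modeled.
type options struct {
	GKE bool
}

// bindingCreator is a hypothetical stand-in for
// createGKEClusterRoleBinding(kubectlClient).
type bindingCreator func() error

// failingCreate simulates the "gcloud not found" case seen in this issue.
func failingCreate() error {
	return errors.New("could not find gcloud in PATH")
}

// ensureGKEBinding returns an error instead of printing a warning, so the
// caller can stop before applying the agent manifest.
func ensureGKEBinding(opts options, create bindingCreator) error {
	if !opts.GKE {
		return nil // nothing to do outside GKE
	}
	if err := create(); err != nil {
		return fmt.Errorf("a cluster-admin clusterrolebinding is required on GKE, could not create it: %w", err)
	}
	return nil
}

func main() {
	if err := ensureGKEBinding(options{GKE: true}, failingCreate); err != nil {
		// A real installer would exit non-zero here rather than carry on.
		fmt.Fprintln(os.Stderr, "ERROR:", err)
		return
	}
	fmt.Println("binding in place, continuing with bootstrap")
}
```

Whether failing hard is the right call depends on how often the binding already exists when creation fails, which is the open question in this thread.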

dlespiau commented 6 years ago

This is what the full curl invocation looks like:

$ curl -Ls https://get.weave.works | sh -s -- --token=<redacted> --gke
Downloading the Weave Cloud installer...  
Checking kubectl & kubernetes versions
Installing Weave Cloud agents on gke_sock-shop-staging_europe-west2-b_sock-shop at <redacted>
WARNING: For GKE installations, a cluster-admin clusterrolebinding is required.
Could not create clusterrolebinding: Could not find gcloud in PATH, please install it: https://cloud.google.com/sdk/docs/There was an error applying the agent: namespace "weave" configured
serviceaccount "weave-agent" configured
clusterrolebinding "weave-agent" configured
deployment "weave-agent" created
Error from server (Forbidden): clusterroles.rbac.authorization.k8s.io "weave-agent" is forbidden: attempt to grant extra privileges: [PolicyRule{Resources:["*"], APIGroups:["*"], Verbs:["*"]} PolicyRule{NonResourceURLs:["*"], Verbs:["*"]}] user=&{damien@weave.works  [system:authenticated] map[]} ownerrules=[PolicyRule{Resources:["selfsubjectaccessreviews"], APIGroups:["authorization.k8s.io"], Verbs:["create"]} PolicyRule{Resources:["selfsubjectrulesreviews"], APIGroups:["authorization.k8s.io"], Verbs:["create"]} PolicyRule{NonResourceURLs:["/api" "/api/*" "/apis" "/apis/*" "/healthz" "/swaggerapi" "/swaggerapi/*" "/version"], Verbs:["get"]} PolicyRule{NonResourceURLs:["/swagger-2.0.0.pb-v1"], Verbs:["get"]} PolicyRule{NonResourceURLs:["/swagger.json"], Verbs:["get"]}] ruleResolutionErrors=[]

rade commented 6 years ago

We don't know whether the lack of gcloud is the issue. Can we improve our error reporting to include this warning?

dlespiau commented 6 years ago

That's an acceptable step!
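A rough sketch of how the earlier warning could be carried into the reported error, assuming a small helper that collects warnings during bootstrap. The `bootstrapReport` type, its methods, and `errApply` are hypothetical and only illustrate the idea; they are not taken from the launcher code.

```go
package main

import (
	"errors"
	"fmt"
	"strings"
)

// errApply is a hypothetical placeholder for the later "apply" failure.
var errApply = errors.New("there was an error applying the agent")

// bootstrapReport is a hypothetical collector for non-fatal warnings.
type bootstrapReport struct {
	warnings []string
}

// warn records a warning instead of only printing it.
func (r *bootstrapReport) warn(format string, args ...interface{}) {
	r.warnings = append(r.warnings, fmt.Sprintf(format, args...))
}

// wrap attaches any recorded warnings to a later error, so the report shows
// both the failure and what went wrong before it.
func (r *bootstrapReport) wrap(err error) error {
	if err == nil || len(r.warnings) == 0 {
		return err
	}
	return fmt.Errorf("%w (earlier warnings: %s)", err, strings.Join(r.warnings, "; "))
}

func main() {
	report := &bootstrapReport{}
	report.warn("could not create clusterrolebinding: %s", "could not find gcloud in PATH")
	fmt.Println(report.wrap(errApply))
}
```

With something like this, a Forbidden error from the apply step would arrive together with the gcloud warning, which is what we need to confirm or rule out the hypothesis.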

rade commented 6 years ago

Has the improved error reporting thrown up any clues yet?

dlespiau commented 6 years ago

Actually, only one bootstrap error with the added gke error in the last 5 days (2 days ago):

rade commented 6 years ago

I thought I fixed that in #140. Why are we still seeing this?

lilic commented 6 years ago

As agreed offline, we should check whether the user has enough permissions to be granted the cluster-wide admin role on GKE; if not, we should error out: https://github.com/weaveworks/launcher/blob/master/bootstrap/main.go#L109

Possibly check by running something like this:

gcloud container clusters get-credentials

If the problem is anything other than this, we should still continue...
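One possible shape for such a check, as a sketch only. It swaps in `kubectl auth can-i create clusterrolebindings` rather than the gcloud command above, which is an assumption on my part and not something agreed in this thread. The `runner`, `execRunner`, `checkCanCreateBinding` and `errInsufficientPermissions` names are hypothetical.

```go
package main

import (
	"errors"
	"fmt"
	"os/exec"
	"strings"
)

// runner abstracts command execution so the check can be exercised without a cluster.
type runner func(name string, args ...string) (string, error)

// execRunner runs the real command and returns stdout only, so that warnings
// printed on stderr do not confuse the parsing below.
func execRunner(name string, args ...string) (string, error) {
	out, err := exec.Command(name, args...).Output()
	return string(out), err
}

var errInsufficientPermissions = errors.New("user cannot create clusterrolebindings")

// checkCanCreateBinding fails only on an explicit "no" answer. Any other
// outcome (kubectl missing, API unreachable, unexpected output) is treated as
// "unknown" and the caller continues, as suggested above.
func checkCanCreateBinding(run runner) error {
	// The command error is ignored on purpose: a denial may also be signalled
	// through a non-zero exit code, so the printed answer is what we inspect.
	out, _ := run("kubectl", "auth", "can-i", "create", "clusterrolebindings")
	if strings.HasPrefix(strings.TrimSpace(out), "no") {
		return errInsufficientPermissions
	}
	return nil
}

func main() {
	// Simulated denial; pass execRunner instead to query a real cluster.
	deny := func(name string, args ...string) (string, error) { return "no\n", nil }
	if err := checkCanCreateBinding(deny); err != nil {
		fmt.Println("would abort:", err)
	}
}
```

Injecting the runner keeps the decision logic testable without a cluster and makes the "continue on anything else" behaviour explicit.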

lilic commented 6 years ago

The PR that tells the user something went wrong was merged: https://github.com/weaveworks/launcher/pull/186. Not sure whether we can close this now, or whether there is anything else we can do here. Maybe give a better suggestion for how the user can fix it?