kubernetes / kubectl

Issue tracker and mirror of kubectl code
Apache License 2.0
2.89k stars 924 forks source link

`kubectl scale` fails with cryptic error if it can't fetch parent resource #1666

Closed ahmetb closed 1 month ago

ahmetb commented 1 month ago

What happened:

I ran command:

$ kubectl scale nodepools my-pool --replicas=3
error: no objects passed to scale

But my user actually didn't have get nodepools permission.

This error is very cryptic and prompted me to try things like specifying resource as RESOURCE/NAME, RESOURCE NAME and move --replicas before positional args --but that wasn't the issue.

Only when I added -v=7, I realized I actually have a permission issue:

Response Body: {"kind":"Status","apiVersion":"v1","metadata":
{},"status":"Failure","message":"nodepools.example.com \"my-pool\" 
is forbidden: User \"new-user\" cannot get resource \"nodepools\" 
in API group \"example.com\" at the cluster scope","reason":"Forbidden",
"details":{"name":"my-pool","group":"example.com","kind":"nodepools"},"code":403}

error: no objects passed to scale

What you expected to happen:

Tell me that I have the permission issue, don't mislead me that I didn't specify a resource.

How to reproduce it (as minimally and precisely as possible):

Just try kubectl scale on a resource that you don't have permission to get.

Anything else we need to know?:

Environment:

googs1025 commented 1 month ago

It seems that you want to use custom CRD resources to implement the scale. Does your CRD implement subresources similar to ? //+kubebuilder:subresource:scale:specpath=.spec.replicas,statuspath=.status.replicas,selectorpath=.status.selector FYI: https://github.com/kubernetes-sigs/kubebuilder/discussions/3486 https://book.kubebuilder.io/reference/generating-crd.html?highlight=scale#subresources

ahmetb commented 1 month ago

Yes I already have a scale subresource. kubectl still makes a GET query to the resource first. I will send a full request log.

googs1025 commented 1 month ago

Check the error you returned, it seems to be printed here in the source code. It seems that the resource was not obtained?

https://github.com/kubernetes/kubernetes/blob/7b28a115ba04651bc31aa1d7089abbd67ec5c067/staging/src/k8s.io/kubectl/pkg/cmd/scale/scale.go#L231-L233

ahmetb commented 1 month ago

I think the ContinueOnError() might be the problem here.

googs1025 commented 1 month ago

Why ContinueOnError is the problem? I am a little confused

ahmetb commented 1 month ago
  1. The expected behavior is that the kubectl makes a POST /scale call, right? It shouldn't need a GET call on the parent resource I think?
kubectl scale kpool ei4-k8s-intdemo-system --replicas=0 -v 7
...
GET https://127.0.0.1:63505/apis/compute.linkedin.com/v1alpha1/kubernetespools/ei4-k8s-intdemo-system
Response Body: {"kind":"Status","apiVersion":"v1","metadata":{},"status":"Failure","message":"kubernetespools.compute.linkedin.com 
\"ei4-k8s-intdemo-system\" is forbidden: User \"new-user\" cannot get resource \"kubernetespools\" in API group \"compute.linkedin.com\" at the cluster 
scope","reason":"Forbidden","details":{"name":"ei4-k8s-intdemo-system","group":"compute.linkedin.com","kind":"kubernetespools"},"code":403}

error: no objects passed to scale
  1. Even if scale command needs to GET the resource, and it can't due to a permission issue, the error it should be printing is "why it can't GET the resource". So the bug is valid.

The scale.go clearly ignores the errors returned from resourcebuilder. I was wrong in saying ContinueOnError() is the problem, but this snippet seems to be the issue:

https://github.com/kubernetes/kubectl/blob/v0.31.1/pkg/cmd/scale/scale.go#L208-L212

infoErr is not checked until the very end --yet the infos variable is used despite being wrong/incomplete in case of error.

If I locally fix the code to check if infoErr != nil { return infoErr } right after this snippet, kubectl now works more like what I expected and prints the permission error:

go run ./cmd/kubectl scale kubernetespool.compute.linkedin.com/ei4.k8s-0.e2e-testing --replicas=3 -v 7
...
Error from server (Forbidden): kubernetespools.compute.linkedin.com "ei4.k8s-0.e2e-testing" is forbidden: User "new-user" cannot get resource "kubernetespools" in API group "compute.linkedin.com" at the cluster scope

Even despite this, I don't think the behavior is fully correct. I don't think kubectl scale should need access on the parent resource, as the /scale subresource has its own GET/POST/PATCH APIs that can work standalone, and has its own RBACs.

ardaguclu commented 1 month ago

In my opinion it is better to wrap this error https://github.com/kubernetes/kubectl/blob/0315be426ca25e6554f1c9089534b62ce12254d4/pkg/cmd/scale/scale.go#L232 with the error gotten from resourcebuilder.

/triage accepted /priority backlog

googs1025 commented 1 month ago

/assign ok, I will handle this