kubernetes / client-go

Go client for Kubernetes.

"couldn't get resource list" error #1223

Closed. wyardley closed this issue 1 year ago.

wyardley commented 1 year ago

Since updating client-go from v0.25.2 to v0.26.0 (https://github.com/helm/helm/pull/11622), the new version of Helm throws the following errors, apparently from client-go, at least when operating against Kubernetes 1.25:

E0126 14:24:31.061339    6338 memcache.go:255] couldn't get resource list for external.metrics.k8s.io/v1beta1: Got empty response for: external.metrics.k8s.io/v1beta1
E0126 14:24:31.366546    6338 memcache.go:255] couldn't get resource list for external.metrics.k8s.io/v1beta1: Got empty response for: external.metrics.k8s.io/v1beta1
E0126 14:24:31.493404    6338 memcache.go:255] couldn't get resource list for external.metrics.k8s.io/v1beta1: Got empty response for: external.metrics.k8s.io/v1beta1
E0126 14:24:31.698458    6338 memcache.go:255] couldn't get resource list for external.metrics.k8s.io/v1beta1: Got empty response for: external.metrics.k8s.io/v1beta1
E0126 14:24:31.980491    6338 memcache.go:255] couldn't get resource list for external.metrics.k8s.io/v1beta1: Got empty response for: external.metrics.k8s.io/v1beta1

Is that coming from this library, and if so, does it indicate a problem (and / or is it possible to suppress this error)? Maybe from an updated or removed / deprecated API version?

Phani2811 commented 1 year ago

I'm also facing the same issue when I run kubectl commands from Azure DevOps pipelines. Is there any workaround or resolution for this issue?

The kubectl version it uses is 1.26.1.

JorTurFer commented 1 year ago

Hi @wyardley, I guess the issues in Helm come from KEDA (or at least they could), as we use external.metrics.k8s.io/v1beta1. This is an issue we discovered some weeks ago, related to this change in client-go.

Just to share the info we have: this doesn't affect only KEDA users; any implementation of a custom metrics API server is affected (KEDA, prometheus-adapter, rabbit-adapter, etc.) if it doesn't expose any metrics. The change I linked treats 0 returned metrics as an error, which is why you see the message. We have created an issue upstream asking about this, and they have opened another issue in the kubernetes repo asking about the best way to handle it.
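
To confirm this is what is happening, one option (assuming you have kubectl access to the affected cluster) is to query the discovery endpoint of the failing group directly. When the adapter exposes no metrics, the returned APIResourceList has an empty resources array, which is what newer client-go versions report as the "Got empty response for" error above. The output should look roughly like this when no metrics are registered:

$ kubectl get --raw /apis/external.metrics.k8s.io/v1beta1
{"kind":"APIResourceList","apiVersion":"v1","groupVersion":"external.metrics.k8s.io/v1beta1","resources":[]}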

If your users are facing this warning due to KEDA, there is a workaround they can use: register a dummy ScaledObject so that the KEDA metrics server exposes at least 1 metric. Once at least 1 metric is returned, the error disappears. I don't know how to apply this workaround to other projects like prometheus-adapter or other adapters, but the main idea is the same: once they expose at least 1 metric, the error disappears. A sketch of such a dummy ScaledObject is shown below.
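
For anyone who wants to try that workaround, a placeholder ScaledObject based on KEDA's cron scaler could look roughly like the sketch below. The object name, namespace, target Deployment, and the cron trigger values are all illustrative; check the KEDA documentation for the exact schema of the version you run.

kubectl apply -f - <<'EOF'
apiVersion: keda.sh/v1alpha1
kind: ScaledObject
metadata:
  name: dummy-scaledobject          # illustrative name
  namespace: default
spec:
  scaleTargetRef:
    name: some-existing-deployment  # must point at a Deployment that actually exists
  triggers:
    - type: cron
      metadata:
        timezone: "Etc/UTC"
        start: "0 0 1 1 *"
        end: "1 0 1 1 *"
        desiredReplicas: "1"
EOF

The specific trigger shouldn't matter; the point is only that the KEDA metrics server then lists at least one external metric.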

fulviodenza commented 1 year ago

Hi, I'm facing this issue while implementing a custom API server starting from the apimachinery library and kubernetes/sample-apiserver. As you can see in my fork https://github.com/fulviodenza/sample-apiserver, I implemented a new version v2alpha1 which is pretty similar to the others. I generated the autogenerated files (not without problems) and then deployed it all on a cluster, in the wardle namespace. When I create the new Flunder resource using apiVersion: wardle.example.com/v2alpha1, it returns the following error:

$ kubectl create -f artifacts/flunders/01-flunder.yaml
E0304 17:51:57.765739  204801 memcache.go:255] couldn't get resource list for wardle.example.com/v2alpha1: the server is currently unable to handle the request
E0304 17:51:57.773603  204801 memcache.go:106] couldn't get resource list for wardle.example.com/v2alpha1: the server is currently unable to handle the request
E0304 17:51:57.782027  204801 memcache.go:255] couldn't get resource list for wardle.example.com/v2alpha1: the server is currently unable to handle the request
E0304 17:51:57.784516  204801 memcache.go:106] couldn't get resource list for wardle.example.com/v2alpha1: the server is currently unable to handle the request
error: resource mapping not found for name: "my-first-flunder-v2alpha" namespace: "" from "artifacts/flunders/01-flunder.yaml": no matches for kind "Flunder" in version "wardle.example.com/v2alpha1"
ensure CRDs are installed first

Every command run against the minikube cluster appends multiple lines like the following to stderr:

E0304 17:28:45.529114 201343 memcache.go:255] couldn't get resource list for wardle.example.com/v2alpha1: the server is currently unable to handle the request

The error is logged at lines 255 and 106 of memcache.go.

$ kubectl get all -n wardle
E0304 17:28:45.547138  201343 memcache.go:106] couldn't get resource list for wardle.example.com/v2alpha1: the server is currently unable to handle the request
NAME                                READY   STATUS             RESTARTS       AGE
pod/wardle-server-5b65cdb5d-bcdfz   0/2     CrashLoopBackOff   14 (32s ago)   89m

NAME          TYPE        CLUSTER-IP      EXTERNAL-IP   PORT(S)   AGE
service/api   ClusterIP   10.98.135.217   <none>        443/TCP   111m

NAME                            READY   UP-TO-DATE   AVAILABLE   AGE
deployment.apps/wardle-server   0/1     1            0           89m

NAME                                      DESIRED   CURRENT   READY   AGE
replicaset.apps/wardle-server-5b65cdb5d   1         1         0       89m

Looking at the logs of the pod it pops out this stacktrace:

k8s.io/sample-apiserver/pkg/registry.RESTInPeace(...)
        /home/fdenza/workspace/sample-apiserver/pkg/registry/registry.go:36
k8s.io/sample-apiserver/pkg/apiserver.completedConfig.New({{0xc000289890?}, 0xc00011f148?})
        /home/fdenza/workspace/sample-apiserver/pkg/apiserver/apiserver.go:116 +0x59b
k8s.io/sample-apiserver/pkg/cmd/server.WardleServerOptions.RunWardleServer({0xc0003fe7e0, {0x0, 0x0}, {0x262cae0, 0xc000014018}, {0x262cae0, 0xc000014020}, {0x0, 0x0, 0x0}}, ...)
        /home/fdenza/workspace/sample-apiserver/pkg/cmd/server/start.go:168 +0x95
k8s.io/sample-apiserver/pkg/cmd/server.NewCommandStartWardleServer.func1(0xc0000f2300?, {0xc0005ccbd0, 0x0, 0x1})
        /home/fdenza/workspace/sample-apiserver/pkg/cmd/server/start.go:87 +0x165
github.com/spf13/cobra.(*Command).execute(0xc0000f2300, {0xc000052050, 0x1, 0x1})
        /home/fdenza/go/pkg/mod/github.com/spf13/cobra@v1.6.0/command.go:916 +0x862
github.com/spf13/cobra.(*Command).ExecuteC(0xc0000f2300)
        /home/fdenza/go/pkg/mod/github.com/spf13/cobra@v1.6.0/command.go:1040 +0x3bd
github.com/spf13/cobra.(*Command).Execute(...)
        /home/fdenza/go/pkg/mod/github.com/spf13/cobra@v1.6.0/command.go:968
k8s.io/component-base/cli.run(0xc0000f2300)
        /home/fdenza/go/pkg/mod/k8s.io/component-base@v0.0.0-20230301013520-2acccc807c76/cli/run.go:146 +0x317
k8s.io/component-base/cli.Run(0xc0000c6280?)
        /home/fdenza/go/pkg/mod/k8s.io/component-base@v0.0.0-20230301013520-2acccc807c76/cli/run.go:46 +0x1d
main.main()
        /home/fdenza/workspace/sample-apiserver/main.go:31 +0x4a

The error seems to originate at https://github.com/fulviodenza/sample-apiserver/blob/0b7b20d2f7bc7f276d3a7f12b2fb9b5a478df3cc/pkg/apiserver/apiserver.go#L116, where NewREST returns an error; RESTInPeace then panics on that error, which would explain the CrashLoopBackOff above. While the pod keeps crashing, the aggregated APIService for wardle.example.com/v2alpha1 stays unavailable, and that is exactly what discovery reports as "the server is currently unable to handle the request".
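
One way to confirm that the discovery error is just a symptom of the crashing backend is to check the status of the aggregated APIService (the object name below is assumed from the group/version in the error message):

$ kubectl get apiservice v2alpha1.wardle.example.com

While the pod is down, the AVAILABLE column should show False with a reason such as FailedDiscoveryCheck or MissingEndpoints.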

Info about the system:

I'm running Fedora 37 with:

$ minikube version
minikube version: v1.29.0
commit: ddac20b4b34a9c8c857fc602203b6ba2679794d3

and starting the minikube cluster in the following way: minikube start --driver=podman --kubernetes-version=v1.24.9 --container-runtime=containerd

Looking around the web, I've found that this issue has also been reported against some metrics components. I think this could be relevant, as it involves a lot of external tools.

cbugneac-nex commented 1 year ago

I'm also getting similar error messages after installing Calico:

E0327 17:43:47.602454    1815 memcache.go:255] couldn't get resource list for projectcalico.org/v3: the server is currently unable to handle the request
E0327 17:43:47.702016    1815 memcache.go:106] couldn't get resource list for projectcalico.org/v3: the server is currently unable to handle the request
E0327 17:43:48.511114    1815 memcache.go:255] couldn't get resource list for projectcalico.org/v3: the server is currently unable to handle the request

$ kubectl version
Client Version: version.Info{Major:"1", Minor:"24", GitVersion:"v1.24.12", GitCommit:"ef70d260f3d036fc22b30538576bbf6b36329995", GitTreeState:"clean", BuildDate:"2023-03-15T13:37:18Z", GoVersion:"go1.19.7", Compiler:"gc", Platform:"darwin/amd64"}
Kustomize Version: v4.5.4
Server Version: version.Info{Major:"1", Minor:"24+", GitVersion:"v1.24.10-eks-48e63af", GitCommit:"9176fb99b52f8d5ff73d67fea27f3a638f679f8a", GitTreeState:"clean", BuildDate:"2023-01-24T19:17:48Z", GoVersion:"go1.19.5", Compiler:"gc", Platform:"linux/amd64"}

ndacic commented 1 year ago

@cbugneac-nex facing the same issue. Did you manage to resolve it somehow?

cbugneac-nex commented 1 year ago

Nope @ndacic

bergkvist commented 1 year ago

Has this been fixed? Or why is it closed?

mjf commented 1 year ago

kubectl get pods
E0526 07:44:05.737335 4086336 memcache.go:255] couldn't get resource list for packages.operators.coreos.com/v1: the server is currently unable to handle the request
E0526 07:44:05.738529 4086336 memcache.go:106] couldn't get resource list for packages.operators.coreos.com/v1: the server is currently unable to handle the request
E0526 07:44:05.742066 4086336 memcache.go:106] couldn't get resource list for packages.operators.coreos.com/v1: the server is currently unable to handle the request
E0526 07:44:05.747980 4086336 memcache.go:106] couldn't get resource list for packages.operators.coreos.com/v1: the server is currently unable to handle the request
No resources found in default namespace.
kubectl --output=yaml version
clientVersion:
  buildDate: "2022-12-08T19:58:30Z"
  compiler: gc
  gitCommit: b46a3f887ca979b1a5d14fd39cb1af43e7e5d12d
  gitTreeState: clean
  gitVersion: v1.26.0
  goVersion: go1.19.4
  major: "1"
  minor: "26"
  platform: linux/amd64
kustomizeVersion: v4.5.7
serverVersion:
  buildDate: "2022-12-08T10:08:09Z"
  compiler: gc
  gitCommit: 804d6167111f6858541cef440ccc53887fbbc96a
  gitTreeState: clean
  gitVersion: v1.25.5
  goVersion: go1.19.4
  major: "1"
  minor: "25"
  platform: linux/amd64

Should I downgrade the kubectl client?

JeTondsLeGazon commented 1 year ago

I have the same problem when running kubectl get pods ... in a job. The error message is slightly different though, but still from memcache.go:287:

E0615 08:01:16.113940       7 memcache.go:287] couldn't get resource list for snapshot.storage.k8s.io/v1beta1: Get "https://10.0.0.1:443/apis/snapshot.storage.k8s.io/v1beta1?timeout=32s": context deadline exceeded (Client.Timeout exceeded while awaiting headers)
E0615 08:01:16.014523       7 memcache.go:287] couldn't get resource list for security.istio.io/v1beta1: Get "https://10.0.0.1:443/apis/security.istio.io/v1beta1?timeout=32s": net/http: request canceled (Client.Timeout exceeded while awaiting headers)
E0615 08:01:16.613929       7 memcache.go:287] couldn't get resource list for extensions.istio.io/v1alpha1: Get "https://10.0.0.1:443/apis/extensions.istio.io/v1alpha1?timeout=32s": net/http: request canceled (Client.Timeout exceeded while awaiting headers)
E0615 08:01:16.614055       7 memcache.go:287] couldn't get resource list for templates.gatekeeper.sh/v1: Get "https://10.0.0.1:443/apis/templates.gatekeeper.sh/v1?timeout=32s": net/http: request canceled (Client.Timeout exceeded while awaiting headers)
...

MMquant commented 1 year ago

Same problem here

$ k version -o yaml
clientVersion:
  buildDate: "2023-04-11T17:10:18Z"
  compiler: gc
  gitCommit: 1b4df30b3cdfeaba6024e81e559a6cd09a089d65
  gitTreeState: clean
  gitVersion: v1.27.0
  goVersion: go1.20.3
  major: "1"
  minor: "27"
  platform: linux/amd64
kustomizeVersion: v5.0.1
serverVersion:
  buildDate: "2023-06-14T09:49:08Z"
  compiler: gc
  gitCommit: 11902a838028edef305dfe2f96be929bc4d114d8
  gitTreeState: clean
  gitVersion: v1.26.6
  goVersion: go1.19.10
  major: "1"
  minor: "26"
  platform: linux/amd64

$ k get apiservices.apiregistration.k8s.io 
E0704 09:06:28.306502 1266785 memcache.go:287] couldn't get resource list for flowcontrol.apiserver.k8s.io/v1beta1: the server could not find the requested resource
E0704 09:06:28.314128 1266785 memcache.go:287] couldn't get resource list for autoscaling/v2beta2: the server could not find the requested resource
E0704 09:06:28.477705 1266785 memcache.go:287] couldn't get resource list for flowcontrol.apiserver.k8s.io/v1beta1: the server could not find the requested resource
E0704 09:06:28.484418 1266785 memcache.go:287] couldn't get resource list for autoscaling/v2beta2: the server could not find the requested resource

Downgrading kubectl to 1.26.6 didn't solve the problem.

MMquant commented 1 year ago

We found a solution. After upgrading the cluster, some of the kube-system pod images had not been upgraded. Check whether your image tags are consistent in the kube-system namespace:

kubectl get pods -n kube-system --output=custom-columns="NAME:.metadata.name,IMAGE:.spec.containers[*].image"

nocodebackend commented 1 year ago

I found this thread and it helped me fix the problem. https://github.com/helm/helm/issues/11772

From what I gathered, the errors may have always been there but were just silenced before. I ran into this issue myself after upgrading client-go and removing some old custom metrics resources. The problem was that the APIService entries those custom metrics components used had not been cleaned up.

Run kubectl get apiservices and you will see what is causing the problem in the AVAILABLE column. You can then delete those stale APIService resources, as in the example below.
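
For example (the APIService name below is hypothetical; use whatever shows up as unavailable in your own cluster, and delete it only if the component that backed it is really gone):

$ kubectl get apiservices | grep -v True                        # entries whose AVAILABLE column is not True
$ kubectl describe apiservice v1beta1.external.metrics.k8s.io   # see why it is unavailable
$ kubectl delete apiservice v1beta1.external.metrics.k8s.io     # remove the stale entry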