awslabs / eks-node-viewer

EKS Node Viewer

Usage details are not being shown in case of hundreds of nodes #298

Open aaj-synth opened 4 months ago

aaj-synth commented 4 months ago

We currently have hundreds of nodes running in the cluster, but the usage of these nodes is not showing up at the moment; only the node count and hourly charge are displayed.

aaj-synth commented 4 months ago

@tzneal @bwagner5 Would love to provide more context to help fix this issue :-)

tzneal commented 4 months ago

Can you provide a screenshot of what the display looks like? Are there any error messages?

aaj-synth commented 4 months ago

Certainly!

We have a dev cluster with a dozen or so nodes, and nodeviewer works perfectly there. On the prod cluster, where we are running a massive fleet, nodeviewer is unable to show anything:

[Screenshot 2024-07-15 at 09 56 08]

I just noticed that if you leave nodeviewer open for a few moments, the following error shows up:

W0715 10:22:51.560482   83198 reflector.go:535] pkg/mod/k8s.io/client-go@v0.28.4/tools/cache/reflector.go:229: failed to list *v1.Pod: stream error when reading response body, may be caused by closed connection. Please retry. Original error: stream error: stream ID 5; INTERNAL_ERROR; received from peer
I0715 10:22:51.562153   83198 trace.go:236] Trace[1210079846]: "Reflector ListAndWatch" name:pkg/mod/k8s.io/client-go@v0.28.4/tools/cache/reflector.go:229 (15-Jul-2024 10:21:50.428) (total time: 61132ms):
Trace[1210079846]: ---"Objects listed" error:stream error when reading response body, may be caused by closed connection. Please retry. Original error: stream error: stream ID 5; INTERNAL_ERROR; received from peer 61131ms (10:22:51.559)
Trace[1210079846]: [1m1.132680417s] [1m1.132680417s] END
E0715 10:22:51.562358   83198 reflector.go:147] pkg/mod/k8s.io/client-go@v0.28.4/tools/cache/reflector.go:229: Failed to watch *v1.Pod: failed to list *v1.Pod: stream error when reading response body, may be caused by closed connection. Please retry. Original error: stream error: stream ID 5; INTERNAL_ERROR; received from peer