derailed / k9s

🐢 Kubernetes CLI To Manage Your Clusters In Style!
https://k9scli.io
Apache License 2.0
25.65k stars 1.61k forks

cpu limit display is wrong if multiple container run in a pod and only a subset have limits #2196

Open runningman84 opened 10 months ago

runningman84 commented 10 months ago

Describe the bug k9s does not calculate the CPU limit usage (%CPU/L) correctly if a pod contains a container without a CPU limit alongside one or more containers with a CPU limit.

To Reproduce Steps to reproduce the behavior:

Expected behavior The CPU limit in % should be 10%, but k9s shows an 800% CPU limit.

Screenshots na

Versions (please complete the following information):


derailed commented 7 months ago

@runningman84 Can't seem to repro ;( Could you update to the latest k9s rev and send your repro scenario? Thank you!!

runningman84 commented 7 months ago

The issue is still the same in v0.28.2:


 Context: dev-mgmt                                                    <0> all          <a>      Attach     <l>       Logs            <f> Show PortForward                               ____  __.________        
 Cluster: arn:aws:eks:eu-central-1:xxx:cluster/dev-mgmt      <1> monitoring   <ctrl-d> Delete     <p>       Logs Previous   <t> Transfer                                      |    |/ _/   __   \______ 
 User:    arn:aws:eks:eu-central-1:xxx:cluster/dev-mgmt      <2> karpenter    <d>      Describe   <shift-f> Port-Forward    <y> YAML                                          |      < \____    /  ___/ 
 K9s Rev: v0.28.2                                                     <3> default      <e>      Edit       <z>       Sanitize                                                          |    |  \   /    /\___ \  
 K8s Rev: v1.28.3-eks-4f4795d                                                          <?>      Help       <s>       Shell                                                             |____|__ \ /____//____  > 
 CPU:     4%↓                                                                          <ctrl-k> Kill       <n>       Show Node                                                                 \/            \/  
 MEM:     44%                                                                                                                                                                                                    
β”Œβ”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€ Pods(monitoring)[12] ─────────────────────────────────────────────────────────────────────────────────────────────┐
β”‚ NAME↑                                                    PF READY RESTARTS STATUS     CPU   MEM  %CPU/R  %CPU/L  %MEM/R  %MEM/L IP                NODE                                               AGE      β”‚
β”‚ central-alertmanager-endpoint-proxy-0-b9b565d66-7ptbm    ●  1/1          0 Running      1     3      20     n/a      22       5 192.168.24.238    ip-192-168-4-215.eu-central-1.compute.internal     3h22m    β”‚
β”‚ central-alertmanager-endpoint-proxy-1-55cb544f56-wdnzh   ●  1/1          0 Running      1     3      12     n/a      22       5 192.168.141.198   ip-192-168-173-235.eu-central-1.compute.internal   3h11m    β”‚
β”‚ central-alertmanager-endpoint-proxy-2-658c5ff494-44f4c   ●  1/1          0 Running      1     4      20     n/a      29       7 192.168.98.34     ip-192-168-71-124.eu-central-1.compute.internal    3h22m    β”‚
β”‚ prometheus-operator-grafana-7949f89b49-jv7s9             ●  3/3          0 Running      5   269      14     n/a      99      48 192.168.142.129   ip-192-168-173-235.eu-central-1.compute.internal   3h21m    β”‚
β”‚ prometheus-operator-kube-p-operator-59d98b75f6-j8bbw     ●  1/1          0 Running      1    42       2     n/a      30      17 192.168.167.40    ip-192-168-173-235.eu-central-1.compute.internal   2m35s    β”‚
β”‚ prometheus-operator-kube-state-metrics-6cd7b7fc56-45qbj  ●  1/1          0 Running      2    54      40     n/a      42      21 192.168.133.64    ip-192-168-179-203.eu-central-1.compute.internal   2m14s    β”‚
β”‚ prometheus-operator-prometheus-node-exporter-6sndr       ●  1/1          0 Running      3    10      60     n/a     103      20 192.168.4.215     ip-192-168-4-215.eu-central-1.compute.internal     3h21m    β”‚
β”‚ prometheus-operator-prometheus-node-exporter-9lg97       ●  1/1          0 Running      2    11      40     n/a     113      22 192.168.71.124    ip-192-168-71-124.eu-central-1.compute.internal    3h22m    β”‚
β”‚ prometheus-operator-prometheus-node-exporter-r8gxd       ●  1/1          0 Running      3    13      60     n/a     135      27 192.168.179.203   ip-192-168-179-203.eu-central-1.compute.internal   3h21m    β”‚
β”‚ prometheus-operator-prometheus-node-exporter-zg79r       ●  1/1          0 Running      3    10      60     n/a     105      21 192.168.173.235   ip-192-168-173-235.eu-central-1.compute.internal   3h21m    β”‚
β”‚ prometheus-prometheus-operator-kube-p-prometheus-0       ●  3/3          0 Running     47  1323      40     427      90      45 192.168.81.216    ip-192-168-71-124.eu-central-1.compute.internal    3h18m    β”‚
β”‚ prometheus-prometheus-operator-kube-p-prometheus-1       ●  3/3          0 Running     73  1320      63     663      95      47 192.168.44.90     ip-192-168-4-215.eu-central-1.compute.internal     3h19m    β”‚
β”‚                                                                                                                                                                                                               β”‚
Context: dev-mgmt                                                    <a>       Attach          <f> Show PortForward                                                                    ____  __.________        
 Cluster: arn:aws:eks:eu-central-1:xxx:cluster/dev-mgmt      <?>       Help                                                                                                   |    |/ _/   __   \______ 
 User:    arn:aws:eks:eu-central-1:xxx:cluster/dev-mgmt      <l>       Logs                                                                                                   |      < \____    /  ___/ 
 K9s Rev: v0.28.2                                                     <p>       Logs Previous                                                                                          |    |  \   /    /\___ \  
 K8s Rev: v1.28.3-eks-4f4795d                                         <shift-f> PortForward                                                                                            |____|__ \ /____//____  > 
 CPU:     4%                                                          <s>       Shell                                                                                                          \/            \/  
 MEM:     44%                                                                                                                                                                                                    
β”Œβ”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€ Containers(monitoring/prometheus-prometheus-operator-kube-p-prometheus-0)[4] ─────────────────────────────────────────────────────────────────┐
β”‚ NAME↑                 PF IMAGE                                                           READY STATE      INIT   RESTARTS PROBES(L:R) CPU  MEM CPU/R:L   MEM/R:L %CPU/R %CPU/L %MEM/R %MEM/L PORTS            β”‚
β”‚ config-reloader       ●  quay.io/prometheus-operator/prometheus-config-reloader:v0.68.0  true  Running    false         0 off:off       1   22   11:11     22:44      9      9     98     49 reloader-web:808 β”‚
β”‚ init-config-reloader  ●  quay.io/prometheus-operator/prometheus-config-reloader:v0.68.0  true  Completed  true          0 off:off       0    0   50:50     25:50      0      0      0      0 reloader-web:808 β”‚
β”‚ prometheus            ●  quay.io/prometheus/prometheus:v2.47.1                           true  Running    false         0 on:on        51 1275    93:0 1402:2804     54    n/a     90     45 http-web:9090    β”‚
β”‚ thanos-sidecar        ●  quay.io/thanos/thanos:v0.31.0                                   true  Running    false         0 off:off       1   27    11:0     34:69      9    n/a     80     40 http:10902,grpc: β”‚
derailed commented 6 months ago

@runningman84 Thank you for the details! So looking at this pod, current CPU usage is 47m.
The containers' total CPU limit is 11m. Thus the pod's CPU usage is currently ~4.27x what is allocated.
What am I missing?

runningman84 commented 6 months ago

As you can see, most of the containers do not have a CPU limit at all. It looks like k9s just adds up all the declared limits and divides the whole pod's current usage by that sum.

It should just ignore the containers that do not have CPU limits set.

danielbarron42 commented 6 months ago

We are getting the same issue.

dyang108 commented 5 months ago

I'm seeing the same thing.

verfriemelt-dot-org commented 2 months ago

same :+1:

patsevanton commented 2 months ago

Same issue.