Evaluate minimum node size

alex-dabija commented 2 years ago

Story

- As a cluster admin, I want the smallest node size to have enough resources to run a few workloads beyond those deployed by default on a new cluster, so that I have a better user experience and don't have to wonder why pods are not running.

Background

Recent AWS releases have more default applications running on a new cluster (e.g. vertical-pod-autoscaler, kiam-watchdog, etc.). Unfortunately, we haven't evaluated whether the smallest node size still makes sense.

This issue became visible during recent incidents where some nodes didn't have enough resources to run the kiam-agent alongside just a few customer workloads.

Tasks

[ ] Evaluate the resource usage of the default applications. Some of them may request more resources than they actually use.
[ ] Evaluate if the smallest node size still makes sense.

paurosello commented 2 years ago

Empty cluster:

                            CPU         Memory
  aws-node:                 30m         60MB
  calico-node:              250m        150MB
  cert-exporter:            50m         50MB
  csi-volume:               10m         50MB
  kiam:                     15m         100MB
  kiam-watchdog:            200m        200MB
  kubeproxy:                35m         100MB
  net-exporter:             50m         75MB
  node-exporter:            50m         75MB
  ---------------------------------------------
  Total:                    690m        860MB

Some apps seem to consume much less than their reserved resources, especially kube-state-metrics.

paurosello commented 2 years ago

In deu01:

                            CPU         Memory
  aws-node:                 30m         150MB
  calico-node:              50m         450MB
  cert-exporter:            50m         50MB
  csi-volume:               10m         100MB
  kiam:                     15m         100MB
  kiam-watchdog:            200m        200MB
  kubeproxy:                1000m       200MB
  net-exporter:             50m         75MB
  node-exporter:            50m         75MB
  ---------------------------------------------
  Total:                    1455m       1400MB
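
For anyone who wants to re-check the totals in the two listings above or see where they diverge, here is a minimal Python sketch. The figures are copied from the listings as (CPU millicores, memory MB); the dictionary and function names are made up for illustration, and the watchdog key uses the spelling from the Background section.

```python
# Per-node figures copied from the two listings: (CPU millicores, memory MB).
EMPTY_CLUSTER = {
    "aws-node": (30, 60),
    "calico-node": (250, 150),
    "cert-exporter": (50, 50),
    "csi-volume": (10, 50),
    "kiam": (15, 100),
    "kiam-watchdog": (200, 200),
    "kubeproxy": (35, 100),
    "net-exporter": (50, 75),
    "node-exporter": (50, 75),
}

DEU01 = {
    "aws-node": (30, 150),
    "calico-node": (50, 450),
    "cert-exporter": (50, 50),
    "csi-volume": (10, 100),
    "kiam": (15, 100),
    "kiam-watchdog": (200, 200),
    "kubeproxy": (1000, 200),
    "net-exporter": (50, 75),
    "node-exporter": (50, 75),
}


def total(figures):
    """Return (total CPU millicores, total memory MB) for one listing."""
    cpu = sum(cpu_m for cpu_m, _ in figures.values())
    mem = sum(mem_mb for _, mem_mb in figures.values())
    return cpu, mem


def differences(base, other):
    """Return {app: (cpu delta, memory delta)} for apps whose figures differ."""
    return {
        app: (other[app][0] - base[app][0], other[app][1] - base[app][1])
        for app in base
        if other[app] != base[app]
    }


if __name__ == "__main__":
    print("empty cluster:", total(EMPTY_CLUSTER))
    print("deu01:", total(DEU01))
    for app, (d_cpu, d_mem) in differences(EMPTY_CLUSTER, DEU01).items():
        print(f"{app}: {d_cpu:+d}m CPU, {d_mem:+d}MB memory")
```

Running it reproduces both totals and shows that most of the CPU difference in deu01 comes from kubeproxy.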

paurosello commented 2 years ago

4 CPU cores, 16 GB RAM is the recommended size, which seems to be fine, although I think 2xlarge could improve resource usage.

Reserved resources in xlarge: 17.25% CPU, 5.3% memory
Reserved resources in 2xlarge: 8.62% CPU, 2.68% memory
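
The percentages above appear to follow from the empty-cluster totals (690m CPU, 860 MB memory) divided by the node capacity. A rough Python sketch of that arithmetic is below. It assumes 1 GB = 1000 MB and that 2xlarge has twice the xlarge capacity (8 cores, 32 GB), which is not stated in this thread; the function name is hypothetical.

```python
def reserved_share(reserved_cpu_m, reserved_mem_mb, cores, ram_gb):
    """Return (CPU %, memory %) of a node taken up by the default apps.

    Assumes 1 core = 1000 millicores and 1 GB = 1000 MB.
    """
    cpu_pct = 100 * reserved_cpu_m / (cores * 1000)
    mem_pct = 100 * reserved_mem_mb / (ram_gb * 1000)
    return cpu_pct, mem_pct


# Empty-cluster totals from the first listing in this thread.
RESERVED_CPU_M = 690
RESERVED_MEM_MB = 860

# xlarge capacity is from this thread; 2xlarge capacity is an assumption.
NODE_SIZES = {
    "xlarge": (4, 16),
    "2xlarge": (8, 32),
}

if __name__ == "__main__":
    for name, (cores, ram_gb) in NODE_SIZES.items():
        cpu_pct, mem_pct = reserved_share(
            RESERVED_CPU_M, RESERVED_MEM_MB, cores, ram_gb
        )
        print(f"{name}: {cpu_pct:.2f}% CPU, {mem_pct:.2f}% memory")
```

The same function can be fed the deu01 totals (1455m, 1400 MB) to see how much of each node size the default apps take up on a busier cluster.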

giantswarm / roadmap