garloff opened this issue 1 month ago (status: Open)
13683 kubeadmconfigtemplates:
cluster2 capi-openstack-alpha-1-28 93d
cluster4 capi-openstack-alpha-1-28 93d
cluster4 cs-cluster4-capi-openstack-alpha-1-28-ljnkh 93d
cluster4 cs-cluster4a-capi-openstack-alpha-1-28-222ck 55d
cluster4 cs-cluster4a-capi-openstack-alpha-1-28-224lk 14d
cluster4 cs-cluster4a-capi-openstack-alpha-1-28-225pj 68d
[...]
15646 openstackmachinetemplates:
cluster2 capi-openstack-alpha-1-28 94d
cluster2 capi-openstack-alpha-1-28-control-plane 94d
cluster4 capi-openstack-alpha-1-28 93d
cluster4 capi-openstack-alpha-1-28-control-plane 93d
cluster4 cs-cluster4-capi-openstack-alpha-1-28-mmjrw 93d
cluster4 cs-cluster4-xlh9r 93d
cluster4 cs-cluster4a-capi-openstack-alpha-1-28-226gt 87d
cluster4 cs-cluster4a-capi-openstack-alpha-1-28-226qq 76m
cluster4 cs-cluster4a-capi-openstack-alpha-1-28-2275d 92d
cluster4 cs-cluster4a-capi-openstack-alpha-1-28-228jx 88d
cluster4 cs-cluster4a-capi-openstack-alpha-1-28-229r2 89d
cluster4 cs-cluster4a-capi-openstack-alpha-1-28-229vc 79d
cluster4 cs-cluster4a-capi-openstack-alpha-1-28-22b2t 30d
cluster4 cs-cluster4a-capi-openstack-alpha-1-28-22bm4 73d
cluster4 cs-cluster4a-capi-openstack-alpha-1-28-22cl4 84d
[...]
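For reference, counts like the ones above can be obtained along these lines (a minimal sketch; the resource names are taken from the listings):
# Count the template objects across all namespaces
kubectl get kubeadmconfigtemplates -A --no-headers | wc -l
kubectl get openstackmachinetemplates -A --no-headers | wc -l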
kubectl delete -n cluster4 kubeadmconfigtemplate <LIST OF 13000 names>
takes more than an hour, but it seems to help memory usage. The same goes for the openstackmachinetemplates. I also compacted and defragmented etcd to recover.
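A possible way to speed up the cleanup (an untested sketch; the grep pattern is only illustrative, and any templates still referenced by the KubeadmControlPlane or MachineDeployments must not be deleted) is to batch the deletion with xargs:
# Delete leftover KubeadmConfigTemplates in batches of 200
kubectl get kubeadmconfigtemplates -n cluster4 -o name \
  | grep 'cs-cluster4a-' \
  | xargs -n 200 kubectl delete -n cluster4
# Afterwards, compact and defragment etcd (auth/endpoint flags omitted; requires jq)
rev=$(etcdctl endpoint status --write-out=json | jq -r '.[0].Status.header.revision')
etcdctl compact "$rev"
etcdctl defrag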
/kind bug
What steps did you take and what happened: A management cluster (kind) that had been running in an SCS-2V-4 VM for 3 months (mostly idle) became unusable. After some debugging, it turned out that the kube-apiserver's memory usage had exploded to > 2 GiB RSS. This caused the machine to aggressively reclaim memory (kswapd0), only to take major page faults that paged the memory right back in. System load was > 50 (on a 2-vCPU server), with well over 10k major page faults/s and > 500 MB/s read from disk.
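For the record, these symptoms can be observed with standard tools (illustrative commands run on the management host; the kube-apiserver is a process inside the kind container but shows up in the host's process table):
# Resident set size of the kube-apiserver
ps -C kube-apiserver -o pid,rss,cmd
# Swap/paging activity (si/so, bi/bo columns)
vmstat 1 5
# Major page faults per second (majflt/s column, needs sysstat)
sar -B 1 5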
What did you expect to happen: 4 GiB of RAM should be sufficient for a not-too-busy management host.
Anything else you would like to add: My assumption is that CSO/CSPO cause the kube-apiserver memory usage by storing too many objects. So far I have found kubeadmconfigtemplates and openstackmachinetemplates (see the listings above) to exist in excessive numbers.
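To see which resource kinds are piling up, a sweep over all namespaced resources can help (a hypothetical example, not part of the original report; note that it puts additional load on the already struggling apiserver):
# Count objects per namespaced resource and show the largest kinds
for r in $(kubectl api-resources --verbs=list --namespaced -o name); do
  echo "$(kubectl get "$r" -A --no-headers 2>/dev/null | wc -l) $r"
done | sort -n | tail -20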
Environment: