
[Release-1.28] - K3s etcd snapshot reconcile consumes excessive memory when a large number of snapshots are present #10561

Closed: brandond closed this 2 months ago

brandond commented 3 months ago

Backport fix for

aganesh-suse commented 2 months ago

Validated on release-1.28 branch with commit 2701d8fca45cf675b481e927827dd1dceb51b01c

Environment Details

Infrastructure

Node(s) CPU architecture, OS, and Version:

$ cat /etc/os-release
PRETTY_NAME="Ubuntu 22.04.2 LTS"

$ uname -m
x86_64

Setup Size: 4 GB memory, 2 vCPU, 30 GB disk.

Cluster Configuration:

HA: 3 servers / 1 agent

Config.yaml:

token: xxxx
cluster-init: true
write-kubeconfig-mode: "0644"
node-external-ip: 1.1.1.1
node-label:
- k3s-upgrade=server

etcd-snapshot-retention: 255
etcd-snapshot-schedule-cron: "* * * * *"
etcd-s3: true
etcd-s3-access-key: <access_key>
etcd-s3-secret-key: <secret_key>
etcd-s3-bucket: <bucket>
etcd-s3-folder: <folder>
etcd-s3-region: <region>
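With this config, a snapshot is taken every minute and up to 255 are retained, both locally and in S3. Before starting the memory test, a quick sanity check is to confirm snapshots are actually being created; a minimal sketch, using the `k3s etcd-snapshot ls` subcommand and the default on-disk snapshot location for k3s servers:

```bash
# List snapshots tracked by k3s (local and, per the config above, S3).
sudo k3s etcd-snapshot ls

# Count the snapshot files in the default on-disk location.
sudo ls /var/lib/rancher/k3s/server/db/snapshots | wc -l
```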

Testing Steps

  1. Copy config.yaml
    $ sudo mkdir -p /etc/rancher/k3s && sudo cp config.yaml /etc/rancher/k3s
  2. Install k3s
    curl -sfL https://get.k3s.io | sudo INSTALL_K3S_COMMIT='2701d8fca45cf675b481e927827dd1dceb51b01c' sh -s - server
  3. Verify Cluster Status:
    kubectl get nodes -o wide
    kubectl get pods -A
  4. Apply the k3s etcd snapshot extra metadata:
    kubectl apply -f https://gist.githubusercontent.com/aganesh-suse/52c3d6c3d7fe70141fa3a49431ac0032/raw/20039a159ab0f5fce1930f5ec12f6afc2b034784/k3s-etcd-snapshot-extra-metadata.yaml
  5. Monitor the memory usage of k3s.service while taking snapshots (scheduled every minute by the cron above, plus the on-demand saves below) up to 255 snapshots; a consolidated sketch of this step appears after the plotting helper below.
    ON_DEMAND_SNAPSHOT_COUNT=255  # capped at 255; see the P.S. below
    for (( I=0; I < "${ON_DEMAND_SNAPSHOT_COUNT}"; I++ ))
    do
        sudo k3s etcd-snapshot save
        sleep 5
    done

    write_mem_usage_k3s_to_file () {
        while true; do
            top -b -n 1 | grep k3s | tee -a top-output.log
            sleep 1
        done
    }

ttyplot_k3s_memory () {
    K3S_PID=$(ps aux | grep 'k3s' | head -n 1 | awk '{print $2}')
    while :; do
        grep -oP '^VmRSS:\s+\K\d+' /proc/$K3S_PID/status \
            | numfmt --from-unit Ki --to-unit Mi
        sleep 1
    done | ttyplot -u Mi
}
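For a reproducible run, the snapshot loop and the VmRSS sampling above can be combined into one script that logs memory per snapshot to a CSV. A minimal sketch; the output file name, the snapshot count, and the pgrep-based PID lookup are illustrative choices, not part of the original harness:

```bash
#!/usr/bin/env bash
# Take on-demand etcd snapshots while logging the k3s server's resident
# memory (VmRSS) to a CSV, one row per snapshot.
set -euo pipefail

COUNT=255                  # snapshot cap, matching the test above
OUT=mem-usage.csv          # hypothetical output file
K3S_PID=$(pgrep -o k3s)    # oldest matching process, assumed to be the server

echo "timestamp,snapshot,rss_mib" > "$OUT"
for (( i = 1; i <= COUNT; i++ )); do
    sudo k3s etcd-snapshot save > /dev/null
    rss_kib=$(grep -oP '^VmRSS:\s+\K\d+' "/proc/$K3S_PID/status")
    echo "$(date +%s),$i,$(( rss_kib / 1024 ))" >> "$OUT"
    sleep 5
done
```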

P.S.: We run out of disk space before running out of memory at around 280 snapshots, so the snapshot count is capped at 255 for testing purposes.
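Since disk, not memory, is the binding constraint here, it can help to watch free space in the snapshot directory while the test runs; a small sketch, assuming the default k3s server snapshot path:

```bash
# Watch free disk space and the on-disk snapshot count every 10 seconds.
SNAP_DIR=/var/lib/rancher/k3s/server/db/snapshots
sudo watch -n 10 "df -h $SNAP_DIR && ls $SNAP_DIR | wc -l"
```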

**Replication Results:**
- k3s version used for replication:

$ k3s -v
k3s version v1.28.12+k3s1 (4717e2a5)
go version go1.22.5


**Validation Results:**
- k3s version used for validation:

$ k3s -v
k3s version v1.28.12+k3s-2701d8fc (2701d8fc)
go version go1.22.5


**Memory Usage Comparison Results**

Comparison of memory usage between the released version (v1.28.12) and the latest commit (release-1.28 commit 2701d8fc): %MEM, maximum memory (MB), and average memory (MB) at snapshot counts of 100, 120, 130, 150, 170, 200, 230, 250, and 255.
| Snapshots | v1.28.12 %MEM | v1.28.12 Max (MB) | v1.28.12 Avg (MB) | 2701d8fc %MEM | 2701d8fc Max (MB) | 2701d8fc Avg (MB) |
|-----------|---------------|-------------------|-------------------|---------------|-------------------|-------------------|
| 100       | 47            | 2019              | 1870              | 41            | 1863              | 1730              |
| 120       | 53            | 2175              | 1994              | 50            | 1965              | 1843              |
| 130       | 56            | 2245              | 2157              | 53            | 2077              | 1969              |
| 150       | 61            | 2477              | 2347              | 55            | 2190              | 2113              |
| 170       | 68            | 2766              | 2573              | 59            | 2413              | 2302              |
| 200       | 75            | 3046              | 2841              | 66            | 2628              | 2564              |
| 230       | -             | -                 | -                 | 67            | 2638              | 2384              |
| 250       | -             | -                 | -                 | 70            | 2879              | 2538              |
| 255       | -             | -                 | -                 | 73            | 2879              | 2679              |

The k3s service restarted by itself at around 220 snapshots on the old version, so comparison numbers beyond that point would be skewed low for the old version; only latest-commit results are reported for 230 snapshots and above. The range of those results should be similar to the results on other branches.



**Observations**

Memory-wise, up to about 120 snapshots, the new commit uses `2% less` memory than the older version.
On average, the difference grows to `~5% up to 200 snapshots`, then to `~10+%` less memory usage for `more than 200 snapshots`.
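As a cross-check, the relative reduction in average memory can be computed directly from the Avg (MB) columns of the table above; for example, at 200 snapshots:

```bash
# Relative reduction in average memory at 200 snapshots (values from the table).
awk 'BEGIN { printf "%.1f%% less\n", 100 * (2841 - 2564) / 2841 }'   # prints 9.8% less
```

which is in line with the ~10% figure reported for the higher snapshot counts.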