Open purha opened 1 week ago
When there are many snapshots in S3, k3s will eventually consume all available memory, and then the OOM killer kicks in.
Is there actually a cumulative memory leak, or is the memory required to manage the snapshots directly proportional to the number of snapshots found on disk and in S3?
If there is a cumulative memory leak, this should show up as increasing memory usage over time despite a static number of etcd snapshots.
It seems cumulative; the number of snapshots affects how long it takes for memory to run out. After disabling S3 snapshots the issue is gone and memory usage is stable.
What are the units on your graph? Can you show the actual memory utilization of the k3s process in bytes? How many snapshots did you have in the cluster when you saw the memory utilization growing?
I'm trying to reproduce this by profiling k3s with S3 enabled, retention set to 120, and one snapshot taken per minute, but I'm not sure I'm seeing exactly the same thing as you.
I am also curious if you've tried adding a memory limit to the k3s systemd unit. By default the k3s systemd unit does not have a memory limit on it, and without any external memory pressure, golang will not free memory back to the operating system. So you could just be seeing secondary effects of k3s requiring more memory to reconcile a large number of snapshots, and golang not freeing memory until it absolutely needs to.
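One way to test that theory would be a systemd drop-in that caps the unit's memory; the path and the 2G value below are purely illustrative, not a recommendation for your cluster. Under external memory pressure the Go runtime returns idle memory to the OS much more eagerly:

```
# /etc/systemd/system/k3s.service.d/memory.conf  (hypothetical drop-in)
[Service]
MemoryMax=2G
```

Then `systemctl daemon-reload && systemctl restart k3s`. Setting the `GOMEMLIMIT` environment variable on the unit is another way to nudge the Go garbage collector, if you'd rather not hard-cap the process.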
Just to share what I'm seeing: I do see k3s allocating a lot of memory while reconciling snapshots, but this memory is freed at the end of each snapshot save cycle. Note that the memory is allocated but no longer in use, which means that it is available to be freed or reused. This is NOT a leak, but I can try to see if there is some potential for enhancement here to avoid the momentary spike in memory during reconcile.
(pprof profiles attached: alloc_space, inuse_space)
Just from glancing at this, I suspect that just adding some pagination to the various list operations would take the memory utilization down a lot. The current code pulls a full list into memory on every pass, which will be expensive with hundreds of snapshots.
The profiling also makes it clear that this is NOT a leak, and is not related to minio. So I am going to edit the issue title to better reflect the root of the problem.
I don't have the data anymore, but I think there were around 300 snapshots or more, from 60 days or so, plus a few on-demand snapshots. I deleted all but the last 14 days, and you can see from the graph that it helped slightly. You can also see where I disabled snapshots altogether. I've also attached the heap profile I took, though by that point k3s was already consuming gigabytes of memory. I didn't try setting memory limits for the service.
The usage is in percent, and that's a node with 8 GB of memory.
Environmental Info:
K3s Version: v1.28.10+k3s1 (a4c5612e), go version go1.21.9, but it also affects some earlier versions.
Node(s) CPU architecture, OS, and Version: Linux kubernetes-worker-f-1 6.1.0-22-arm64 #1 SMP Debian 6.1.94-1 (2024-06-21) aarch64 GNU/Linux
The issue also exists at least on Ubuntu 22.04 arm64.
Cluster Configuration: 3 servers, all roles on all of them
Describe the bug:
When there are many snapshots in S3, k3s will eventually consume all available memory, and then the OOM killer kicks in. Normally snapshots get cleaned up, but because of bug https://github.com/k3s-io/k3s/issues/10292 they don't. This only affects a single node in the cluster.
Steps To Reproduce:
Expected behavior:
Memory should not run out.
Actual behavior:
Memory runs out on the node, causing the OOM killer to start killing processes. Restarting the k3s service fixes the issue for some time, until memory runs out again.
Additional context / logs:
Memory consumption example graph attached.