
k8s node-local-dns high slab memory consumption leading to OOM #621

Closed lordofire closed 9 months ago

lordofire commented 9 months ago

Hi all,

Recently we have been hitting OOM kills on node-local-dns, with the following kernel backtrace:

[Fri Jan 26 05:46:44 2024] node-cache invoked oom-killer: gfp_mask=0xc40(GFP_NOFS), order=0, oom_score_adj=-997
[Fri Jan 26 05:46:44 2024] CPU: 44 PID: 3939051 Comm: node-cache Tainted: P           OE     5.4.0-156-generic #173-Ubuntu
[Fri Jan 26 05:46:44 2024] Hardware name: Supermicro SYS-4029GP-TRT2/X11DPG-OT-CPU, BIOS 3.8b 01/17/2023
[Fri Jan 26 05:46:44 2024] Call Trace:
[Fri Jan 26 05:46:44 2024]  dump_stack+0x6d/0x8b
[Fri Jan 26 05:46:44 2024]  dump_header+0x4f/0x1eb
[Fri Jan 26 05:46:44 2024]  oom_kill_process.cold+0xb/0x10
[Fri Jan 26 05:46:44 2024]  out_of_memory+0x1cf/0x500
[Fri Jan 26 05:46:44 2024]  mem_cgroup_out_of_memory+0xbd/0xe0
[Fri Jan 26 05:46:44 2024]  try_charge+0x77c/0x810
[Fri Jan 26 05:46:44 2024]  mem_cgroup_try_charge+0x71/0x190
[Fri Jan 26 05:46:44 2024]  __add_to_page_cache_locked+0x2ff/0x3f0
[Fri Jan 26 05:46:44 2024]  ? bio_add_page+0x6a/0x90
[Fri Jan 26 05:46:44 2024]  ? scan_shadow_nodes+0x30/0x30
[Fri Jan 26 05:46:44 2024]  add_to_page_cache_lru+0x4d/0xd0
[Fri Jan 26 05:46:44 2024]  iomap_readpages_actor+0xf8/0x220
[Fri Jan 26 05:46:44 2024]  iomap_apply+0xd5/0x160
[Fri Jan 26 05:46:44 2024]  ? iomap_page_mkwrite_actor+0x80/0x80
[Fri Jan 26 05:46:44 2024]  iomap_readpages+0xa3/0x190
[Fri Jan 26 05:46:44 2024]  ? iomap_page_mkwrite_actor+0x80/0x80
[Fri Jan 26 05:46:44 2024]  xfs_vm_readpages+0x35/0x90 [xfs]
[Fri Jan 26 05:46:44 2024]  read_pages+0x71/0x1a0
[Fri Jan 26 05:46:44 2024]  __do_page_cache_readahead+0x180/0x1a0
[Fri Jan 26 05:46:44 2024]  filemap_fault+0x697/0xa50
[Fri Jan 26 05:46:44 2024]  ? xas_load+0xd/0x80
[Fri Jan 26 05:46:44 2024]  ? _cond_resched+0x19/0x30
[Fri Jan 26 05:46:44 2024]  ? down_read+0x13/0xa0
[Fri Jan 26 05:46:44 2024]  __xfs_filemap_fault+0x6c/0x200 [xfs]
[Fri Jan 26 05:46:44 2024]  xfs_filemap_fault+0x37/0x40 [xfs]
[Fri Jan 26 05:46:44 2024]  __do_fault+0x3c/0x170
[Fri Jan 26 05:46:44 2024]  do_fault+0x24b/0x640
[Fri Jan 26 05:46:44 2024]  __handle_mm_fault+0x4c5/0x7a0
[Fri Jan 26 05:46:44 2024]  handle_mm_fault+0xca/0x200
[Fri Jan 26 05:46:44 2024]  do_user_addr_fault+0x1f9/0x450
[Fri Jan 26 05:46:44 2024]  __do_page_fault+0x58/0x90
[Fri Jan 26 05:46:44 2024]  do_page_fault+0x2c/0xe0
[Fri Jan 26 05:46:44 2024]  page_fault+0x34/0x40
[Fri Jan 26 05:46:44 2024] RIP: 0033:0x438950
[Fri Jan 26 05:46:44 2024] Code: Bad RIP value.
[Fri Jan 26 05:46:44 2024] RSP: 002b:000000c000123f28 EFLAGS: 00010212
[Fri Jan 26 05:46:44 2024] RAX: 0000000000000000 RBX: 0000000000000000 RCX: 000000000045da3d
[Fri Jan 26 05:46:44 2024] RDX: 0000000000000000 RSI: 0000000000000000 RDI: 000000c000123f08
[Fri Jan 26 05:46:44 2024] RBP: 000000c000123f20 R08: 0000000000000000 R09: 0000000000000000
[Fri Jan 26 05:46:44 2024] R10: 00007fff43b2a090 R11: 0000000000000202 R12: 0000000000430e30
[Fri Jan 26 05:46:44 2024] R13: 0000000000000011 R14: 00000000019b4b78 R15: 0000000000000000
[Fri Jan 26 05:46:44 2024] memory: usage 399360kB, limit 399360kB, failcnt 1796473
[Fri Jan 26 05:46:44 2024] memory+swap: usage 0kB, limit 9007199254740988kB, failcnt 0
[Fri Jan 26 05:46:44 2024] kmem: usage 391164kB, limit 9007199254740988kB, failcnt 0
[Fri Jan 26 05:46:44 2024] Memory cgroup stats for /kubepods/podd9f57d24-67fd-4fdd-924b-780799ce4ba4:
[Fri Jan 26 05:46:44 2024] anon 0
                           file 13606912
                           kernel_stack 4276224
                           slab 325726208
                           sock 0
                           shmem 0
                           file_mapped 5947392
                           file_dirty 0
                           file_writeback 0
                           anon_thp 0
                           inactive_anon 0
                           active_anon 1826816
                           inactive_file 319488
                           active_file 6451200
                           unevictable 0
                           slab_reclaimable 23388160
                           slab_unreclaimable 302338048
                           pgfault 18332047
                           pgmajfault 344907
                           workingset_refault 6429196
                           workingset_activate 1211603
                           workingset_nodereclaim 33
                           pgrefill 3920017
                           pgscan 10044063
                           pgsteal 6473127
                           pgactivate 1536300
                           pgdeactivate 2788986
                           pglazyfree 0
                           pglazyfreed 0
                           thp_fault_alloc 5
                           thp_collapse_alloc 0
[Fri Jan 26 05:46:44 2024] Tasks state (memory values in pages):
[Fri Jan 26 05:46:44 2024] [  pid  ]   uid  tgid total_vm      rss pgtables_bytes swapents oom_score_adj name
[Fri Jan 26 05:46:44 2024] [ 105029] 65535 105029      241        1    28672        0          -998 pause
[Fri Jan 26 05:46:44 2024] [3939010]     0 3939010    34992     1783   147456        0          -997 node-cache
[Fri Jan 26 05:46:44 2024] [3939926]     0 3939926     3986       28    73728        0          -997 iptables
[Fri Jan 26 05:46:44 2024] oom-kill:constraint=CONSTRAINT_MEMCG,nodemask=(null),cpuset=5581bad107d4d1f42a8659b6620f8eafc5f8bd6861f1456c2db1f1bd3bf4fae7,mems_allowed=0-1,oom_memcg=/kubepods/podd9f57d24-67fd-4fdd-924b-780799ce4ba4,task_memcg=/kubepods/podd9f57d24-67fd-4fdd-924b-780799ce4ba4/5581bad107d4d1f42a8659b6620f8eafc5f8bd6861f1456c2db1f1bd3bf4fae7,task=node-cache,pid=3939010,uid=0
[Fri Jan 26 05:46:44 2024] Memory cgroup out of memory: Killed process 3939010 (node-cache) total-vm:139968kB, anon-rss:7232kB, file-rss:0kB, shmem-rss:0kB, UID:0 pgtables:144kB oom_score_adj:-997

In our node-local-dns deployment we set the memory limit to ~390Mi, and in this OOM most of the memory is consumed by slab (slab 325726208 bytes in the log above, nearly all of it slab_unreclaimable). Can anyone explain how this slab memory accumulates and how to estimate the memory limit properly?

Any insight would be helpful. We are using k8s-dns-node-cache:1.15.10 and mostly keep the default configuration (number of concurrent queries, etc.).
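
For reference, this is roughly how the per-pod cgroup accounting can be inspected on the node. A minimal sketch, assuming cgroup v1 with the cgroupfs driver; the memcg path is copied from the oom_memcg= field in the log above and would need adjusting for other pods or cgroup drivers:

```sh
# Inspect the memcg that was OOM-killed; the path is taken from the oom_memcg=
# field in the kernel log above (adjust for your own pod UID / cgroup driver).
POD_CG=/sys/fs/cgroup/memory/kubepods/podd9f57d24-67fd-4fdd-924b-780799ce4ba4

# Total charged memory vs. the limit the OOM killer enforced (~390Mi here).
cat "$POD_CG/memory.usage_in_bytes" "$POD_CG/memory.limit_in_bytes"

# Kernel-memory portion of that charge (slab, kernel stacks, ...).
cat "$POD_CG/memory.kmem.usage_in_bytes"

# Per-cgroup slab cache breakdown, if the running kernel exposes it.
cat "$POD_CG/memory.kmem.slabinfo" 2>/dev/null | head -n 25

# Node-wide view of which slab caches are growing (run as root).
slabtop -o | head -n 25
```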

lordofire commented 9 months ago

It seems the issue is caused by a kernel cgroup memory problem rather than by k8s node-local-dns itself, so I am closing the issue here.
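
For anyone hitting the same symptom, a minimal sketch (an assumption on my side, not something verified in this thread) of how to check whether cgroup v1 kernel memory accounting, which is what charges slab to the pod's memcg (see the kmem: usage line in the log), has been disabled at boot via the cgroup.memory=nokmem kernel parameter:

```sh
# If cgroup.memory=nokmem is absent from the kernel command line, kernel memory
# accounting is active and slab allocations count against each pod's memory limit,
# as the "kmem: usage" line in the OOM report above shows.
grep -o 'cgroup\.memory=[^ ]*' /proc/cmdline \
  || echo "no cgroup.memory= parameter: kmem accounting is enabled"

# The exact kernel version matters for the known memcg slab-accounting problems.
uname -r
```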