google / buzzer

Apache License 2.0

OOM-killer when RAM is set to 20G or 40G #62

Open Clingto opened 1 month ago

Clingto commented 1 month ago

Hi, I ran buzzer with 20G of RAM and hit the OOM killer, as described in https://github.com/google/buzzer/issues/58. After I increased the RAM to 40G, the problem still occurs (see the log below). I suspect there is a memory leak in buzzer. Could you help me with this?

In addition, I used -strategy=pointer_arithmetic, which is marked as deprecated, and did not find any bugs. Which -strategy is more likely to find bugs?

from 262 to 264: R0=map_value(ks=4,vs=8) R6=scalar() R7=0xffffffff80000000 R8=0x372b167a R9=map_ptr(ks=4,vs=8) R10=fp0 fp-8=mmmm????
264: R0=map_value(ks=4,vs=8) R6=scalar() R7=0xffffffff80000000 R8=0x372b167a R9=map_ptr(ks=4,vs=8) R10=fp0 fp-8=mmmm????
264: (0f) r0 += r8
mark_precise: frame0: last_idx 264 first_idx 262 subseq_idx -1 
mark_precise: frame0: regs=r8 stack= before 262: (55) if r0 != 0x0 goto pc+1
mark_precise: frame0: parent state regs=r8 stack=:  R0_rw=map_value_or_null(id=2,ks=4,vs=8) R6=scalar() R7=0xffffffff80000000 R8_r=P0x372b167a R9=map_ptr(ks=4,vs=8) R10=fp0 f?
mark_precise: frame0: last_idx 261 first_idx 254 subseq_idx 262 
mark_precise: frame0: regs=r8 stack= before 261: (85) call bpf_map_lookup_elem#1
mark_precise: frame0: regs=r8 stack= before 260: (bf) r1 = r9
mark_precise: frame0: regs=r8 stack= before 259: (07) r2 += -4
mark_precise: frame0: regs=r8 stack= before 258: (bf) r2 = r10
mark_precise: frame0: regs=r8 stack= before 257: (62) *(u32 *)(r10 -4) = 1
mark_precise: frame0: regs=r8 stack= before 256: (7a) *(u64 *)(r0 +0) = 51966
mark_precise: frame0: regs=r8 stack= before 254: (55) if r0 != 0x0 goto pc+1
mark_precise: frame0: parent state regs=r8 stack=:  R0_rw=map_value_or_null(id=1,ks=4,vs=8) R6_w=scalar() R7=0xffffffff80000000 R8_rw=P0x372b167a R9_rw=map_ptr(ks=4,vs=8) R10?
mark_precise: frame0: last_idx 253 first_idx 237 subseq_idx 254 
mark_precise: frame0: regs=r8 stack= before 253: (85) call bpf_map_lookup_elem#1
mark_precise: frame0: regs=r8 stack= before 252: (bf) r1 = r9
mark_precise: frame0: regs=r8 stack= before 251: (07) r2 += -4
mark_precise: frame0: regs=r8 stack= before 250: (bf) r2 = r10
mark_precise: frame0: regs=r8 stack= before 249: (62) *(u32 *)(r10 -4) = 0
mark_precise: frame0: regs=r8 stack= before 247: (18) r9 = 0xffff888119f4fc00
mark_precise: frame0: regs=r8 stack= before 246: (bf) r8 = r1
mark_precise: frame0: regs=r1 stack= before 245: (37) r6 /= -1198429702
mark_precise: frame0: regs=r1 stack= before 244: (87) r2 = -r2
mark_precise: frame0: regs=r1 stack= before 243: (4c) w6 |= w1
mark_precise: frame0: regs=r1 stack= before 242: (47) r9 |= -1347492013
mark_precise: frame0: regs=r1 stack= before 241: (54) w5 &= 1831304382
mark_precise: frame0: regs=r1 stack= before 240: (af) r9 ^= r0
mark_precise: frame0: regs=r1 stack= before 239: (1d) if r7 == r1 goto pc+5
mark_precise: frame0: regs=r1,r7 stack= before 238: (b7) r3 = -73889241
mark_precise: frame0: regs=r1,r7 stack= before 237: (1c) w9 -= w3
mark_precise: frame0: parent state regs= stack=:  R0_r=0xffffffffc4ff83eb R1_rw=P0x372b167a R2_r=1 R3_r=0xffffffff9f1e1499 R4_w=scalar(smin=0,smax=umax=0xffffffff,var_off=(0x0
math between map_value pointer and 925570682 is not allowed
processed 47 insns (limit 1000000) max_states_per_insn 0 total_states 4 peak_states 4 mark_read 2

[155991.590243] buzzer invoked oom-killer: gfp_mask=0x2dc2(GFP_KERNEL|__GFP_HIGHMEM|__GFP_NOWARN|__GFP_ZERO), order=0, oom_score_adj=0
[155991.590687] CPU: 0 PID: 219 Comm: buzzer Not tainted 6.10.0-rc1-00027-g4a4be1ad3a6e #2
[155991.590924] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.15.0-1 04/01/2014
[155991.591172] Call Trace:
[155991.591260]  <TASK>
[155991.591331]  dump_stack_lvl+0xab/0xe0
[155991.591455]  dump_header+0x102/0xc20
[155991.591573]  ? _raw_spin_lock+0x80/0xe0
[155991.591695]  ? ___ratelimit+0x99/0x460
[155991.591817]  oom_kill_process+0x846/0xe20
[155991.591946]  ? task_will_free_mem+0xac/0x650
[155991.592082]  ? oom_badness+0x522/0x670
[155991.592204]  out_of_memory+0x2cb/0x1af0
[155991.592328]  ? __pfx_out_of_memory+0x10/0x10
[155991.592464]  ? __pfx_mutex_trylock+0x10/0x10
[155991.592600]  ? zone_reclaimable_pages+0x74f/0x8e0
[155991.592749]  __alloc_pages_noprof+0x1884/0x1ec0
[155991.592894]  ? update_load_avg+0x124/0x1fd0
[155991.593028]  ? sysvec_call_function_single+0x18/0xc0
[155991.593182]  ? __pfx___alloc_pages_noprof+0x10/0x10
[155991.593335]  ? sysvec_call_function_single+0x18/0xc0
[155991.593489]  ? sysvec_call_function_single+0x18/0xc0
[155991.593643]  ? sysvec_apic_timer_interrupt+0xf/0x80
[155991.593795]  ? __sanitizer_cov_trace_switch+0x54/0x90
[155991.593954]  ? policy_nodemask+0xeb/0x4b0
[155991.594083]  alloc_pages_mpol_noprof+0xf2/0x330
[155991.594228]  ? __pfx_alloc_pages_mpol_noprof+0x10/0x10
[155991.594389]  ? alloc_pages_noprof+0x139/0x150
[155991.594529]  ? __sanitizer_cov_trace_pc+0x17/0x60
[155991.594678]  __vmalloc_node_range_noprof+0xa87/0x1310
[155991.594838]  ? kcov_ioctl+0x4f/0x6a0
[155991.594955]  ? __pfx___vmalloc_node_range_noprof+0x10/0x10
[155991.595125]  ? __pfx_do_sys_openat2+0x10/0x10
[155991.595267]  ? kcov_ioctl+0x4f/0x6a0
[155991.595383]  vmalloc_user_noprof+0x9e/0xe0
[155991.595514]  ? kcov_ioctl+0x4f/0x6a0
[155991.595629]  kcov_ioctl+0x4f/0x6a0
[155991.595741]  ? __pfx_kcov_ioctl+0x10/0x10
[155991.595871]  __x64_sys_ioctl+0x1a1/0x210
[155991.596003]  do_syscall_64+0xa6/0x1a0
[155991.596121]  entry_SYSCALL_64_after_hwframe+0x77/0x7f
[155991.596279] RIP: 0033:0xaddb7f
[155991.596399] Code: Unable to access opcode bytes at 0xaddb55.
[155991.596570] RSP: 002b:00007fadae1cdde0 EFLAGS: 00000246 ORIG_RAX: 0000000000000010
[155991.596796] RAX: ffffffffffffffda RBX: 000000c000052d08 RCX: 0000000000addb7f
[155991.597009] RDX: 0000000004000000 RSI: 0000000080086301 RDI: 0000000000000007
[155991.597222] RBP: 00007fadae1cde60 R08: 0000000000000000 R09: 0000000000000000
[155991.597434] R10: 0000000000000000 R11: 0000000000000246 R12: ffffffffffffffff
[155991.597647] R13: 0000000000000008 R14: 000000c0000061a0 R15: 0000000000000020
[155991.597860]  </TASK>
[155991.597941] Mem-Info:
[155991.598016] active_anon:27 inactive_anon:8402365 isolated_anon:0
[155991.598016]  active_file:0 inactive_file:2 isolated_file:0
[155991.598016]  unevictable:0 dirty:0 writeback:0
[155991.598016]  slab_reclaimable:9131 slab_unreclaimable:296105
[155991.598016]  mapped:6483 shmem:10296 pagetables:16815
[155991.598016]  sec_pagetables:0 bounce:0
[155991.598016]  kernel_misc_reclaimable:0
[155991.598016]  free:47811 free_pcp:29085 free_cma:0
[155991.599280] Node 0 active_anon:108kB inactive_anon:33609460kB active_file:0kB inactive_file:156kB unevictable:0kB isolated(anon):0kB isolated(file):0kB mapped:25932kB dirs
[155991.600386] Node 0 DMA free:14968kB boost:0kB min:8kB low:20kB high:32kB reserved_highatomic:0KB active_anon:0kB inactive_anon:0kB active_file:0kB inactive_file:0kB uneviB
[155991.601526] lowmem_reserve[]: 0 2891 34893 0
[155991.601809] Node 0 DMA32 free:129084kB boost:0kB min:1980kB low:4940kB high:7900kB reserved_highatomic:0KB active_anon:0kB inactive_anon:2780836kB active_file:56kB inactiB
[155991.603028] lowmem_reserve[]: 0 0 32001 0
[155991.603202] Node 0 Normal free:47192kB boost:61440kB min:83360kB low:116128kB high:148896kB reserved_highatomic:0KB active_anon:108kB inactive_anon:30828624kB active_fileB
[155991.604266] lowmem_reserve[]: 0 0 0 0
[155991.604403] Node 0 DMA: 2*4kB (M) 1*8kB (M) 1*16kB (M) 3*32kB (M) 0*64kB 4*128kB (M) 4*256kB (M) 4*512kB (M) 5*1024kB (UM) 3*2048kB (M) 0*4096kB = 14976kB
[155991.604962] Node 0 DMA32: 889*4kB (M) 279*8kB (UM) 154*16kB (UM) 121*32kB (M) 56*64kB (M) 72*128kB (M) 55*256kB (UM) 51*512kB (M) 40*1024kB (UM) 11*2048kB (UM) 0*4096kB =B
[155991.605531] Node 0 Normal: 824*4kB (ME) 1028*8kB (UME) 669*16kB (UME) 552*32kB (UME) 113*64kB (UM) 0*128kB 0*256kB 0*512kB 0*1024kB 0*2048kB 0*4096kB = 47120kB
[155991.606047] Node 0 hugepages_total=0 hugepages_free=0 hugepages_surp=0 hugepages_size=2048kB
[155991.606371] 10602 total pagecache pages
[155991.606501] 0 pages in swap cache
[155991.606664] Free swap  = 0kB
[155991.606759] Total swap = 0kB
[155991.606914] 10485630 pages RAM
[155991.607017] 0 pages HighMem/MovableOnly
[155991.607192] 1546389 pages reserved
[155991.607302] Tasks state (memory values in pages):
[155991.607506] [  pid  ]   uid  tgid total_vm      rss rss_anon rss_file rss_shmem pgtables_bytes swapents oom_score_adj name
[155991.607904] [     87]     0    87    21301    11108      192      131     10785   212992        0          -250 systemd-journal
[155991.608311] [    102]     0   102     8952     2570     2570        0         0   102400        0         -1000 systemd-udevd
[155991.608712] [    127]     0   127     1409      105       64       41         0    57344        0             0 cron
[155991.609098] [    146]     0   146    55233      476      340      136         0    77824        0             0 rsyslogd
[155991.609490] [    167]     0   167    24971      344      311       33         0    77824        0             0 dhclient
[155991.609885] [    193]     0   193     3338      274      224       50         0    65536        0         -1000 sshd
[155991.610267] [    194]     0   194      718       78       32       46         0    49152        0             0 agetty
[155991.610674] [    195]     0   195      718       39        0       39         0    45056        0             0 agetty
[155991.611090] [    196]     0   196      718       58       32       26         0    40960        0             0 agetty
[155991.611489] [    197]     0   197      718       64       32       32         0    45056        0             0 agetty
[155991.611890] [    198]     0   198      718       46        0       46         0    40960        0             0 agetty
[155991.612285] [    199]     0   199      718       33        0       33         0    53248        0             0 agetty
[155991.612683] [    200]     0   200     2085      143      128       15         0    57344        0             0 login
[155991.613075] [    207]     0   207     1513      147       96       51         0    57344        0             0 bash
[155991.613462] [    216]     0   216  8669850  8385063  8385006       57         0 67465216        0             0 buzzer
[155991.613861] oom-kill:constraint=CONSTRAINT_NONE,nodemask=(null),cpuset=/,mems_allowed=0,task=buzzer,pid=216,uid=0
[155991.614184] Out of memory: Killed process 216 (buzzer) total-vm:34679400kB, anon-rss:33540024kB, file-rss:228kB, shmem-rss:0kB, UID:0 pgtables:65884kB oom_score_adj:0
[155991.614868] buzzer (221) used greatest stack depth: 23896 bytes left
[155995.732880] oom_reaper: reaped process 216 (buzzer), now anon-rss:16kB, file-rss:228kB, shmem-rss:0kB
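A note for readers of the trace above: the allocation that tripped the OOM killer came from `kcov_ioctl` via `vmalloc_user`. Under the x86-64 syscall convention, RSI holds the ioctl request and RDX its argument. The sketch below decodes those two register values. It assumes the generic Linux ioctl bit layout, and that the request is `KCOV_INIT_TRACE` (documented in the kernel's kcov docs as `_IOR('c', 1, unsigned long)`), whose argument is a buffer size counted in 8-byte words. The helper name `decode_ioctl` is made up for illustration. This only sizes a single coverage-buffer request; it does not show where the memory is being leaked.

```python
# Decode an ioctl request number using the generic Linux layout:
# bits 0-7 = nr, 8-15 = type, 16-29 = size, 30-31 = direction.
DIRECTIONS = {0: "none", 1: "write", 2: "read", 3: "read/write"}

def decode_ioctl(request: int) -> dict:
    """Split an ioctl request into its four fields (hypothetical helper)."""
    return {
        "nr": request & 0xFF,
        "type": chr((request >> 8) & 0xFF),
        "size": (request >> 16) & 0x3FFF,
        "direction": DIRECTIONS[(request >> 30) & 0x3],
    }

req = decode_ioctl(0x80086301)  # RSI from the register dump
entries = 0x4000000             # RDX from the register dump

# Assumption: the argument counts unsigned long words, 8 bytes each on x86-64.
WORD_BYTES = 8
buffer_mib = entries * WORD_BYTES // (1024 * 1024)

print(req)
print(f"requested coverage buffer: {buffer_mib} MiB")
```

If that reading is right, each such call asks the kernel for a 512 MiB buffer, so it is worth checking how often buzzer opens and initializes kcov descriptors and whether they are released.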

thatjiaozi commented 1 month ago

That is indeed a lot of memory usage. There must be a leak somewhere. Thanks for flagging it, I will dig into it.
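To put a number on the usage, the figures in the OOM report can be converted from pages to GiB. The sketch below is a rough check rather than a diagnostic tool: it assumes 4 KiB pages (the x86-64 default), copies the buzzer row and the `pages RAM` / `pages reserved` counts from the log above, and uses the hypothetical helper names `parse_task_row` and `pages_to_gib`.

```python
PAGE_KIB = 4  # assumption: 4 KiB pages

COLUMNS = ["pid", "uid", "tgid", "total_vm", "rss", "rss_anon", "rss_file",
           "rss_shmem", "pgtables_bytes", "swapents", "oom_score_adj"]

def parse_task_row(row: str) -> dict:
    """Parse one row of the 'Tasks state' table from an OOM report."""
    fields = row.replace("[", " ").replace("]", " ").split()
    parsed = {name: int(value) for name, value in zip(COLUMNS, fields)}
    parsed["name"] = fields[-1]
    return parsed

def pages_to_gib(pages: int) -> float:
    """Convert a page count to GiB under the PAGE_KIB assumption."""
    return pages * PAGE_KIB / (1024 * 1024)

# Values copied from the log above.
row = ("[    216]     0   216  8669850  8385063  8385006       57"
       "         0 67465216        0             0 buzzer")
total_pages = 10485630
reserved_pages = 1546389

task = parse_task_row(row)
anon_gib = pages_to_gib(task["rss_anon"])
usable_pages = total_pages - reserved_pages
share = task["rss_anon"] / usable_pages

print(f"{task['name']}: anon RSS {anon_gib:.2f} GiB, "
      f"{share:.1%} of non-reserved RAM")
```

The page counts agree with the kernel's own summary line: 8385006 pages times 4 KiB is the reported anon-rss of 33540024kB, so nearly all of the 40G guest's usable memory was anonymous memory held by the buzzer process.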

As for finding bugs: right now we only have two strategies, pointer arithmetic and coverage guided. However, buzzer in its current state is unlikely to find bugs, since a lot of mitigations have been put in place in the verifier. You might need to implement new features, modify the strategies, or write your own strategy to find something.

The idea with buzzer was not necessarily to have something that finds vulns out of the box, but rather to give you the tools and examples to play with eBPF and write fuzzing strategies/test cases that you think can catch certain types of bugs.