iovisor / bcc

BCC - Tools for BPF-based Linux IO analysis, networking, monitoring, and more
Apache License 2.0
20.28k stars 3.84k forks source link

broken ksnoop tool #4746

Open dubeyabhishek opened 12 months ago

dubeyabhishek commented 12 months ago

Not getting any output for ksnoop tool. Tried multiple functions in argument, no error/failed message is appearing. Example run below:

./ksnoop trace "vma_merge(mm)" -d

Debug: Parsing trace 'vma_merge(mm)' Debug: got func 'vma_merge', args 'mm' Debug: vma_merge = <ip c00000000046df10, mod vmlinux> Debug: getting BTF for vmlinux libbpf: loading kernel BTF '/sys/kernel/btf/vmlinux': 0 Debug: arg #1: <name 'vmi', type id '9646'> Debug: arg #2: <name 'mm', type id '415'> Debug: arg #3: <name 'prev', type id '496'> Debug: arg #4: <name 'addr', type id '1'> Debug: arg #5: <name 'end', type id '1'> Debug: return value: type id '496'> Debug: 'mm' arg, predcate '(null)' Debug: setting base arg for val mm to 1 Debug: custom trace with 1 args libbpf: loading object 'ksnoop_bpf' from buffer libbpf: elf: section(2) .text, size 5704, link 0, flags 6, type=1 libbpf: sec '.text': found program 'ksnoop' at insn offset 0 (0 bytes), code size 713 insns (5704 bytes) libbpf: elf: section(3) .rel.text, size 176, link 15, flags 40, type=9 libbpf: elf: section(4) kprobe/foo, size 32, link 0, flags 6, type=1 libbpf: sec 'kprobe/foo': found program 'kprobe_entry' at insn offset 0 (0 bytes), code size 4 insns (32 bytes) libbpf: elf: section(5) .relkprobe/foo, size 16, link 15, flags 40, type=9 libbpf: elf: section(6) kretprobe/foo, size 32, link 0, flags 6, type=1 libbpf: sec 'kretprobe/foo': found program 'kprobe_return' at insn offset 0 (0 bytes), code size 4 insns (32 bytes) libbpf: elf: section(7) .relkretprobe/foo, size 16, link 15, flags 40, type=9 libbpf: elf: section(8) license, size 13, link 0, flags 3, type=1 libbpf: license of ksnoop_bpf is Dual BSD/GPL libbpf: elf: section(9) .maps, size 88, link 0, flags 3, type=1 libbpf: elf: section(10) .BTF, size 8893, link 0, flags 0, type=1 libbpf: elf: section(12) .BTF.ext, size 4572, link 0, flags 0, type=1 libbpf: elf: section(15) .symtab, size 1920, link 1, flags 0, type=2 libbpf: looking for externs among 80 symbols... libbpf: collected 0 externs total libbpf: map 'ksnoop_func_stack': at sec_idx 9, offset 0. libbpf: map 'ksnoop_func_stack': found type = 1. libbpf: map 'ksnoop_func_stack': found key [8], sz = 8. libbpf: map 'ksnoop_func_stack': found value [11], sz = 144. libbpf: map 'ksnoop_func_stack': found max_entries = 2048. libbpf: map 'ksnoop_func_map': at sec_idx 9, offset 32. libbpf: map 'ksnoop_func_map': found type = 5. libbpf: map 'ksnoop_func_map': found key [8], sz = 8. libbpf: map 'ksnoop_func_map': found value [22], sz = 16296. libbpf: map 'ksnoop_func_map': found max_entries = 64. libbpf: map 'ksnoop_perf_map': at sec_idx 9, offset 64. libbpf: map 'ksnoop_perf_map': found type = 4. libbpf: map 'ksnoop_perf_map': found key_size = 4. libbpf: map 'ksnoop_perf_map': found value_size = 4. libbpf: sec '.rel.text': collecting relocation for section(2) '.text' libbpf: sec '.rel.text': relo #0: insn #6 against 'ksnoop_func_stack' libbpf: prog 'ksnoop': found map 0 (ksnoop_func_stack, sec 9, off 0) for insn #6 libbpf: sec '.rel.text': relo #1: insn #35 against 'ksnoop_func_stack' libbpf: prog 'ksnoop': found map 0 (ksnoop_func_stack, sec 9, off 0) for insn #35 libbpf: sec '.rel.text': relo #2: insn #40 against 'ksnoop_func_stack' libbpf: prog 'ksnoop': found map 0 (ksnoop_func_stack, sec 9, off 0) for insn #40 libbpf: sec '.rel.text': relo #3: insn #85 against 'ksnoop_func_map' libbpf: prog 'ksnoop': found map 1 (ksnoop_func_map, sec 9, off 32) for insn #85 libbpf: sec '.rel.text': relo #4: insn #146 against 'ksnoop_func_map' libbpf: prog 'ksnoop': found map 1 (ksnoop_func_map, sec 9, off 32) for insn #146 libbpf: sec '.rel.text': relo #5: insn #490 against 'ksnoop_func_stack' libbpf: prog 'ksnoop': found map 0 (ksnoop_func_stack, sec 9, off 0) for insn #490 libbpf: sec '.rel.text': relo #6: insn #540 against 'ksnoop_func_map' libbpf: prog 'ksnoop': found map 1 (ksnoop_func_map, sec 9, off 32) for insn #540 libbpf: sec '.rel.text': relo #7: insn #568 against 'ksnoop_perf_map' libbpf: prog 'ksnoop': found map 2 (ksnoop_perf_map, sec 9, off 64) for insn #568 libbpf: sec '.rel.text': relo #8: insn #584 against 'ksnoop_perf_map' libbpf: prog 'ksnoop': found map 2 (ksnoop_perf_map, sec 9, off 64) for insn #584 libbpf: sec '.rel.text': relo #9: insn #635 against 'ksnoop_func_map' libbpf: prog 'ksnoop': found map 1 (ksnoop_func_map, sec 9, off 32) for insn #635 libbpf: sec '.rel.text': relo #10: insn #665 against 'ksnoop_perf_map' libbpf: prog 'ksnoop': found map 2 (ksnoop_perf_map, sec 9, off 64) for insn #665 libbpf: sec '.relkprobe/foo': collecting relocation for section(4) 'kprobe/foo' libbpf: sec '.relkprobe/foo': relo #0: insn #1 against '.text' libbpf: sec '.relkretprobe/foo': collecting relocation for section(6) 'kretprobe/foo' libbpf: sec '.relkretprobe/foo': relo #0: insn #1 against '.text' libbpf: loading kernel BTF '/sys/kernel/btf/vmlinux': 0 libbpf: map 'ksnoop_func_stack': created successfully, fd=4 libbpf: map 'ksnoop_func_map': created successfully, fd=5 libbpf: map 'ksnoop_perf_map': setting size to 8 libbpf: map 'ksnoop_perf_map': created successfully, fd=6 libbpf: sec '.text': found 7 CO-RE relocations libbpf: CO-RE relocating [70] struct pt_regs: found target candidate [111] struct pt_regs in [vmlinux] libbpf: prog 'ksnoop': relo #0: [70] struct pt_regs.nip (0:0:1:1 @ offset 256) libbpf: prog 'ksnoop': relo #0: matching candidate #0 [111] struct pt_regs.nip (0:0:1:1 @ offset 256) libbpf: prog 'ksnoop': relo #0: patched insn #50 (ALU/ALU64) imm 256 -> 256 libbpf: prog 'ksnoop': relo #1: [70] struct pt_regs.gpr[3] (0:0:1:0:3 @ offset 24) libbpf: prog 'ksnoop': relo #1: matching candidate #0 [111] struct pt_regs.gpr[3] (0:0:1:0:3 @ offset 24) libbpf: prog 'ksnoop': relo #1: patched insn #222 (ALU/ALU64) imm 24 -> 24 libbpf: prog 'ksnoop': relo #2: [70] struct pt_regs.gpr[4] (0:0:1:0:4 @ offset 32) libbpf: prog 'ksnoop': relo #2: matching candidate #0 [111] struct pt_regs.gpr[4] (0:0:1:0:4 @ offset 32) libbpf: prog 'ksnoop': relo #2: patched insn #226 (ALU/ALU64) imm 32 -> 32 libbpf: prog 'ksnoop': relo #3: [70] struct pt_regs.gpr[5] (0:0:1:0:5 @ offset 40) libbpf: prog 'ksnoop': relo #3: matching candidate #0 [111] struct pt_regs.gpr[5] (0:0:1:0:5 @ offset 40) libbpf: prog 'ksnoop': relo #3: patched insn #230 (ALU/ALU64) imm 40 -> 40 libbpf: prog 'ksnoop': relo #4: [70] struct pt_regs.gpr[6] (0:0:1:0:6 @ offset 48) libbpf: prog 'ksnoop': relo #4: matching candidate #0 [111] struct pt_regs.gpr[6] (0:0:1:0:6 @ offset 48) libbpf: prog 'ksnoop': relo #4: patched insn #234 (ALU/ALU64) imm 48 -> 48 libbpf: prog 'ksnoop': relo #5: [70] struct pt_regs.gpr[7] (0:0:1:0:7 @ offset 56) libbpf: prog 'ksnoop': relo #5: matching candidate #0 [111] struct pt_regs.gpr[7] (0:0:1:0:7 @ offset 56) libbpf: prog 'ksnoop': relo #5: patched insn #238 (ALU/ALU64) imm 56 -> 56 libbpf: prog 'ksnoop': relo #6: [70] struct pt_regs.gpr[3] (0:0:1:0:3 @ offset 24) libbpf: prog 'ksnoop': relo #6: matching candidate #0 [111] struct pt_regs.gpr[3] (0:0:1:0:3 @ offset 24) libbpf: prog 'ksnoop': relo #6: patched insn #294 (ALU/ALU64) imm 24 -> 24 libbpf: prog 'kprobe_entry': added 713 insns from sub-prog 'ksnoop' libbpf: prog 'kprobe_entry': insn #1 relocated, imm 2 points to subprog 'ksnoop' (now at 4 offset) libbpf: prog 'kprobe_return': added 713 insns from sub-prog 'ksnoop' libbpf: prog 'kprobe_return': insn #1 relocated, imm 2 points to subprog 'ksnoop' (now at 4 offset) Debug: Attached kprobe for 'vma_merge' Debug: Attached kretprobe for 'vma_merge' TIME CPU PID FUNCTION/ARGS

matthew-olson-intel commented 11 months ago

I've reproduced this issue: same symptom, no output, even when attaching to the same function with bpftrace works. Diagnosing.

dubeyabhishek commented 11 months ago

In my investigation KSNOOP_IP_FIX macro is causing the issue. Kprobes are preferably attached at ftrace locations. Since, this location might depend on the stack configurations(CFA/CFI etc) so we might not be lucky enough everytime to have exact address match in ksnoop_func_map.

matthew-olson-intel commented 11 months ago

I agree; KSNOOP_IP_FIX is definitely related to the problem. On my system, vma_merge is at 0xffffffffb754b5e0, and KSNOOP_IP_FIX only subtracts one byte from 0xffffffffb754b5e5; thus, we're 4 bytes off. We should definitely be doing something more robust than that.

I think in general, we've got two options for fixing ksnoop:

  1. Fix KSNOOP_IP_FIX so that it works in more scenarios.
  2. Make ksnoop attach to tracepoints instead of using kprobes. Most functions that one would want to trace have a tracepoint anyway, so this might be a good compromise.
matthew-olson-intel commented 11 months ago

In this case, a quick hack could be to force 16-byte alignment in KSNOOP_IP_FIX (instead of subtracting a number of bytes, which depends on the kernel configuration). This would only be for x86_64, and I'm not sure how many other kernel configurations that hack would work in, though.

matthew-olson-intel commented 11 months ago

Ideally, we'd handle this by doing PT_REGS_IP_CORE(ctx) - MCOUNT_INSN_SIZE.

chenhengqi commented 11 months ago

cc @alan-maguire

alan-maguire commented 11 months ago

seems like we should probably also use bpf_get_func_ip() where available, only falling back to KSNOOP_IP_FIX if that doesn't exist (it didn't when ksnoop originally landed). I also wonder if we can use an CO-RE tricks to infer MCOUNT_INSN_SIZE from a data structure that uses it?

alan-maguire commented 11 months ago

something like

if (bpf_core_enum_value_exists(enum bpf_func_id, BPF_FUNC_get_func_ip)) ip = bpf_get_func_ip(); else ip = KSNOOP_IP_FIX(ctx);

(need to figure out the exact macro for KSNOOP_IP_FIX(), but the above should solve the issue for newer kernels (>5.15 ish) at least..

alan-maguire commented 11 months ago

can you try attached patch and see if it helps? (I think it will help for kernels 5.13 and newer). it uses bpf_get_func_ip() where available.. 0001-ksnoop-use-bpf_get_func_ip-where-helper-is-available.patch.txt

matthew-olson-intel commented 11 months ago

Yeah, @alan-maguire, that patch works. Thanks!

alan-maguire commented 11 months ago

Great, thanks for the quick response! So I can file a PR using that fix for now, and we can think about the best solution for the pre-5.13 case. I suggest we leave this issue open regardless so we can track the pre-5.13 aspect too, since it's not really fixed until it's fixed for that case as well. What do you think?

matthew-olson-intel commented 11 months ago

Yeah, just tested in a 5.4.0-137-generic kernel in Ubuntu 18.04 (I had a VM handy), and the same issue occurs; to fully fix this problem, we definitely need to do something hacky to account for the MCOUNT_INSN_SIZE offset.

alan-maguire commented 11 months ago

okay sad to say i couldn't come up with anything better than the attached patch. the idea is this - we do a one-off serach for MCOUNT_INSN_SIZE by continually subtracting a byte from the PT_REGS_IP_CORE() instruction pointer value and checking if our map lookup succeeds. When/if it does, we cache the mcount_insn_size value, so it's a one-off cost. 0001-ksnoop-dynamically-determine-MCOUNT_INSN_SIZE-adjust.patch.txt

If you get a chance to give it a spin on your 5.4 kernel that would be great. thanks!

chenhengqi commented 11 months ago

Can't we just #define MCOUNT_INSN_SIZE for supported architectures ?

alan-maguire commented 11 months ago

would certainly be easier! however it's somewhat config-dependent on x86 at least; if CONFIG_FUNCTION_TRACER is defined, it's 5, otherwise 0. there may be other cases for other arches too, haven't looked yet.

matthew-olson-intel commented 11 months ago

I tested that new patch on 5.4, and while it does find the proper IP and correctly identifies MCOUNT_INSN_SIZE as 1 in this case, it doesn't seem to be giving me any output-- further debugging is needed. The map lookup is definitely succeeding, get_trace returns a valid pointer, and I get to the return 0 at the end of the ksnoop BPF function.

alan-maguire commented 11 months ago

i haven't been able to reproduce this yet; if you get a chance would you mind attaching the 5.4 kernel config where you see this failure? thanks!

matthew-olson-intel commented 11 months ago

Yep, it's just the standard Ubuntu 5.4.0-137-generic config.

config-5.4.0-137-generic.txt

chenhengqi commented 11 months ago

would certainly be easier! however it's somewhat config-dependent on x86 at least; if CONFIG_FUNCTION_TRACER is defined, it's 5, otherwise 0. there may be other cases for other arches too, haven't looked yet.

Could we employ CO-RE kconfig here ?

alan-maguire commented 11 months ago

we probably could, though it might get a bit messy depending on how things look on other architectures. first order of business is to get a fix that works for the 5.4 case though; the above one doesn't and i'm not sure why yet, so i'll try and reproduce that failure now..

alan-maguire commented 11 months ago

Apologies for the delay, but I got set up with Ubuntu 18.04:

Linux bpfubuntu 5.4.0-1100-oracle #109~18.04.1-Ubuntu SMP Fri Apr 21 12:01:52 UTC 2023 x86_64 x86_64 x86_64 GNU/Linux

When I run "ksnoop vma_merge" I see it working:

ubuntu@bpfubuntu:~/src/bcc/libbpf-tools$ sudo ksnoop vma_merge TIME CPU PID FUNCTION/ARGS 67702166171718 0 2565 vma_merge( mm = (0xffff938366cce200) (struct mm_struct){ (struct){ .mmap = (struct vm_area_struct )0xffff9380437259c0, .mm_rb = (struct rb_root){ .rb_node = (struct rb_node )0xffff938043725020, }, .vmacache_seqnum = (u64)12, .get_unmapped_area = (long unsigned int ()(struct file *, long unsigned int, long unsigned int, long unsigned int, long unsigned int))0xffffffffad639110, .mmap_base = (long unsigned int)140389935984640, .mmap_legacy_base = (long unsigned int)47242859184128, .mmap_compat_base = (long unsigned int)4159741952, .mmap_compat_legacy_base = (long unsigned int)1432653824, .task_size = (long unsigned int)140737488351232, .highest_vm_end = (long unsigned int)140735409319936,

I don't see anything significant in the config diff between mine and yours config.diff.txt either which makes it even more of a puzzle..

matthew-olson-intel commented 11 months ago

I just tried again this morning (in the same VM), and it's working. In previous attempts, I waited 2 or so minutes (left to get a cup of coffee, etc.), so figured it wasn't working because other VMs had given me samples immediately. However, this morning I left it running for a few more minutes and got a couple of samples in the 5.4 kernel.

I'm sorry about that; I should have left it running for a longer period of time to be really sure it wasn't working. Perhaps in recompense, I can try running it in a BTF-enabled 4.18 kernel to be really sure.

But it looks like everything's all fixed now!

alan-maguire commented 11 months ago

no apology necessary! I saw nothing and assumed it was broken for me too. Then triggering some activity led to seeing output. However here's the thing - I didn't have the proposed fix in place, I had built master branch bcc to get a baseline. Without the proposed patch (just using the master branch of bcc) do you see ksnoop output?

matthew-olson-intel commented 11 months ago

Yes -- it seems that KSNOOP_IP_FIX does actually work on the 5.4 kernel; I just hadn't actually triggered a vma_merge. I think I just got lucky when I tried it on other idle VMs!

Seems everything's fixed now, then. Case closed!