anakryiko / retsnoop

Investigate kernel error call stacks
BSD 2-Clause "Simplified" License
186 stars 32 forks source link

How to get file info and line info ? #9

Closed chenhengqi closed 1 year ago

chenhengqi commented 2 years ago

I see in README, all output comes with file info and line info, this is very useful. Like:

                             entry_SYSCALL_64_after_hwframe+0x44       (arch/x86/entry/entry_64.S:112:0)
                             do_syscall_64+0x2d                        (arch/x86/entry/common.c:46:12)
           24us [-EINVAL]    __x64_sys_bpf+0x5                         (kernel/bpf/syscall.c:4351:1)
                             . __se_sys_bpf                            (kernel/bpf/syscall.c:4351:1)
                             __do_sys_bpf+0x5ca                        (kernel/bpf/syscall.c:4438:9)
                             . bpf_btf_load                            (kernel/bpf/syscall.c:3818:9)
           21us [-EINVAL]    btf_new_fd+0x213                          (kernel/bpf/btf.c:5639:8)
                             . btf_parse                               (kernel/bpf/btf.c:4246:8)
                             . btf_parse_type_sec                      (kernel/bpf/btf.c:4009:5)
            0us [-EINVAL]    btf_check_all_metas+0x5                   (kernel/bpf/btf.c:3856:1)

But when I play it locally, the stack traces do NOT contains file info and line info:

$ sudo retsnoop -e vfs_read -s -v
Using vmlinux image at /lib/modules/5.19.0-rc2+/build/vmlinux.
Discovered 53781 available kprobes!
Found 1 attachable functions in total.
Skipped 100236 functions in total.
Function 'vfs_read' is marked as an entry point.
Attached to function #1 'vfs_read'.
Total 1 kernel functions attached successfully!
Successfully attached in 12 ms.
Receiving data...
09:13:32.591 PID 14199 (sap1009):
                    entry_SYSCALL_compat_after_hwframe+0x45  
                    do_fast_syscall_32+0x34                  
                    __do_fast_syscall_32+0x8c                
                    __ia32_sys_read+0x18                     
                    ksys_read+0xb1                           
!    0us [-EAGAIN]  vfs_read                                 

^C
Detaching... DONE in 131 ms.

$ file /lib/modules/5.19.0-rc2+/build/vmlinux                                      
/lib/modules/5.19.0-rc2+/build/vmlinux: ELF 64-bit LSB executable, x86-64, version 1 (SYSV), statically linked, BuildID[sha1]=92bcafd537b953cc3d545520f8d607d4331250ca, with debug_info, not stripped

What am I missing here ?

anakryiko commented 2 years ago

@chenhengqi file path and line info information is coming from DWARF. If you kernel image doesn't have DWARF info recorded, you won't see file:line piece. If you have a version of vmlinux with DWARF embedded built separately, you can try passing it as -k argument

chenhengqi commented 2 years ago

The kernel is custom built by myself without strip.

$ file vmlinux
vmlinux: ELF 64-bit LSB executable, x86-64, version 1 (SYSV), statically linked, BuildID[sha1]=92bcafd537b953cc3d545520f8d607d4331250ca, with debug_info, not stripped

Retry with -k, still not working.

anakryiko commented 2 years ago

@chenhengqi I'm sorry, I have no idea. Maybe you can debug it yourself? I'm a bit swamped with other stuff right now, so unlikely to get to this any time soon (especially that repro is on your side only).

chenhengqi commented 2 years ago

OK, will try to debug it.

anakryiko commented 2 years ago

@chenhengqi any updates? were you able to figure this out?

ismail commented 1 year ago

I see the same issue on Fedora Rawhide kernel 6.1.0-0.rc1.20221019gitaae703b02f92.17.fc38.aarch64, one thing I see is that the vmlinux uses DWARF v5 debuginfo format, would that be an issue?

anakryiko commented 1 year ago

@ismail, @chenhengqi could it be that your kernel enables KASLR?

Check

$ zcat /proc/config.gz | grep RANDOMIZE_BASE

I filed https://github.com/anakryiko/retsnoop/issues/23 to keep track of this.

ismail commented 1 year ago

@ismail, @chenhengqi could it be that your kernel enables KASLR?

At least for me, yes:

❯ grep RANDOMIZE_BASE /boot/config-6.1.0-0.rc1.20221019gitaae703b02f92.17.fc38.aarch64
CONFIG_RANDOMIZE_BASE=y
chenhengqi commented 1 year ago

Sorry, I missed the notification here.

And my dev machine (running on the cloud) was destroyed, I can't verify that.

anakryiko commented 1 year ago

@ismail, @chenhengqi forgot to mention this here, but https://github.com/anakryiko/retsnoop/pull/30 should have solved this problem. Can you guys please confirm?

chenhengqi commented 1 year ago
$ cat /boot/config-6.1.0-rc4+| grep RANDOMIZE_BASE
CONFIG_RANDOMIZE_BASE=y

$ sudo ./retsnoop -e vfs_read -s -v
Using vmlinux image at /lib/modules/6.1.0-rc4+/build/vmlinux.
KASLR offset is 0x3a400000.
Sidecar PID is 51308.
Discovered 51753 available kprobes!
Found 1 attachable functions in total.
Skipped 105211 functions in total.
Function 'vfs_read' is marked as an entry point.
Attached to function #1 'vfs_read'.
Total 1 kernel functions attached successfully!
Successfully attached in 11 ms.
Receiving data...
17:03:18.004763 -> 17:03:18.004768 TID/PID 51305/51305 (sudo/sudo):

                    entry_SYSCALL_64_after_hwframe+0x63  (arch/x86/entry/entry_64.S:120:0)               
                    do_syscall_64+0x5c                   (do_syscall_x64 @ arch/x86/entry/common.c:50:12)
                    __x64_sys_read+0x19                  (fs/read_write.c:621:1)                         
                    ksys_read+0xb5                       (fs/read_write.c:613:9)                         
!    0us [-EAGAIN]  vfs_read                                                                             

^C17:03:18.010266 -> 17:03:22.121734 TID/PID 51308/51308 (4/4):
Failed to get symbolized function name: 0
SIGPIPE caught, exiting!

Going to close this, it works now.