Closed deepakpjose closed 3 months ago
In general blaze_err_str(blaze_err_last())
may help for understanding symbolization failures or you may want to look at the reason if individual addresses failed to symbolize. In this case, I am having trouble understanding how the code you referenced can match what you pasted output for, as there is no <no-symbol>
or some kind of @
construct after the user space addresses, which strikes me as impossible (though it's possible I missed something).
I've modified it. It's there, but somehow missed out.
Hi Daniel,
I've added blaze_err_str(blaze_err_last()) and the error was 'entity not found'. Looks like userspace address passed is not correct. I've to add any other settings to get proper address? Searches in different portals suggested to set 'sudo setcap -r /usr/local/bin/your_bpf_program'. But that didn't help.
COMM: bash (pid=3290) @ CPU 5
Kernel:
ffffffffa8406e45: __x64_sys_execve @ 0xffffffffa8406e40+0x5
ffffffffa91f2fc0: do_syscall_64 @ 0xffffffffa91f2f60+0x60
ffffffffa940012a: entry_SYSCALL_64_after_hwframe @ 0xffffffffa94000bc+0x6e
Userspace:
00007ff5f6e7ef3b: <no-symbol> blaze_error: entity not found
000055f4fa7cadc2: <no-symbol> blaze_error: entity not found
000055f4fa7cc1fa: <no-symbol> blaze_error: entity not found
000055f4fa7bee08: <no-symbol> blaze_error: entity not found
000055f4fa7c1c6e: <no-symbol> blaze_error: entity not found
000055f4fa7b410c: <no-symbol> blaze_error: entity not found
000055f4fa7a7efb: <no-symbol> blaze_error: entity not found
00007ff5f6dba1ca: <no-symbol> blaze_error: entity not found
00007ff5f6dba28b: <no-symbol> blaze_error: entity not found
000055f4fa7a8385: <no-symbol> blaze_error: entity not found
Thanks, Deepak
You can take the PID and addresses and feed them to blazecli
. It should be able to tell you more where the error originates and you can enable traces via -vvv
.
Binaries are available here for example.
Daniel, Thanks for the input in debugging this. I felt blazecli is unable to resolve symbols for addresses properly. ebpf is generating addressses properly because when I look the detailed memory map for the user process, it is giving an address in the range for text region.
Follwoing is the stack trace for user and kernel I got for one example.
root@975ef1935d58:/src/libbpf-bootstrap/build# ./profile
COMM: hello_world (pid=9865) @ CPU 2
Kernel:
ffffffffa8171a8e: queue_work_on @ 0xffffffffa8171a70+0x1e
ffffffffa8b97eaf: tty_insert_flip_string_and_push_buffer @ 0xffffffffa8b97e20+0x8f
ffffffffa8b9acbe: pty_write @ 0xffffffffa8b9ac90+0x2e
ffffffffa8b92372: n_tty_write @ 0xffffffffa8b92210+0x162
ffffffffa8b8daf5: file_tty_write.isra.0 @ 0xffffffffa8b8d970+0x185
ffffffffa83fc8fd: vfs_write @ 0xffffffffa83fc6d0+0x22d
ffffffffa83fceaf: ksys_write @ 0xffffffffa83fce40+0x6f
ffffffffa91f2fc0: do_syscall_64 @ 0xffffffffa91f2f60+0x60
ffffffffa940012a: entry_SYSCALL_64_after_hwframe @ 0xffffffffa94000bc+0x6e
Userspace:
00007ffff7ec2574: <no-symbol>
00007ffff7e38561: <no-symbol>
00007ffff7e39a9f: <no-symbol>
00007ffff7e06cb9: <no-symbol>
00007ffff7e11733: <no-symbol>
00007ffff7e061a3: <no-symbol>
000055555555516a: <no-symbol>
00005555555551b4: <no-symbol>
00007ffff7dd01ca: <no-symbol>
00007ffff7dd028b: <no-symbol>
0000555555555075: <no-symbol>
Output of blazecli is as follows with traces using -vvv. For all addresses belongs to hello_world program, blazecli always dumps __TMC_END__ as the symbol.
root@975ef1935d58:/src/libbpf-bootstrap/blazesym# ./target/debug/blazecli symbolize elf --path /src/libbpf-bootstrap/C/hello_world 00005555555551b4
-vvv
2024-07-06T09:25:43.140286Z INFO symbolize: new src=Elf("/src/libbpf-bootstrap/C/hello_world") addrs=VirtOffset([0x5555555551b4])
2024-07-06T09:25:43.140451Z INFO symbolize: enter src=Elf("/src/libbpf-bootstrap/C/hello_world") addrs=VirtOffset([0x5555555551b4])
2024-07-06T09:25:43.141082Z INFO symbolize:symbolize_with_resolver: new src=Elf("/src/libbpf-bootstrap/C/hello_world") addrs=VirtOffset([0x5555555
551b4]) addr=0x5555555551b4 resolver=Cached(DWARF /src/libbpf-bootstrap/C/hello_world)
2024-07-06T09:25:43.141315Z INFO symbolize:symbolize_with_resolver: enter src=Elf("/src/libbpf-bootstrap/C/hello_world") addrs=VirtOffset([0x55555
55551b4]) addr=0x5555555551b4 resolver=Cached(DWARF /src/libbpf-bootstrap/C/hello_world)
2024-07-06T09:25:43.141474Z INFO symbolize:symbolize_with_resolver:find_sym: new src=Elf("/src/libbpf-bootstrap/C/hello_world") addrs=VirtOffset([
0x5555555551b4]) addr=0x5555555551b4 resolver=Cached(DWARF /src/libbpf-bootstrap/C/hello_world) self=DWARF /src/libbpf-bootstrap/C/hello_world opts
=CodeInfoAndInlined addr=0x5555555551b4
2024-07-06T09:25:43.141593Z INFO symbolize:symbolize_with_resolver:find_sym: enter src=Elf("/src/libbpf-bootstrap/C/hello_world") addrs=VirtOffset
([0x5555555551b4]) addr=0x5555555551b4 resolver=Cached(DWARF /src/libbpf-bootstrap/C/hello_world) self=DWARF /src/libbpf-bootstrap/C/hello_world op
ts=CodeInfoAndInlined addr=0x5555555551b4
2024-07-06T09:25:43.141833Z INFO symbolize:symbolize_with_resolver:find_sym: exit src=Elf("/src/libbpf-bootstrap/C/hello_world") addrs=VirtOffset(
[0x5555555551b4]) addr=0x5555555551b4 resolver=Cached(DWARF /src/libbpf-bootstrap/C/hello_world) self=DWARF /src/libbpf-bootstrap/C/hello_world opt
s=CodeInfoAndInlined addr=0x5555555551b4
2024-07-06T09:25:43.141938Z INFO symbolize:symbolize_with_resolver:find_sym: close time.busy=240µs time.idle=224µs src=Elf("/src/libbpf-bootstrap/
C/hello_world") addrs=VirtOffset([0x5555555551b4]) addr=0x5555555551b4 resolver=Cached(DWARF /src/libbpf-bootstrap/C/hello_world) self=DWARF /src/l
ibbpf-bootstrap/C/hello_world opts=CodeInfoAndInlined addr=0x5555555551b4
2024-07-06T09:25:43.142076Z INFO symbolize:symbolize_with_resolver: exit src=Elf("/src/libbpf-bootstrap/C/hello_world") addrs=VirtOffset([0x555555
5551b4]) addr=0x5555555551b4 resolver=Cached(DWARF /src/libbpf-bootstrap/C/hello_world)
2024-07-06T09:25:43.142166Z INFO symbolize:symbolize_with_resolver: close time.busy=763µs time.idle=323µs src=Elf("/src/libbpf-bootstrap/C/hello_w
orld") addrs=VirtOffset([0x5555555551b4]) addr=0x5555555551b4 resolver=Cached(DWARF /src/libbpf-bootstrap/C/hello_world)
2024-07-06T09:25:43.142285Z INFO symbolize: exit src=Elf("/src/libbpf-bootstrap/C/hello_world") addrs=VirtOffset([0x5555555551b4])
2024-07-06T09:25:43.142423Z INFO symbolize: close time.busy=1.83ms time.idle=307µs src=Elf("/src/libbpf-bootstrap/C/hello_world") addrs=VirtOffset
([0x5555555551b4])
0x005555555551b4: __TMC_END__ @ 0x4018+0x55555555119c
Symbols output for functions in readelf programs are as follows.
root@975ef1935d58:/src/libbpf-bootstrap/blazesym# readelf -s /src/libbpf-bootstrap/C/hello_world | grep FUNC
1: 0000000000000000 0 FUNC GLOBAL DEFAULT UND _[...]@GLIBC_2.34 (2)
3: 0000000000000000 0 FUNC GLOBAL DEFAULT UND [...]@GLIBC_2.2.5 (3)
6: 0000000000000000 0 FUNC WEAK DEFAULT UND [...]@GLIBC_2.2.5 (3)
4: 0000000000001080 0 FUNC LOCAL DEFAULT 15 deregister_tm_clones
5: 00000000000010b0 0 FUNC LOCAL DEFAULT 15 register_tm_clones
6: 00000000000010f0 0 FUNC LOCAL DEFAULT 15 __do_global_dtors_aux
9: 0000000000001130 0 FUNC LOCAL DEFAULT 15 frame_dummy
18: 0000000000000000 0 FUNC GLOBAL DEFAULT UND __libc_start_mai[...]
22: 00000000000011bc 0 FUNC GLOBAL HIDDEN 16 _fini
23: 0000000000000000 0 FUNC GLOBAL DEFAULT UND printf@GLIBC_2.2.5
29: 0000000000001050 38 FUNC GLOBAL DEFAULT 15 _start
30: 0000000000001140 84 FUNC GLOBAL DEFAULT 15 print_hello
32: 00000000000011a0 28 FUNC GLOBAL DEFAULT 15 main
35: 0000000000000000 0 FUNC WEAK DEFAULT UND __cxa_finalize@G[...]
36: 0000000000001000 0 FUNC GLOBAL HIDDEN 12 _init
Detailed map of memory regions are as follows for my process that is profiled using ebpf.
root@975ef1935d58:/src/libbpf-bootstrap/blazesym# pidof hello_world
4139
root@975ef1935d58:/src/libbpf-bootstrap/blazesym# cat /proc/4139/maps
555555554000-555555555000 r--p 00000000 fe:01 2284602 /src/libbpf-bootstrap/C/hello_world
555555555000-555555556000 r-xp 00001000 fe:01 2284602 /src/libbpf-bootstrap/C/hello_world
555555556000-555555557000 r--p 00002000 fe:01 2284602 /src/libbpf-bootstrap/C/hello_world
555555557000-555555558000 r--p 00002000 fe:01 2284602 /src/libbpf-bootstrap/C/hello_world
555555558000-555555559000 rw-p 00003000 fe:01 2284602 /src/libbpf-bootstrap/C/hello_world
555555559000-55555557a000 rw-p 00000000 00:00 0 [heap]
7ffff7da3000-7ffff7da6000 rw-p 00000000 00:00 0
7ffff7da6000-7ffff7dce000 r--p 00000000 fe:01 2011576 /usr/lib/x86_64-linux-gnu/libc.so.6
7ffff7dce000-7ffff7f56000 r-xp 00028000 fe:01 2011576 /usr/lib/x86_64-linux-gnu/libc.so.6
7ffff7f56000-7ffff7fa5000 r--p 001b0000 fe:01 2011576 /usr/lib/x86_64-linux-gnu/libc.so.6
7ffff7fa5000-7ffff7fa9000 r--p 001fe000 fe:01 2011576 /usr/lib/x86_64-linux-gnu/libc.so.6
7ffff7fa9000-7ffff7fab000 rw-p 00202000 fe:01 2011576 /usr/lib/x86_64-linux-gnu/libc.so.6
7ffff7fab000-7ffff7fb8000 rw-p 00000000 00:00 0
7ffff7fbd000-7ffff7fbf000 rw-p 00000000 00:00 0
7ffff7fbf000-7ffff7fc3000 r--p 00000000 00:00 0 [vvar]
7ffff7fc3000-7ffff7fc5000 r-xp 00000000 00:00 0 [vdso]
7ffff7fc5000-7ffff7fc6000 r--p 00000000 fe:01 2011556 /usr/lib/x86_64-linux-gnu/ld-linux-x86-64.so.2
7ffff7fc6000-7ffff7ff1000 r-xp 00001000 fe:01 2011556 /usr/lib/x86_64-linux-gnu/ld-linux-x86-64.so.2
7ffff7ff1000-7ffff7ffb000 r--p 0002c000 fe:01 2011556 /usr/lib/x86_64-linux-gnu/ld-linux-x86-64.so.2
7ffff7ffb000-7ffff7ffd000 r--p 00036000 fe:01 2011556 /usr/lib/x86_64-linux-gnu/ld-linux-x86-64.so.2
7ffff7ffd000-7ffff7fff000 rw-p 00038000 fe:01 2011556 /usr/lib/x86_64-linux-gnu/ld-linux-x86-64.so.2
7ffffffde000-7ffffffff000 rw-p 00000000 00:00 0 [stack]
ffffffffff600000-ffffffffff601000 r-xp 00000000 00:00 0 [vsyscall]
Thanks, Deepak
You will have to use process symbolization:
$ ./target/debug/blazecli symbolize process --pid 4139 00005555555551b4 -vvv
(or something like that)
Can you please attach the hello_world
binary?
Here is my binary file for hello_world. https://github.com/deepakpjose/linux_commands/blob/main/hello_world.bin
process symbolization didn't work for me.
root@90a68567a85f:/src/libbpf-bootstrap/blazesym# ./target/debug/blazecli symbolize process --pid 51 00005555555551b4 -vvv
2024-07-08T17:11:00.881622Z INFO symbolize: new src=Process(51) addrs=AbsAddr([0x5555555551b4])
2024-07-08T17:11:00.881669Z INFO symbolize: enter src=Process(51) addrs=AbsAddr([0x5555555551b4])
2024-07-08T17:11:00.882945Z INFO symbolize:handle_unknown_addr: new src=Process(51) addrs=AbsAddr([0x5555555551b4]) addr=0x5555555551b4
2024-07-08T17:11:00.883088Z INFO symbolize:handle_unknown_addr: enter src=Process(51) addrs=AbsAddr([0x5555555551b4]) addr=0x5555555551b4
2024-07-08T17:11:00.883174Z INFO symbolize:handle_unknown_addr: exit src=Process(51) addrs=AbsAddr([0x5555555551b4]) addr=0x5555555551b4
2024-07-08T17:11:00.883210Z INFO symbolize:handle_unknown_addr: close time.busy=86.4µs time.idle=182µs src=Process(51) addrs=AbsAddr([0x5555555551b4]) addr=0
x5555555551b4
2024-07-08T17:11:00.883299Z INFO symbolize: exit src=Process(51) addrs=AbsAddr([0x5555555551b4])
2024-07-08T17:11:00.883348Z INFO symbolize: close time.busy=1.63ms time.idle=111µs src=Process(51) addrs=AbsAddr([0x5555555551b4])
0x005555555551b4: <no-symbol>
If I give absolute address, blazecli is dumping address properly. For example, in the read_elf from previous comment, hello_world address is 0000000000001140. So following cli works.
root@90a68567a85f:/src/libbpf-bootstrap/blazesym# ./target/debug/blazecli symbolize elf --path /src/libbpf-bootstrap/C/hello_world 0000000000001140
0x00000000001140: print_hello @ 0x1140+0x0 hello_world.c:4
My hello_world program looks like as follows.
1 #include <stdio.h>
2
3 void print_hello(void)
4 {
5 for (int i=0; i<100; i++) {
6 printf("Hello, World! = %d\n", i);
7 if (i == 99) {
8 i = 0;
9 }
10 }
11 }
12 int main() {
13 print_hello();
14 return 0;
15 }
Thanks, Deepak
process symbolization didn't work for me.
root@90a68567a85f:/src/libbpf-bootstrap/blazesym# ./target/debug/blazecli symbolize process --pid 51 00005555555551b4 -vvv 2024-07-08T17:11:00.881622Z INFO symbolize: new src=Process(51) addrs=AbsAddr([0x5555555551b4]) 2024-07-08T17:11:00.881669Z INFO symbolize: enter src=Process(51) addrs=AbsAddr([0x5555555551b4]) 2024-07-08T17:11:00.882945Z INFO symbolize:handle_unknown_addr: new src=Process(51) addrs=AbsAddr([0x5555555551b4]) addr=0x5555555551b4 2024-07-08T17:11:00.883088Z INFO symbolize:handle_unknown_addr: enter src=Process(51) addrs=AbsAddr([0x5555555551b4]) addr=0x5555555551b4 2024-07-08T17:11:00.883174Z INFO symbolize:handle_unknown_addr: exit src=Process(51) addrs=AbsAddr([0x5555555551b4]) addr=0x5555555551b4 2024-07-08T17:11:00.883210Z INFO symbolize:handle_unknown_addr: close time.busy=86.4µs time.idle=182µs src=Process(51) addrs=AbsAddr([0x5555555551b4]) addr=0 x5555555551b4 2024-07-08T17:11:00.883299Z INFO symbolize: exit src=Process(51) addrs=AbsAddr([0x5555555551b4]) 2024-07-08T17:11:00.883348Z INFO symbolize: close time.busy=1.63ms time.idle=111µs src=Process(51) addrs=AbsAddr([0x5555555551b4]) 0x005555555551b4: <no-symbol>
Okay, but did you check whether the address actually exists for this process? You have address space randomization enabled after all.
If I symbolize your binary I seem to get the expected result:
$ cargo run -p blazecli -- symbolize process --pid 87803 0x55a4864a01b4
> 0x0055a4864a01b4: main @ 0x11a0+0x14 hello_world.c:14:5
Using the cli you have shared, it's working for me. Thanks for that.
By any chance, bpf_get_current_pid_tgid can give wrong process id? I'm using the profile.bpf.c from libpf-bootstrap in my docker, but it's giving wrong process id. If we see below, process 'hello_world' pid is 1606. But user process profile.c is getting 2581. Probably blazesym failed when called blaze_symbolize_process_abs_addrs is because pid passed from profile.bpf.c is wrong.
root@d07ed6c244d8:/src/libbpf-bootstrap/build# pidof hello_world
1606
root@d07ed6c244d8:/src/libbpf-bootstrap/build# ./profile
COMM: hello_world (pid=2581) @ CPU 1
Kernel:
ffffffffa9a1994b: copyin @ 0xffffffffa9a19930+0x1b
ffffffffa9a1b39b: _copy_from_iter @ 0xffffffffa9a1b340+0x5b
ffffffffa9b8dace: file_tty_write.isra.0 @ 0xffffffffa9b8d970+0x15e
ffffffffa93fc8fd: vfs_write @ 0xffffffffa93fc6d0+0x22d
ffffffffa93fceaf: ksys_write @ 0xffffffffa93fce40+0x6f
ffffffffaa1f2fc0: do_syscall_64 @ 0xffffffffaa1f2f60+0x60
ffffffffaa40012a: entry_SYSCALL_64_after_hwframe @ 0xffffffffaa4000bc+0x6e
Userspace:
00007ffff7ec2574: <no-symbol>
00007ffff7e38561: <no-symbol>
00007ffff7e39a9f: <no-symbol>
00007ffff7e06cb9: <no-symbol>
00007ffff7e11733: <no-symbol>
00007ffff7e061a3: <no-symbol>
000055555555516a: <no-symbol>
00005555555551b4: <no-symbol>
00007ffff7dd01ca: <no-symbol>
00007ffff7dd028b: <no-symbol>
0000555555555075: <no-symbol>
Snippet from file profile.bpf.c
SEC("ksyscall/execve")
int profile(void *ctx)
{
int pid = bpf_get_current_pid_tgid() >> 32;
int cpu_id = bpf_get_smp_processor_id();
struct stacktrace_event *event;
int cp;
event = bpf_ringbuf_reserve(&events, sizeof(*event), 0);
if (!event)
return 1;
event->pid = pid;
event->cpu_id = cpu_id;
Also, can you teach me, how you decoded binary and found about ASLR? Like to learn tricks of parsing the binary.
Thanks, Deepak
By any chance, bpf_get_current_pid_tgid can give wrong process id?
I doubt it will be wrong. It's just returning some kernel task struct member.
I'm using the profile.bpf.c from libpf-bootstrap in my docker, but it's giving wrong process id. If we see below, process 'hello_world' pid is 1606. But user process profile.c is getting 2581.
What I could imagine happening is that Docker is setting up a PID namespace. As far as I understand the PID reported by BPF will be the kernel's "global" one.
Are you invoking profile
from within the container? I think that could explain the issue. Then, the program by design would not have access to the global PID namespace but the global PID is still being reported by BPF. Yet, this global PID has no meaning within the namespaced container.
profile
, being a system-wide profiler, shouldn't be invoked in such a namespace. I suppose we may want to guard against that to prevent confusion.
Also, can you teach me, how you decoded binary and found about ASLR? Like to learn tricks of parsing the binary.
I just noticed that addresses were different between runs by looking at the corresponding maps
file contents. What I did then was take the 0x5555555551b4
, which maps to the
555555555000-555555556000 r-xp 00001000 fe:01 2284602 /src/libbpf-bootstrap/C/hello_world
entry you pasted. So then I subtracted 0x555555555000
from it. That's 0x1b4
. Then, on my running instance of hello_world
I searched for the equivalent maps
entry for the same segment, which started at 0x0055a4864a0000
. So I added the 0x1b4
offset to it and used that as input for blazecli
. It's not strictly speaking the correct algorithm, but it works most of the time.
I am running profile ebpf process from docker. Are you saying, ebpf is reporting process id from macos(host os) Point of view?
In place of perf_on_event, if trace some other system call, pid is going to be same?
I am running profile ebpf process from docker. Are you saying, ebpf is reporting process id from macos(host os) Point of view?
Yep.
In place of perf_on_event, if trace some other system call, pid is going to be same?
I don't know.
Thanks a lot for support. It was a wonderful learning. I'll close this request.
Hi Team,
I've modified profile.c in libbpf-bootstrap to listen to execve and print backtrace. While running it, I can see that kernel stack trace is getting dumped. But blazesym is unable to dump userstack trace. Can some one help me the next steps to know, why symbols are not dumped for user space addresses? Is it some issue with blazesym?
Code for profile.c and profile.bpf.c I compiled can be found in following location. https://github.com/deepakpjose/linux_commands/blob/main/profile.c
Following is the output I'm getting when I dump 'ls' from my docker container.
Environment(inside docker): root@2f51403077e9:/src/libbpf-bootstrap/examples/c# uname -a Linux 2f51403077e9 6.6.22-linuxkit #1 SMP PREEMPT_DYNAMIC Fri Mar 29 12:23:08 UTC 2024 x86_64 x86_64 x86_64 GNU/Linux
Thanks, Deepak