Closed dtcccc closed 2 months ago
I have observed the same issue. It seems that commit https://github.com/iovisor/ubpf/commit/ff4b48a8a6f062955e3f49cd9dba54624173a974 emits incorrect JIT code, leading to program crashes.
@xfoukas @dtcccc can you please share which ABI this is? Linux, Windows, or something else?
@Alan-Jowett, in my case this is the Linux ABI
Thanks. I am not seeing anything obvious from the code. As near as I can tell, r12 is a callee saved register (aka non-volatile) in the Linux x86-64 ABI, so the mapping from BPF_REG7 -> r12 should be fine.
Would it be possible to share the BPF byte code and the generated x64 code to better understand if the JIT is wrong?
Thank you @dtcccc and @xfoukas ! As @Alan-Jowett said, we are working on understanding the issue and fixing it. If you could provide the BPF byte code and generated x64 that would really help!
Thank you again for the report and for being users!
Here are my simple steps:
1. Prepare a file named "bpf.s":
.text
.file "test.bpf.c"
.globl func # -- Begin function func
.p2align 3
.type func,@function
func: # @func
# %bb.0:
r7 = r1
r0 = *(u64 *)(r7 + 0)
exit
.Lfunc_end0:
.size func, .Lfunc_end0-func
# -- End function
.addrsig
2. Then compile it with `llvm-mc -triple bpf -filetype=obj -o bpf.o bpf.s`
3. We can see the BPF byte code with llvm-objdump:
bpf.o: file format elf64-bpf
Disassembly of section .text:
0000000000000000
4. Load it with `./build/bin/ubpf_test -j -m any_input_file bpf.o`, which produces a core dump.
5. Find the JIT code in the core dump:
0x7fb4a9e61000: push %rbp
0x7fb4a9e61001: push %rbx
0x7fb4a9e61002: push %r12
0x7fb4a9e61004: push %r13
0x7fb4a9e61006: push %r14
0x7fb4a9e61008: push %r15
0x7fb4a9e6100a: mov %rdi,%r11
0x7fb4a9e6100d: sub $0x8,%rsp
0x7fb4a9e61014: mov %rsp,%rbp
0x7fb4a9e61017: mov %rsp,%r15
0x7fb4a9e6101a: sub $0x200,%rsp
0x7fb4a9e61021: callq 0x7fb4a9e6102b
0x7fb4a9e61026: jmpq 0x7fb4a9e61048
0x7fb4a9e6102b: sub $0x8,%rsp
0x7fb4a9e61032: movq $0x20,(%rsp)
0x7fb4a9e6103a: mov %rdi,%r12
=> 0x7fb4a9e6103d: mov (%r8,%rcx,2),%rax
0x7fb4a9e61041: add $0x8,%esp
0x7fb4a9e61047: retq
0x7fb4a9e61048: mov %rbp,%rsp
0x7fb4a9e6104b: add $0x8,%rsp
0x7fb4a9e61052: pop %r15
0x7fb4a9e61054: pop %r14
0x7fb4a9e61056: pop %r13
0x7fb4a9e61058: pop %r12
0x7fb4a9e6105a: pop %rbx
0x7fb4a9e6105b: pop %rbp
0x7fb4a9e6105c: retq
0x7fb4a9e6105d: callq 0x7fb4a9e61069
0x7fb4a9e61062: pause
0x7fb4a9e61064: jmpq 0x7fb4a9e61000
0x7fb4a9e61069: mov %rax,(%rsp)
0x7fb4a9e6106d: retq
That is really helpful! Thank you! I think that what you are showing is what @Alan-Jowett and I suspected was the problem. We discussed (offline) how to fix it, and I am working as quickly and diligently as possible on a correction.
@dtcccc I tried to tag you in a comment on e307c30bea2adb39399d54a3ebb8ed6fe67e8d9b but I think that I screwed up. So, I am tagging you here, too. Sorry if you got multiple notices.
@hawkinsw, @Alan-Jowett, below is sample code that leads to a segfault in my case.
This is the bpf bytecode of my program, as obtained by llvm-objdump:
Disassembly of section janus_generic:
0000000000000000 <janus_main>:
0: b7 01 00 00 00 00 00 00 r1 = 0
1: 63 1a fc ff 00 00 00 00 *(u32 *)(r10 - 4) = r1
2: bf a2 00 00 00 00 00 00 r2 = r10
3: 07 02 00 00 fc ff ff ff r2 += -4
4: 18 01 00 00 00 00 00 00 00 00 00 00 00 00 00 00 r1 = 0 ll
6: 85 00 00 00 01 00 00 00 call 1
7: bf 07 00 00 00 00 00 00 r7 = r0
8: b7 06 00 00 01 00 00 00 r6 = 1
9: 15 07 0e 00 00 00 00 00 if r7 == 0 goto +14 <LBB0_3>
10: 18 01 00 00 00 00 00 00 00 00 00 00 00 00 00 00 r1 = 0 ll
12: 85 00 00 00 11 00 00 00 call 17
13: 15 00 0a 00 00 00 00 00 if r0 == 0 goto +10 <LBB0_3>
14: 61 71 00 00 00 00 00 00 r1 = *(u32 *)(r7 + 0)
15: 63 10 00 00 00 00 00 00 *(u32 *)(r0 + 0) = r1
16: 18 01 00 00 00 00 00 00 00 00 00 00 00 00 00 00 r1 = 0 ll
18: 85 00 00 00 12 00 00 00 call 18
19: bf 06 00 00 00 00 00 00 r6 = r0
20: 18 01 00 00 00 00 00 80 00 00 00 00 00 00 00 00 r1 = 2147483648 ll
22: 5f 16 00 00 00 00 00 00 r6 &= r1
23: 77 06 00 00 1f 00 00 00 r6 >>= 31
00000000000000c0 <LBB0_3>:
24: bf 60 00 00 00 00 00 00 r0 = r6
25: 95 00 00 00 00 00 00 00 exit
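For readers following the byte dump above, here is a minimal Python sketch of how each 8-byte slot splits into fields. It assumes the standard eBPF instruction layout: a 1-byte opcode, a register byte with the destination in the low nibble and the source in the high nibble, a signed 16-bit offset, and a signed 32-bit immediate, all little-endian. The function name `decode_bpf_insn` is made up for illustration and is not part of ubpf.

```python
import struct


def decode_bpf_insn(raw: bytes) -> dict:
    """Split one 8-byte eBPF instruction slot into its fields.

    Hypothetical helper for illustration only (not a ubpf API).
    """
    if len(raw) != 8:
        raise ValueError("an eBPF instruction slot is exactly 8 bytes")
    # "<BBhi": little-endian u8 opcode, u8 registers, s16 offset, s32 immediate
    opcode, regs, offset, imm = struct.unpack("<BBhi", raw)
    return {
        "opcode": opcode,
        "dst": regs & 0x0F,  # low nibble: destination register
        "src": regs >> 4,    # high nibble: source register
        "offset": offset,
        "imm": imm,
    }


# Instruction 14 from the listing: r1 = *(u32 *)(r7 + 0)
load = decode_bpf_insn(bytes.fromhex("6171000000000000"))
print(load)  # {'opcode': 97, 'dst': 1, 'src': 7, 'offset': 0, 'imm': 0}

# Instruction 1 from the listing: *(u32 *)(r10 - 4) = r1
store = decode_bpf_insn(bytes.fromhex("631afcff00000000"))
print(store)  # {'opcode': 99, 'dst': 10, 'src': 1, 'offset': -4, 'imm': 0}
```

Instruction 14 is the one that reads through r7 with a zero offset, the same shape as the two-instruction reproducer earlier in the thread. The 16-byte `ll` lines in the listing occupy two consecutive slots, so they would need two calls.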
In my case, I am unable to use `ubpf_test` to get a core dump with the x86 JIT code, because my program is in a different (non-Linux kernel) domain and uses its own set of helper functions that `ubpf_test` does not support.
Instead, I got a dump of the buffer generated by `ubpf_compile()` in the attached file.
Note that the above example works well up until commit https://github.com/iovisor/ubpf/commit/3fb3da0ffef98be661bebae05043af0f14cfbed4, but leads to a segfault in commit https://github.com/iovisor/ubpf/commit/ff4b48a8a6f062955e3f49cd9dba54624173a974.
@hawkinsw I just tested https://github.com/iovisor/ubpf/commit/e307c30bea2adb39399d54a3ebb8ed6fe67e8d9b and it fixes the issue. Thanks!
We found that the BPF code was translated incorrectly:
79 74 00 00 00 00 00 00 r4 = *(u64 *)(r7 + 0)
was translated to
mov (%r8,%rcx,2),%r10
However, R8 and RCX are both 0, so the program crashed.
The issue involves BPF_REG_7, which is mapped to the R12 register on Linux.
Commits 86a5001d4ad7111bd565d8fc4de9c231b8a9e835 and 3a6e60c962f9e1fec96dbf097514db65d3e0a22c indicate that R12 should not be used, but commit ff4b48a8a6f062955e3f49cd9dba54624173a974 brings it back.
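One way to read the garbled operand: on x86-64, a ModRM `rm` field of `100` does not name a register but signals that a SIB byte follows, and both RSP and R12 have those low three bits. If a load through R12 is emitted like a load through any other base register, without the SIB byte, the first byte of the next instruction gets consumed as the SIB. The sketch below is a minimal Python decoder for that single instruction form. The byte values are an assumption inferred from the instruction addresses in the JIT dump earlier in this thread (a 4-byte load followed by a 6-byte `add`); they were not posted here, and the thread does not confirm that this is exactly what e307c30 changed.

```python
REGS64 = ["rax", "rcx", "rdx", "rbx", "rsp", "rbp", "rsi", "rdi",
          "r8", "r9", "r10", "r11", "r12", "r13", "r14", "r15"]


def decode_load(code: bytes) -> str:
    """Decode `REX.W 8B /r` with mod=00 (no displacement) into AT&T syntax."""
    rex, opcode, modrm = code[0], code[1], code[2]
    if (rex & 0xF8) != 0x48 or opcode != 0x8B:
        raise ValueError("only REX.W + 8B loads are handled")
    rex_r, rex_x, rex_b = (rex >> 2) & 1, (rex >> 1) & 1, rex & 1
    mod, reg, rm = modrm >> 6, (modrm >> 3) & 7, modrm & 7
    if mod != 0 or rm == 5:
        raise NotImplementedError("displacement forms are out of scope")
    dst = REGS64[reg | (rex_r << 3)]
    if rm != 4:
        return f"mov (%{REGS64[rm | (rex_b << 3)]}),%{dst}"
    # rm == 0b100 is an escape: the CPU reads the NEXT byte as a SIB byte.
    sib = code[3]
    if (sib & 7) == 5:
        raise NotImplementedError("base-less SIB forms are out of scope")
    scale = 1 << (sib >> 6)
    index = ((sib >> 3) & 7) | (rex_x << 3)
    base = REGS64[(sib & 7) | (rex_b << 3)]
    if index == 4:  # index 0b100 without REX.X means "no index register"
        return f"mov (%{base}),%{dst}"
    return f"mov (%{base},%{REGS64[index]},{scale}),%{dst}"


# Assumed bytes: load through r12 with no SIB, followed by the 0x48 REX
# prefix that belongs to the next instruction (add $0x8,%rsp).
print(decode_load(bytes.fromhex("498b04") + bytes.fromhex("48")))
# mov (%r8,%rcx,2),%rax

# Same load with the SIB byte 0x24 (no index, base = r12) present.
print(decode_load(bytes.fromhex("498b0424")))
# mov (%r12),%rax
```

Under these assumptions the stolen `0x48` byte would also explain why the following instruction in the dump shows up as `add $0x8,%esp` rather than `add $0x8,%rsp`: it has lost its REX.W prefix.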