llvm / llvm-project

[llvm-objdump][x86] cs prefixes are not printed for -mindirect-branch-cs-prefix #58201

Open nickdesaulniers opened 2 years ago

nickdesaulniers commented 2 years ago

With the Linux kernel built with -mindirect-branch-cs-prefix (https://lore.kernel.org/all/20220817185410.1174782-1-nathan@kernel.org/), I used llvm-objdump -d vmlinux to check that indirect calls to __x86_indirect_thunk_r11 carried the cs prefix. It looked like they did not, which was surprising. Double-checking with GNU binutils' objdump shows that the prefixes are there:

$ llvm-objdump -d vmlinux | grep __x86_indirect_thunk_r11 | head -n 10
ffffffff8100071c: 2e e8 5e 2a 00 01     callq   0xffffffff82003180 <__x86_indirect_thunk_r11>
ffffffff8100075c: 2e e8 1e 2a 00 01     callq   0xffffffff82003180 <__x86_indirect_thunk_r11>
ffffffff810007a2: 2e e8 d8 29 00 01     callq   0xffffffff82003180 <__x86_indirect_thunk_r11>
ffffffff81000e49: 2e e8 31 23 00 01     callq   0xffffffff82003180 <__x86_indirect_thunk_r11>
ffffffff810027da: 2e e8 a0 09 00 01     callq   0xffffffff82003180 <__x86_indirect_thunk_r11>
ffffffff810030dc: 2e e9 9e 00 00 01     jmp     0xffffffff82003180 <__x86_indirect_thunk_r11>
ffffffff810031c8: 2e e8 b2 ff ff 00     callq   0xffffffff82003180 <__x86_indirect_thunk_r11>
ffffffff81003316: 2e e8 64 fe ff 00     callq   0xffffffff82003180 <__x86_indirect_thunk_r11>
ffffffff81003588: 2e e8 f2 fb ff 00     callq   0xffffffff82003180 <__x86_indirect_thunk_r11>
ffffffff810035c4: 2e e8 b6 fb ff 00     callq   0xffffffff82003180 <__x86_indirect_thunk_r11>

$ objdump -d vmlinux | grep __x86_indirect_thunk_r11 | head -n 10 
ffffffff8100071c:   2e e8 5e 2a 00 01       cs call ffffffff82003180 <__x86_indirect_thunk_r11>
ffffffff8100075c:   2e e8 1e 2a 00 01       cs call ffffffff82003180 <__x86_indirect_thunk_r11>
ffffffff810007a2:   2e e8 d8 29 00 01       cs call ffffffff82003180 <__x86_indirect_thunk_r11>
ffffffff81000e49:   2e e8 31 23 00 01       cs call ffffffff82003180 <__x86_indirect_thunk_r11>
ffffffff810027da:   2e e8 a0 09 00 01       cs call ffffffff82003180 <__x86_indirect_thunk_r11>
ffffffff810030dc:   2e e9 9e 00 00 01       cs jmp ffffffff82003180 <__x86_indirect_thunk_r11>
ffffffff810031c8:   2e e8 b2 ff ff 00       cs call ffffffff82003180 <__x86_indirect_thunk_r11>
ffffffff81003316:   2e e8 64 fe ff 00       cs call ffffffff82003180 <__x86_indirect_thunk_r11>
ffffffff81003588:   2e e8 f2 fb ff 00       cs call ffffffff82003180 <__x86_indirect_thunk_r11>
ffffffff810035c4:   2e e8 b6 fb ff 00       cs call ffffffff82003180 <__x86_indirect_thunk_r11>

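Note that the byte range llvm-objdump prints for each instruction is correct (the leading 2e prefix byte is included); only the cs prefix is missing from the printed mnemonic. cc @phoebewang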
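Since the raw bytes are printed correctly by both tools, the check can be done on the byte column rather than on the mnemonic text. The following is a minimal sketch of that idea, not part of either tool: the helper names (`raw_bytes`, `has_cs_prefix`, `unprefixed_branches`) are hypothetical, and it assumes the `address: bytes mnemonic` line layout shown above, with the bytes printed as space-separated two-digit hex.

```python
import re

CS_PREFIX = 0x2E                 # CS segment-override prefix byte
DIRECT_BRANCHES = (0xE8, 0xE9)   # call rel32, jmp rel32

HEX_BYTE = re.compile(r"[0-9a-f]{2}")


def raw_bytes(line):
    """Return the instruction bytes printed on one disassembly line."""
    _addr, sep, rest = line.partition(":")
    if not sep:
        return []
    out = []
    for tok in rest.split():
        if not HEX_BYTE.fullmatch(tok):
            break  # the first non-byte token starts the mnemonic
        out.append(int(tok, 16))
    return out


def has_cs_prefix(line):
    """True if the line is a direct call/jmp preceded by a 2e prefix byte."""
    b = raw_bytes(line)
    return len(b) >= 2 and b[0] == CS_PREFIX and b[1] in DIRECT_BRANCHES


def unprefixed_branches(lines, symbol):
    """Addresses of direct branches to `symbol` that lack the 2e prefix."""
    found = []
    for line in lines:
        b = raw_bytes(line)
        # A branch whose first byte is already the opcode has no prefix.
        if f"<{symbol}>" in line and b and b[0] in DIRECT_BRANCHES:
            found.append(line.partition(":")[0].strip())
    return found
```

Feeding it the output of either `llvm-objdump -d vmlinux` or `objdump -d vmlinux` line by line should give the same answer, because it never looks at the mnemonic column.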

llvmbot commented 2 years ago

@llvm/issue-subscribers-tools-llvm-objdump

phoebewang commented 2 years ago

This is a general problem: the current disassembler framework in LLVM doesn't support printing prefixes the way the GNU tools do. For example:

void main() {
  asm("cs;cs;cs;mov %eax, %eax");
}

Compile it to .o and dump with objdump:

0000000000000000 <main>:
   0:   55                      push   %rbp
   1:   48 89 e5                mov    %rsp,%rbp
   4:   2e 2e 2e 89 c0          cs cs cs mov %eax,%eax
   9:   90                      nop
   a:   5d                      pop    %rbp
   b:   c3                      retq

Dump with llvm-objdump:

0000000000000000 <main>:
       0: 55                            pushq   %rbp
       1: 48 89 e5                      movq    %rsp, %rbp
       4: 2e 2e 2e 89 c0                movl    %eax, %eax
       9: 90                            nop
       a: 5d                            popq    %rbp
       b: c3                            retq
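To make the difference between the two dumps concrete, here is a small illustrative sketch of what GNU objdump is effectively doing for the bytes `2e 2e 2e 89 c0`: it names each leading segment-override prefix byte before printing the instruction. This is not LLVM or binutils code; the function name `split_segment_prefixes` is hypothetical, and it handles only the six legacy segment-override bytes, not other prefixes such as operand-size or REX.

```python
# Legacy x86 segment-override prefix bytes and their names.
SEGMENT_PREFIXES = {
    0x26: "es", 0x2E: "cs", 0x36: "ss",
    0x3E: "ds", 0x64: "fs", 0x65: "gs",
}


def split_segment_prefixes(code):
    """Split leading segment-override bytes from the rest of the instruction."""
    names = []
    i = 0
    while i < len(code) and code[i] in SEGMENT_PREFIXES:
        names.append(SEGMENT_PREFIXES[code[i]])
        i += 1
    return names, code[i:]


names, rest = split_segment_prefixes(bytes.fromhex("2e2e2e89c0"))
print(" ".join(names), rest.hex(" "))  # prints "cs cs cs 89 c0"
```

The remaining bytes `89 c0` are the `mov %eax,%eax` itself, which is the only part llvm-objdump currently prints.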

KanRobert commented 1 year ago

I will try to support it.