llvm / llvm-project

The LLVM Project is a collection of modular and reusable compiler and toolchain technologies.
http://llvm.org

Why are the llvm API and llvm-objdump disassembly results different for the same version? #102280

Open potatopublic opened 3 months ago

potatopublic commented 3 months ago

My environment.

CPU: Apple M1
OS: macOS Ventura 13.0
LLVM version: 18.1.8
LLDB version: 18.1.8

I disassembled the same 4 bytes of code with both lldb and llvm-objdump. Even though both tools are the same version, they produced different results. Can anyone explain this?

Below is the source code I targeted and the disassembly results.

target 4 bytes: 0x088dc6c2 // stlrb w2, [x22]
byte.c

int main(){

   asm volatile(
   ".byte 0xc2, 0xc6, 0x8d, 0x08\n"
   );

}
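
The `.byte` values above are listed in memory order, while the target word is written as a single 32-bit value. A64 instructions are stored little-endian, so the two forms describe the same encoding. A minimal sketch of that conversion follows; the helper name `le32_from_bytes` is hypothetical and is not part of the original report.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Assemble four bytes, given in memory order, into the 32-bit
   little-endian word that a disassembler prints for an A64 instruction.
   (Hypothetical helper, for illustration only.) */
uint32_t le32_from_bytes(const uint8_t bytes[4]) {
    assert(bytes != NULL);
    return (uint32_t)bytes[0]
         | ((uint32_t)bytes[1] << 8)
         | ((uint32_t)bytes[2] << 16)
         | ((uint32_t)bytes[3] << 24);
}
```

Called with `{0xc2, 0xc6, 0x8d, 0x08}`, it returns `0x088dc6c2`, the word shown in the llvm-objdump listing.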
./llvm-objdump --version
  Homebrew LLVM version 18.1.8
  Optimized build.

0000000100003fac <_main>:
100003fac: 088dc6c2     stlrb   w2, [x22]     <<<<-------  disassembly succeeded
100003fb0: 52800000     mov     w0, #0x0                ; =0
100003fb4: d65f03c0     ret
./lldb --version
 lldb version 18.1.8

(lldb) disas --pc
byte4`main:
->  0x100003fac <+0>:                                  ; unknown opcode  <<<---- disassembly failed
    0x100003fb0 <+4>: mov    w0, #0x0 ; =0
    0x100003fb4 <+8>: ret

Additionally, I wrote a small program that disassembles these 4 bytes using the llvm-c headers. It also failed to disassemble them, just like lldb.

llvmbot commented 3 months ago

@llvm/issue-subscribers-tools-llvm-objdump

Author: YooSeok (potatopublic)

llvmbot commented 3 months ago

@llvm/issue-subscribers-lldb

DavidSpickett commented 3 months ago

Which compiler and what command line are you using to compile the example?

I suspect that .byte may be marked as data in the object file, which could cause disassemblers to ignore it. That's what I saw with gcc and its objdump.

Usually raw encodings are written with .instr.

DavidSpickett commented 3 months ago

Not .instr, .inst with no r: https://sourceware.org/binutils/docs/as/AArch64-Directives.html
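
For reference, here is a hedged sketch of the reproducer using `.inst` in place of `.byte`. The function name `emit_target_encoding` is hypothetical, and the `__aarch64__` guard is an assumption added so the file still compiles on other targets. The thread does not establish whether this changes what lldb prints.

```c
/* Emit the raw encoding from the report as an instruction word.
   .inst takes the 32-bit value directly, so no byte reordering is needed.
   The function is only meant to be disassembled, not called: the
   encoding is a store through x22, whose value is arbitrary here. */
void emit_target_encoding(void) {
#if defined(__aarch64__)
    __asm__ volatile(".inst 0x088dc6c2\n");
#endif
}
```

After compiling for AArch64, the function can be inspected with the same llvm-objdump and lldb commands used in the original report.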