Quuxplusone / LLVMBugzillaTest

arm asm label calculation error in sub #24345

Closed Quuxplusone closed 8 years ago

Quuxplusone commented 9 years ago
Bugzilla Link PR24346
Status RESOLVED FIXED
Importance P normal
Reported by Han Shen (shenhan@google.com)
Reported on 2015-08-03 18:54:14 -0700
Last modified on 2016-04-01 05:07:36 -0700
Version trunk
Hardware PC Linux
CC kristof.beyls@gmail.com, laszio@google.com, llozano@chromium.org, llvm-bugs@lists.llvm.org, rengolin@gmail.com
Fixed by commit(s)
Attachments
Blocks PR18926, PR24345
Blocked by
See also
The label arithmetic is computed incorrectly for the following simple ARM assembly:

.text
.syntax unified
.code 32
.type DATA,%object
.align 5
DATA:
.word 0x428a2f98,0x71374491,0xb5c0fbcf,0xe9b5dba5
.word 0x3956c25b,0x59f111f1,0x923f82a4,0xab1c5ed5
.word 0xd807aa98,0x12835b01,0x243185be,0x550c7dc3
.word 0x72be5d74,0x80deb1fe,0x9bdc06a7,0xc19bf174
.word 0xe49b69c1,0xefbe4786,0x0fc19dc6,0x240ca1cc
.word 0x2de92c6f,0x4a7484aa,0x5cb0a9dc,0x76f988da
.word 0x983e5152,0xa831c66d,0xb00327c8,0xbf597fc7
.word 0xc6e00bf3,0xd5a79147,0x06ca6351,0x14292967
.word 0x27b70a85,0x2e1b2138,0x4d2c6dfc,0x53380d13
.word 0x650a7354,0x766a0abb,0x81c2c92e,0x92722c85
.word 0xa2bfe8a1,0xa81a664b,0xc24b8b70,0xc76c51a3
.word 0xd192e819,0xd6990624,0xf40e3585,0x106aa070
.word 0x19a4c116,0x1e376c08,0x2748774c,0x34b0bcb5
.word 0x391c0cb3,0x4ed8aa4a,0x5b9cca4f,0x682e6ff3
.word 0x748f82ee,0x78a5636f,0x84c87814,0x8cc70208
.word 0x90befffa,0xa4506ceb,0xbef9a3f7,0xc67178f2
.word 0x90befffa,0xa4506ceb,0xbef9a3f7,0xc67178f2
.word 0x90befffa,0xa4506ceb,0xbef9a3f7,0xc67178f2
.word 0x90befffa,0xa4506ceb,0xbef9a3f7,0xc67178f2
.word 0x90befffa,0xa4506ceb,0xbef9a3f7,0xc67178f2
.word 0x90befffa,0xa4506ceb,0x90befffa,0xa4506ceb
.size DATA,.-DATA
.word 0 @ terminator

.global FOO
.type FOO,%function
.align 5
FOO:
.L1:
 adr r3,.L1
 sub r3, r3, #(.L1-DATA)
 @ Now r3 should point to start of DATA (0)

.size FOO,.-FOO

This is what I get from GNU as:
00000160 <FOO>:
 160:   e24f3008        sub     r3, pc, #8      ; Now r3 is 0x160
 164:   e2433e16        sub     r3, r3, #352    ; 0x160

The two subs leave 0 in r3, which is the correct start address of DATA.

Now this is what I get from clang:
00000160 <FOO>:
 160:   e24f3008        sub     r3, pc, #8      ; Now r3 is 0x160
 164:   e2433160        sub     r3, r3, #96, 2  ; After this r3 is not zero

The two subs produce a value that differs from the start address of DATA. (The
relocation section of the object file is empty.)
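
As a cross-check of the expected immediate, the offset of FOO can be recomputed from the listing alone. This is only a sketch in Python, assuming DATA sits at offset 0 of .text (as both disassemblies suggest) and that `.align 5` means alignment to 2^5 = 32 bytes; the helper name `align_up` is made up for the illustration.

```python
WORD_SIZE = 4            # bytes per .word value
DATA_WORD_LINES = 21     # number of .word lines between DATA: and .size
WORDS_PER_LINE = 4
TERMINATOR_WORDS = 1     # the trailing ".word 0 @ terminator"


def align_up(offset, power):
    """Round offset up to a multiple of 2**power (hypothetical helper)."""
    alignment = 1 << power
    return (offset + alignment - 1) & ~(alignment - 1)


# DATA is assumed to start at offset 0 of the section.
end_of_data = (DATA_WORD_LINES * WORDS_PER_LINE + TERMINATOR_WORDS) * WORD_SIZE
foo_offset = align_up(end_of_data, 5)   # .align 5 before FOO

# .L1 is at the same address as FOO, so .L1 - DATA equals foo_offset.
print(hex(end_of_data), hex(foo_offset))   # 0x154 0x160
```

So the second sub has to subtract exactly 0x160 (352), which matches the immediate in the GNU as output.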
Quuxplusone commented 9 years ago
Looks like a problem with instruction format:

352 = 0x160 = (2 >> 1) << 8 | 96 = encoding of arm operand 2 #96, 2

llvm-mc -show-inst seems to have a similar problem:

3606 = 0xe16 = (28 >> 1) << 8 | 22 = encoding of arm operand 2 #22, 28 (which
is 352)

    sub r5, r5, #352            @ <MCInst #468 SUBri
                                        @  <MCOperand Reg:71>
                                        @  <MCOperand Reg:71>
                                        @  <MCOperand Imm:3606>
                                        @  <MCOperand Imm:14>
                                        @  <MCOperand Reg:0>
                                        @  <MCOperand Reg:0>>
    sub r5, r5, #96, #2         @ <MCInst #468 SUBri
                                        @  <MCOperand Reg:71>
                                        @  <MCOperand Reg:71>
                                        @  <MCOperand Imm:352>
                                        @  <MCOperand Imm:14>
                                        @  <MCOperand Reg:0>
                                        @  <MCOperand Reg:0>>
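
To make the arithmetic above concrete, here is a small Python sketch that decodes the low 12 bits of the two sub encodings from the original report as an ARM modified immediate (an 8-bit value rotated right by twice the 4-bit rotation field). The function names are made up for this illustration and are not LLVM APIs.

```python
def ror32(value, amount):
    """Rotate a 32-bit value right by amount bits."""
    amount %= 32
    return ((value >> amount) | (value << (32 - amount))) & 0xFFFFFFFF


def split_mod_imm(imm12):
    """Split a 12-bit modified-immediate field into (imm8, rotation)."""
    imm8 = imm12 & 0xFF
    rotation = ((imm12 >> 8) & 0xF) * 2
    return imm8, rotation


def decode_mod_imm(imm12):
    """Return the 32-bit constant that the 12-bit field stands for."""
    imm8, rotation = split_mod_imm(imm12)
    return ror32(imm8, rotation)


# Second instruction as emitted by GNU as and by clang, respectively.
for insn in (0xE2433E16, 0xE2433160):
    field = insn & 0xFFF
    imm8, rotation = split_mod_imm(field)
    print(f"{insn:#010x}: #{imm8}, {rotation} -> {decode_mod_imm(field)}")
# 0xe2433e16: #22, 28 -> 352
# 0xe2433160: #96, 2 -> 24
```

In other words, the clang-built instruction subtracts 24 instead of 352, so r3 ends up as 0x160 - 24 = 0x148 rather than 0.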
Quuxplusone commented 9 years ago

Labels or expressions in the last operand of sub instructions are matched as ModImm operands. Such an operand gets an FK_Data_4 fixup, which patches the 12 least significant bits of the instruction directly once the label offsets or expression values have been resolved.

Quuxplusone commented 8 years ago

getModImmOpValue() emits FK_Data_4, which patches the lowest 12 bits of the instruction with the value bit for bit. This is incompatible with the modified-immediate encoding. The proposed solution is to add a new fixup kind, fixup_arm_mod_imm, which takes care of the encoding.

http://reviews.llvm.org/D15442
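
For readers who do not want to open the review, the difference between the two patching strategies can be modelled in a few lines of Python. This is a sketch of the idea only, not the code from D15442: `patch_raw`, `patch_mod_imm` and `encode_mod_imm` are hypothetical names, and the encoder assumes the lowest usable rotation is chosen, which is what the GNU as output in the report shows.

```python
def rol32(value, amount):
    """Rotate a 32-bit value left by amount bits."""
    amount %= 32
    return ((value << amount) | (value >> (32 - amount))) & 0xFFFFFFFF


def encode_mod_imm(value):
    """Return the 12-bit modified-immediate field for value, or None."""
    for rot_field in range(16):
        # value == ror(imm8, 2 * rot_field)  <=>  imm8 == rol(value, 2 * rot_field)
        imm8 = rol32(value, 2 * rot_field)
        if imm8 <= 0xFF:
            return (rot_field << 8) | imm8
    return None  # not representable as an 8-bit value with an even rotation


def patch_raw(insn, value):
    """Model of the buggy behaviour: copy the value into bits 11:0 as-is."""
    return insn | (value & 0xFFF)


def patch_mod_imm(insn, value):
    """Model of the proposed behaviour: encode first, then patch."""
    field = encode_mod_imm(value)
    if field is None:
        raise ValueError(f"{value:#x} is not a valid modified immediate")
    return insn | field


SUB_R3_R3 = 0xE2433000   # sub r3, r3, #0 with an empty immediate field
OFFSET = 0x160           # .L1 - DATA from the report

print(f"{patch_raw(SUB_R3_R3, OFFSET):#010x}")      # 0xe2433160
print(f"{patch_mod_imm(SUB_R3_R3, OFFSET):#010x}")  # 0xe2433e16
```

The first result is the encoding clang produced in the report, the second is the one GNU as produced.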

Quuxplusone commented 8 years ago

Verified the patch in comment 3. With the patch, we are finally able to bring up ARM ChromeOS built with clang.