`INC32r`/`DEC32r` are converted to `LEA64_32r` instead of `LEA32r`

systems-nuts / unifico

Compiler and build harness for heterogeneous-ISA binaries with the same stack layout.


void swap(int v[], int k) {
    int temp;
    temp = v[k];
    v[k] = v[k + 1];
    v[k + 1] = temp;
}

void sort(int v[], int n) {
    int i, j;
    for (i = 0; i < n; i += 1) {
        j = i - 1; // <---
        swap(v, j);
    }
}

int main() {
    int v[5] = {4, 1, 3, 2, 1};
    sort(v, 5);
    return 0;
}

0000000000501050 <sort>:
...
  50105b: 85 f6                 test   esi,esi
  50105d: 0f 8e 36 00 00 00     jle    501099 <sort+0x49>
  501063: 48 89 fb              mov    rbx,rdi
  501066: 31 c0                 xor    eax,eax                     <---
  501068: 89 75 dc              mov    DWORD PTR [rbp-0x24],esi
  50106b: 48 89 45 e0           mov    QWORD PTR [rbp-0x20],rax    <---
  50106f: 44 8d 78 ff           lea    r15d,[rax-0x1]              <---
...

0000000000501050 sort:
...
  501060: 3f 04 00 71     cmp    w1, #0x1
  501064: ab 01 00 54     b.lt   #0x34 <sort+0x48>
  501068: f3 03 00 aa     mov    x19, x0
  50106c: e8 03 1f 2a     mov    w8, wzr             <---
  501070: e1 03 00 b9     str    w1, [sp]
  501074: 14 05 00 51     sub    w20, w8, #0x1       <---
  501078: e0 03 13 aa     mov    x0, x19
  50107c: e1 03 14 2a     mov    w1, w20
  501080: e8 07 00 b9     str    w8, [sp, #0x4]      <---
...

This is how `INC32r`/`DEC32r` are converted into LEA instructions in `X86InstrInfo::convertToThreeAddress`:

case X86::INC32r: {
...
    unsigned Opc = MIOpc == X86::INC64r ? X86::LEA64r :
        (Is64Bit ? X86::LEA64_32r : X86::LEA32r);

On 64-bit subtargets, the conversion goes directly to `LEA64_32r` instead of `LEA32r`. `LEA64_32r` takes 64-bit registers as source operands, even though the original `INC32r` only needs 32 bits. Register usage is essentially the same, but if these registers are spilled, `LEA64_32r` leads to 8-byte spills where `LEA32r` would lead to 4-byte spills. The listings above show the effect: x86-64 stores `rax` with a `QWORD PTR` (8-byte) store, while AArch64 stores `w8` with a 4-byte `str`.

`LEA32r` is avoided because, in 64-bit mode, using 32-bit registers in the address operand requires the extra `0x67` address-size override prefix to encode the instruction: https://stackoverflow.com/questions/59153772/address-size-override-prefix-in-64-bit-or-using-64-bit-registers.
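To make the encoding cost concrete, here is a minimal C++ sketch. The 4-byte encoding is copied from the x86-64 listing above. The 5-byte variant assumes the same instruction with a 32-bit base register (`lea r15d,[eax-0x1]`). The function names are made up for illustration and are not part of LLVM.

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

// Address-size override prefix: in 64-bit mode it switches the address size
// of a single instruction from 64 bits to 32 bits.
constexpr std::uint8_t kAddrSizeOverride = 0x67;

// `lea r15d,[rax-0x1]` as it appears in the listing:
// REX.R (0x44), opcode 0x8d, ModRM 0x78 (disp8, reg=r15, rm=rax), disp8 0xff.
inline std::vector<std::uint8_t> leaWith64BitBase() {
  return {0x44, 0x8d, 0x78, 0xff};
}

// Assumed encoding of `lea r15d,[eax-0x1]`: the same bytes with the 0x67
// prefix in front. Legacy prefixes are placed before the REX prefix.
inline std::vector<std::uint8_t> leaWith32BitBase() {
  std::vector<std::uint8_t> Bytes = leaWith64BitBase();
  Bytes.insert(Bytes.begin(), kAddrSizeOverride);
  return Bytes;
}

// Small self-check: the 32-bit-address form is exactly one byte longer.
inline void checkLeaEncodings() {
  assert(leaWith64BitBase().size() == 4);
  assert(leaWith32BitBase().size() == 5);
  assert(leaWith32BitBase().front() == kAddrSizeOverride);
}
```

So the trade-off is one extra code byte per such `lea` against a smaller spill slot.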

For now, we can enforce the use of LEA32r instead.
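As a rough illustration of what enforcing `LEA32r` could mean, the sketch below mirrors the ternary quoted above and adds a switch. It is not the actual patch: the `Opcode` enum is a stand-in for the LLVM enumerators, and `ForceLEA32r` and `spillBytes` are hypothetical names introduced only for this example. A real change in `convertToThreeAddress` may involve more than the opcode choice (for example, operand register classes), which this sketch does not model.

```cpp
#include <cassert>

// Hypothetical stand-ins for the X86:: opcode enumerators; this sketch does
// not include or link against LLVM.
enum class Opcode { INC32r, INC64r, LEA32r, LEA64r, LEA64_32r };

// Same selection as the quoted ternary, plus a hypothetical `ForceLEA32r`
// switch (not an existing LLVM option) that keeps 32-bit INC on LEA32r.
constexpr Opcode selectLeaOpcode(Opcode MIOpc, bool Is64Bit, bool ForceLEA32r) {
  return MIOpc == Opcode::INC64r           ? Opcode::LEA64r
         : (Is64Bit && !ForceLEA32r)       ? Opcode::LEA64_32r
                                           : Opcode::LEA32r;
}

// Spill size implied by the source register width of each LEA form, as
// described in the issue. Only meaningful for the three LEA opcodes.
constexpr unsigned spillBytes(Opcode Lea) {
  return Lea == Opcode::LEA32r ? 4u : 8u;
}

// Small self-check of the selection on a 64-bit subtarget.
inline void checkSelection() {
  // Current behaviour: INC32r becomes LEA64_32r, giving 8-byte spills.
  assert(selectLeaOpcode(Opcode::INC32r, true, false) == Opcode::LEA64_32r);
  assert(spillBytes(Opcode::LEA64_32r) == 8);
  // With the hypothetical switch: INC32r becomes LEA32r, giving 4-byte spills.
  assert(selectLeaOpcode(Opcode::INC32r, true, true) == Opcode::LEA32r);
  assert(spillBytes(Opcode::LEA32r) == 4);
  // INC64r is unaffected either way.
  assert(selectLeaOpcode(Opcode::INC64r, true, true) == Opcode::LEA64r);
}
```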


`INC32r`/`DEC32r` are converted to `LEA64_32r` instead of `LEA32r` #259