systems-nuts / unifico

Compiler and build harness for heterogeneous-ISA binaries with the same stack layout.

Different regalloc for wide multiply-add instructions #292

Closed (blackgeorge-boom closed 10 months ago)

blackgeorge-boom commented 10 months ago

#include <stdio.h>

void results(char *name, int n1, int n2, int n3)
{
    long nn = n1;
    if (n2 != 0)
        nn *= n2;
    printf("%ld%d%4d\n", nn, n1, n2);
}

int main() {
    results("IS", 1, 64, 0);
    return 0;
}

make clean; make stackmaps-check -j10 OBJDUMP_FLAGS= OPT_LEVEL=-O1

 [STACKMAPS CHECK] Checking stackmaps for main_aarch64_aligned.out main_x86_64_aligned.out
make[1]: warning: jobserver unavailable: using -j1.  Add '+' to parent make rule.
make[1]: Entering directory '/home/blackgeorge/Documents/phd/unified_abi/stack-metadata'
make[1]: Nothing to be done for 'all'.
make[1]: Leaving directory '/home/blackgeorge/Documents/phd/unified_abi/stack-metadata'
WARNING: results: callsite 0, value locations 0/0 have different location type (1 vs. 3)
WARNING: results: callsite 0, value locations 0/0 have different location offset or  different constant (0 vs. -32)
WARNING: results: callsite 0, value locations 2/2 have different location type (3 vs. 1)
WARNING: results: callsite 0, value locations 2/2 have different location offset or  different constant (-32 vs. 0)
ERROR: stackmaps in 'main_aarch64_aligned.out' & 'main_x86_64_aligned.out' differ - different stack layout!
make: *** [../../common/common.mk:248: stackmaps-check] Error 1
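For reading the warnings above, here is a minimal sketch of how the numeric location types could be decoded. It assumes the checker prints the location kinds as numbered in LLVM's StackMap format (1 = Register, 3 = Indirect, i.e. a value spilled at [reg + offset]) and that the two sides of each "vs." follow the order of the binaries on the command line (AArch64 first, x86-64 second). The names `loc_type_name`, `value_loc` and `locs_match` are hypothetical and not taken from the stack-metadata tool.

```c
#include <stdio.h>

/* Location kinds as numbered in LLVM's StackMap format.
 * Assumption: the checker reports these same numbers. */
enum loc_type {
    LOC_REGISTER    = 1, /* value lives in a register */
    LOC_DIRECT      = 2, /* value is the address reg + offset */
    LOC_INDIRECT    = 3, /* value is stored in memory at [reg + offset] */
    LOC_CONSTANT    = 4, /* small constant encoded in the record */
    LOC_CONST_INDEX = 5  /* index into the constant pool */
};

/* Hypothetical helper: human-readable name for a location type. */
const char *loc_type_name(int type)
{
    switch (type) {
    case LOC_REGISTER:    return "Register";
    case LOC_DIRECT:      return "Direct";
    case LOC_INDIRECT:    return "Indirect";
    case LOC_CONSTANT:    return "Constant";
    case LOC_CONST_INDEX: return "ConstantIndex";
    default:              return "Unknown";
    }
}

/* Hypothetical record: only the two fields the warnings mention. */
struct value_loc {
    int type;
    int offset; /* offset or constant, depending on type */
};

/* Two locations agree only if both type and offset/constant agree. */
int locs_match(struct value_loc a, struct value_loc b)
{
    return a.type == b.type && a.offset == b.offset;
}

/* Print one side-by-side comparison in a readable form. */
void print_comparison(int idx, struct value_loc a, struct value_loc b)
{
    printf("value %d: %s(%d) vs. %s(%d) -> %s\n", idx,
           loc_type_name(a.type), a.offset,
           loc_type_name(b.type), b.offset,
           locs_match(a, b) ? "same" : "DIFFERENT");
}
```

Under these assumptions, value 0 at callsite 0 is in a register on AArch64 but spilled at offset -32 on x86-64, and value 2 is the other way round, so the same live values end up in different kinds of locations on the two targets.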

blackgeorge-boom commented 10 months ago

The issue is probably that we haven't yet implemented the two-address format for this kind of multiply-add instruction:

AArch64:

0B  bb.0.entry:
      liveins: $w1, $w2
16B   %2:gpr32common = COPY $w2
32B   %1:gpr32 = COPY $w1
48B   dead $wzr = SUBSWri %2:gpr32common, 0, 0, implicit-def $nzcv
64B   %5:gpr32 = CSINCWr %2:gpr32common, $wzr, 1, implicit killed $nzcv
80B   %10:gpr64 = nsw SMADDLrrr %5:gpr32, %1:gpr32, $xzr <---
...

X86:

0B  bb.0.entry:
      liveins: $esi, $edx
16B   %2:gr32 = COPY $edx
32B   %1:gr32 = COPY $esi
48B   %15:gr64 = MOVSX64rr32 %1:gr32
64B   TEST32rr %2:gr32, %2:gr32, implicit-def $eflags
80B   %6:gr32 = MOV32ri 1
96B   %6:gr32 = CMOV32rr %6:gr32(tied-def 0), %2:gr32, 5, implicit killed $eflags
112B  %8:gr64 = MOVSX64rr32 %6:gr32
124B  %16:gr64 = COPY %15:gr64
132B  MOV64mr %stack.0, 1, $noreg, 0, $noreg, %16:gr64 :: (store 8 into %stack.0)
140B  %8:gr64 = nsw IMUL64rr %8:gr64(tied-def 0), %16:gr64, implicit-def dead $eflags <---
...
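To make the difference between the two marked instructions concrete, here is a small C model of their semantics. It is only a sketch: the function names are hypothetical, and it models what the instructions compute, not what the register allocator does. `SMADDLrrr` widens two 32-bit sources and writes a fresh destination, while `IMUL64rr` needs both operands already sign-extended (the two `MOVSX64rr32`) and overwrites its first operand (`tied-def 0`).

```c
#include <stdint.h>

/* Three-address form, modelled on AArch64 SMADDL:
 * dst = sext(a) * sext(b) + acc. Both sources stay intact. */
int64_t smaddl_model(int32_t a, int32_t b, int64_t acc)
{
    return (int64_t)a * (int64_t)b + acc;
}

/* Two-address form, modelled on x86 IMUL64rr with a tied def:
 * the first operand is both a source and the destination. */
void imul64_two_addr_model(int64_t *dst_src, int64_t rhs)
{
    *dst_src *= rhs;
}

/* x86-style widening multiply: two explicit sign extensions
 * followed by a multiply that overwrites one of them. */
int64_t widen_mul_x86_style(int32_t a, int32_t b)
{
    int64_t wide_a = (int64_t)a;            /* MOVSX64rr32 */
    int64_t wide_b = (int64_t)b;            /* MOVSX64rr32 */
    imul64_two_addr_model(&wide_b, wide_a); /* IMUL64rr, wide_b overwritten */
    return wide_b;
}
```

Both forms compute the same product, but the x86 sequence keeps extra 64-bit values live before the multiply. That may be why one of them (`%16`) is stored to `%stack.0` in the x86 dump, while the AArch64 dump shows no spill in this range.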