systems-nuts / unifico

Compiler and build harness for heterogeneous-ISA binaries with the same stack layout.

Different lowering of `select` to `X86::CMOV` and two-address format of `AArch64::CSEL` #305

Closed by blackgeorge-boom 10 months ago

blackgeorge-boom commented 10 months ago
#define NA 75000

static int naa;
static int nzz;

int main()
{
  lastrow  = NA-1;
  lastcol  = NA-1;

  printf(" Size: %11d\n", NA);

  naa = NA;
  nzz = NZ;
  randlc(&tran, amult);

  makea(naa, nzz, a, colidx, rowstr,
        0, lastrow, 0, lastcol,
        arow,
        (int (*)[NONZER+1])(void*)acol,
        (double (*)[NONZER+1])(void*)aelt,
        iv);

  return 0;
}

make clean; make stackmaps-check -j10 OBJDUMP_FLAGS= OPT_LEVEL=-O1 TARGET_FUNC=main

WARNING: main: callsite 0 has different number of architecture specific live locations (1 vs 2)
WARNING: main: callsite 1 has different number of architecture specific live locations (0 vs 1)
WARNING: main: callsite 2, value locations 0/0 have different location type (1 vs. 3)
WARNING: main: callsite 2, value locations 0/0 have different location offset or  different constant (0 vs. -28)

blackgeorge-boom commented 10 months ago

There are two issues here. First, the AArch64 `CSEL` needs to be converted into a two-address format. Second, X86 lowers the `select` IR instruction with a different operand order:

IR

  t53: i32 = add OpaqueConstant:i32<75000>, Constant:i32<-1>, main.c:101:12
  t54: i32 = select t89, t53, Constant:i32<0>, main.c:101:12

AArch64

  %11:gpr32 = COPY $wzr
...
  %16:gpr32 = CSELWr %14:gpr32(tied-def 0), %11:gpr32, 1, implicit $nzcv, debug-location !76; main.c:101:12

vs

X86

  %2:gr32temp = MOV32r0 implicit-def dead $eflags
...
  %11:gr32 = CMOV32rr %2:gr32temp(tied-def 0), killed %10:gr32, 5, implicit $eflags, debug-location !76; main.c:101:12

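For some reason, X86 swaps the two value operands of `select` during lowering: the zero constant (`%2`) becomes the tied first operand of `CMOV32rr`, whereas on AArch64 the zero (`%11`) is the second operand of `CSELWr`.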

blackgeorge-boom commented 10 months ago

From X86ISelLowering.cpp:

SDValue Ops[] = { Op2, Op1, CC, Cond }; // <--- Op2 is passed before Op1
return DAG.getNode(X86ISD::CMOV, DL, Op.getValueType(), Ops);

Not sure why the order is inverted.
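
A possible explanation, sketched in C under the usual semantics of the two instructions (this is not taken from the Unifico or LLVM sources): `CMOVcc` only overwrites its destination when the condition holds, so in two-address form the tied operand has to carry the false value, while `CSEL` returns its first source when the condition holds. The function names below (`csel`, `cmov`, `select_ir`, `lower_select_aarch64`, `lower_select_x86`) are hypothetical models for illustration, not LLVM APIs.

```c
#include <assert.h>
#include <stdint.h>

/* Model of AArch64 CSEL Wd, Wn, Wm, cond: Wd = cond ? Wn : Wm. */
int32_t csel(int32_t rn, int32_t rm, int cond)
{
    return cond ? rn : rm;
}

/* Model of X86 CMOVcc in two-address form: the destination starts out
 * holding the tied operand and is overwritten only if cond holds. */
int32_t cmov(int32_t tied, int32_t src, int cond)
{
    int32_t dst = tied;
    if (cond)
        dst = src;
    return dst;
}

/* Reference semantics of the IR instruction: select cond, t, f. */
int32_t select_ir(int cond, int32_t t, int32_t f)
{
    return cond ? t : f;
}

/* Hypothetical AArch64 lowering: the true value is the first operand. */
int32_t lower_select_aarch64(int cond, int32_t t, int32_t f)
{
    return csel(t, f, cond);
}

/* Hypothetical X86 lowering: the false value goes first (tied), which
 * mirrors the { Op2, Op1, CC, Cond } order quoted above. */
int32_t lower_select_x86(int cond, int32_t t, int32_t f)
{
    return cmov(f, t, cond);
}
```

Under this model both lowerings compute the same result as `select_ir`; what differs is which value ends up in the tied register. That would be consistent with the zero constant being the tied operand in the `CMOV32rr` dump and the second operand in the `CSELWr` dump.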