[RFC][Sleigh] Add base implementation for MIPS (WIP)

m4xw commented 5 months ago

As discussed on Slack, here is a quick PR with my current WIP state for MIPS support (specifically N64 / VR4300, with PS1 later on). We will need to figure out how to split the mips32/64 big/little-endian variants, as well as mips64 with a 32-bit address bus in both flavours, in a sane manner. Right now I just treat the mips64_32_be target as a generic mips target in remill; this needs to be reworked.

Some more register stuff needs to be reworked too, but in my testing it already works quite nicely. I hope I didn't overlook anything when I squashed some PPC-related changes out.

I'm running low on time now, but I can add more info regarding this PR later on or as requested.

I also noticed that MaxInstructionSize needs to permit the idiom; I think it merges the delay slot together with the branch? All MIPS opcodes are expected to be u32, so it would be nice to get that confirmed as the reason.
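To make the size question concrete, here is a minimal Python sketch. It assumes the hypothesis above is right: every opcode is one 32-bit word, and a branch decoded together with its delay slot spans two words. The helper name `decoded_span` is hypothetical and is not part of remill or Sleigh.

```python
MIPS_WORD_BYTES = 4  # classic MIPS32/MIPS64 opcodes are fixed-width 32-bit words


def decoded_span(has_delay_slot, slots=1):
    """Bytes covered when a branch is decoded together with its delay slot(s).

    Assumption (not confirmed in this thread): the decoder reports the
    branch and its delay slot as a single unit.
    """
    extra = slots if has_delay_slot else 0
    return MIPS_WORD_BYTES * (1 + extra)


print(decoded_span(False))  # a plain instruction
print(decoded_span(True))   # a branch plus one delay slot
```

If that assumption holds, a maximum instruction size of 4 would be too small for any branch, which would match the observation above.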

Requesting Review from @2over12

CLAassistant commented 5 months ago

All committers have signed the CLA.

m4xw commented 5 months ago

Example use:

remill-lift-17 --arch mips --address 0x801C1228 --logtostderr --ir_out /dev/stdout --bytes 27bdffe8afbf001490820000008038252405fffe1040000924060001240100011041000a2405fffe240100021041000c000000001000000d8fbf00140c070aec8ce40004100000098fbf00148ce400040c070aec00003025100000048fbf00140c070ff08ce400048fbf001427bd001803e00008000000003c02802b8c42b078
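For readers checking the `--bytes` argument by hand, the following Python sketch splits the first two words of the string above into big-endian 32-bit opcodes and extracts the I-type fields. The helper names are hypothetical; only the address and the byte prefix come from the command line above.

```python
def be_words(hex_bytes):
    """Split a hex string into big-endian 32-bit words."""
    raw = bytes.fromhex(hex_bytes)
    if len(raw) % 4:
        raise ValueError("MIPS code must be a multiple of 4 bytes")
    return [int.from_bytes(raw[i:i + 4], "big") for i in range(0, len(raw), 4)]


def i_type(word):
    """Return (opcode, rs, rt, signed imm16) for an I-type encoding."""
    opcode = word >> 26
    rs = (word >> 21) & 0x1F
    rt = (word >> 16) & 0x1F
    imm = word & 0xFFFF
    if imm & 0x8000:  # sign-extend the 16-bit immediate
        imm -= 0x10000
    return opcode, rs, rt, imm


BASE = 0x801C1228            # value passed to --address
PREFIX = "27bdffe8afbf0014"  # first 8 bytes passed to --bytes

for index, word in enumerate(be_words(PREFIX)):
    print(hex(BASE + 4 * index), i_type(word))
```

The first word decodes to opcode 9 (addiu) with rs = rt = 29 ($sp) and immediate -24; the second decodes to opcode 43 (sw) with base register 29 ($sp), rt = 31 ($ra), and offset 20.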

Not sure why it still creates these in this case: _sub_801c3fc0_2 sub_801c2bb0_2

Specifically, 2 calls to sub_801c2bb0 will be turned into one being sub_801c2bb0 and the other being sub_801c2bb0_1; in my testing I manually worked around that.

m4xw commented 4 months ago

Resolved the reference issue in the last commit

I am not sure if this change will be required as well; I don't seem to hit that code path yet.

m4xw commented 4 months ago

What are the minimum changes required by you (upstream) to accept the merge? Do you want me to move all the arch name shenanigans to a generic mips64be target until they eventually get split up further? Also, I still need feedback on the change introduced in https://github.com/lifting-bits/remill/pull/698/commits/67911a504f27fa3214ff817cfbd01a0e5eb25172 and on whether it may cause issues for some of your tools (and isn't the same change needed in the Sleigh lifter as well?).

@2over12 still awaiting your review too

m4xw commented 4 months ago

Good enough to be undrafted IMO, just need to rewrite the arch name stuff to mips64_32 for now (if there are no other suggestions)

m4xw commented 4 months ago

Demo Outputs:

Input:

.boot:801C1228 sub_801C1228:                            # CODE XREF: sub_801C1184+54↑p
.boot:801C1228                 addiu   $sp, -0x18
.boot:801C122C                 sw      $ra, 0x14($sp)
.boot:801C1230                 lbu     $v0, 0($a0)
.boot:801C1234                 move    $a3, $a0
.boot:801C1238                 li      $a1, 0xFFFFFFFE
.boot:801C123C                 beqz    $v0, loc_801C1264
.boot:801C1240                 li      $a2, 1
.boot:801C1244                 li      $at, 1
.boot:801C1248                 beq     $v0, $at, loc_801C1274
.boot:801C124C                 li      $a1, 0xFFFFFFFE
.boot:801C1250                 li      $at, 2
.boot:801C1254                 beq     $v0, $at, loc_801C1288
.boot:801C1258                 nop
.boot:801C125C                 b       loc_801C1294
.boot:801C1260                 lw      $ra, 0x14($sp)
.boot:801C1264  # ---------------------------------------------------------------------------
.boot:801C1264
.boot:801C1264 loc_801C1264:                            # CODE XREF: sub_801C1228+14↑j
.boot:801C1264                 jal     sub_801C2BB0
.boot:801C1268                 lw      $a0, 4($a3)
.boot:801C126C                 b       loc_801C1294
.boot:801C1270                 lw      $ra, 0x14($sp)
.boot:801C1274  # ---------------------------------------------------------------------------
.boot:801C1274
.boot:801C1274 loc_801C1274:                            # CODE XREF: sub_801C1228+20↑j
.boot:801C1274                 lw      $a0, 4($a3)
.boot:801C1278                 jal     sub_801C2BB0
.boot:801C127C                 move    $a2, $zero
.boot:801C1280                 b       loc_801C1294
.boot:801C1284                 lw      $ra, 0x14($sp)
.boot:801C1288  # ---------------------------------------------------------------------------
.boot:801C1288
.boot:801C1288 loc_801C1288:                            # CODE XREF: sub_801C1228+2C↑j
.boot:801C1288                 jal     sub_801C3FC0
.boot:801C128C                 lw      $a0, 4($a3)
.boot:801C1290                 lw      $ra, 0x14($sp)
.boot:801C1294
.boot:801C1294 loc_801C1294:                            # CODE XREF: sub_801C1228+34↑j
.boot:801C1294                                          # sub_801C1228+44↑j ...
.boot:801C1294                 addiu   $sp, 0x18
.boot:801C1298                 jr      $ra
.boot:801C129C                 nop

IDA:

int __fastcall sub_801C1228(unsigned __int8 *a1)
{
  int result; // $v0

  result = *a1;
  if ( !*a1 )
    return sub_801C2BB0(*((_DWORD *)a1 + 1), -2, 1);
  if ( result == 1 )
    return sub_801C2BB0(*((_DWORD *)a1 + 1), -2, 0);
  if ( result == 2 )
    return sub_801C3FC0(*((_DWORD *)a1 + 1));
  return result;
}

Lifted with remill, then decompiled:

__int64 __fastcall sub_801c1228(MIPSState *a1, int a2, __int64 a3)
{
  int v3; // eax
  int v4; // eax
  int v5; // eax
  Reg *p_pc; // [rsp+8h] [rbp-1C0h]
  int v8; // [rsp+18h] [rbp-1B0h]
  int v9; // [rsp+18h] [rbp-1B0h]
  int v10; // [rsp+18h] [rbp-1B0h]
  int v11; // [rsp+18h] [rbp-1B0h]
  int v12; // [rsp+18h] [rbp-1B0h]
  int v13; // [rsp+18h] [rbp-1B0h]
  int v14; // [rsp+18h] [rbp-1B0h]
  __int64 v15; // [rsp+94h] [rbp-134h]
  bool v16; // [rsp+9Fh] [rbp-129h]
  bool v17; // [rsp+ABh] [rbp-11Dh]

  p_pc = &a1->gpr.pc;
  a1->gpr.pc.dword = a2 + 4;
  a1->gpr.sp.dword -= 24;
  a1->gpr.sp.qword = (int)a1->gpr.sp.dword;
  a1->gpr.pc.dword = a2 + 8;
  v15 = _remill_write_memory_32(a3, a1->gpr.sp.dword + 20, a1->gpr.ra.dword);
  a1->gpr.pc.dword = a2 + 12;
  a1->gpr.v0.qword = (unsigned __int8)_remill_read_memory_8(v15, (unsigned int)a1->gpr.a0.qword);
  a1->gpr.pc.dword = a2 + 16;
  a1->gpr.a3.qword = a1->gpr.a0.qword;
  a1->gpr.pc.dword = a2 + 20;
  a1->gpr.a1.qword = -2LL;
  a1->gpr.pc.dword = a2 + 24;
  v16 = a1->gpr.v0.qword == 0;
  a1->gpr.a2.qword = 1LL;
  v3 = 0x801C1264;
  if ( !v16 )
    v3 = a2 + 28;
  v8 = v3;
  if ( v16 )
  {
    p_pc->dword = v3 + 4;
    a1->gpr.ra.qword = 0x801C126CLL;
    a1->gpr.a0.qword = (int)_remill_read_memory_32(v15, a1->gpr.a3.dword + 4);
    p_pc->dword = 0x801C2BB0;
    sub_801c2bb0(a1, p_pc->dword, v15);
    p_pc->dword = v8 + 12;
    a1->gpr.ra.qword = (int)_remill_read_memory_32(v15, a1->gpr.sp.dword + 20);
    v14 = 0x801C1294;
  }
  else
  {
    p_pc->dword = v3 + 4;
    a1->gpr.at.qword = 1LL;
    p_pc->dword = v3 + 8;
    v9 = v3 + 12;
    v17 = a1->gpr.v0.qword == a1->gpr.at.qword;
    a1->gpr.a1.qword = -2LL;
    v4 = 0x801C1274;
    if ( !v17 )
      v4 = v9;
    if ( v17 )
    {
      p_pc->dword = v4 + 4;
      v10 = v4 + 4;
      a1->gpr.a0.qword = (int)_remill_read_memory_32(v15, a1->gpr.a3.dword + 4);
      p_pc->dword = v10 + 4;
      a1->gpr.ra.qword = 0x801C1280LL;
      a1->gpr.a2.qword = 0LL;
      p_pc->dword = 0x801C2BB0;
      sub_801c2bb0(a1, p_pc->dword, v15);
      p_pc->dword = v10 + 12;
      a1->gpr.ra.qword = (int)_remill_read_memory_32(v15, a1->gpr.sp.dword + 20);
      v14 = 0x801C1294;
    }
    else
    {
      p_pc->dword = v4 + 4;
      a1->gpr.at.qword = 2LL;
      p_pc->dword = v4 + 8;
      v11 = v4 + 12;
      v5 = 0x801C1288;
      if ( a1->gpr.v0.qword != a1->gpr.at.qword )
        v5 = v11;
      v12 = v5;
      if ( a1->gpr.v0.qword == a1->gpr.at.qword )
      {
        p_pc->dword = v5 + 4;
        a1->gpr.ra.qword = 0x801C1290LL;
        a1->gpr.a0.qword = (int)_remill_read_memory_32(v15, a1->gpr.a3.dword + 4);
        p_pc->dword = 0x801C3FC0;
        sub_801c3fc0((__int64)a1, p_pc->dword, v15);
        v13 = v12 + 8;
        p_pc->dword = v13 + 4;
        v14 = v13 + 4;
        a1->gpr.ra.qword = (int)_remill_read_memory_32(v15, a1->gpr.sp.dword + 20);
      }
      else
      {
        p_pc->dword = v5 + 4;
        a1->gpr.ra.qword = (int)_remill_read_memory_32(v15, a1->gpr.sp.dword + 20);
        v14 = 0x801C1294;
      }
    }
  }
  p_pc->dword = v14 + 4;
  a1->gpr.sp.dword += 24;
  a1->gpr.sp.qword = (int)a1->gpr.sp.dword;
  p_pc->dword = v14 + 8;
  LOBYTE(a1->flags.ISAModeSwitch.dword) = (a1->gpr.ra.qword & 1) != 0;
  a1->gpr.pc.qword = a1->gpr.ra.qword & 0xFFFFFFFFFFFFFFFELL;
  p_pc->dword = a1->gpr.pc.qword;
  return _remill_function_return(a1, p_pc->dword, v15);
}
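Two patterns recur in the lifted output above: 32-bit results are sign-extended into the 64-bit register (`a1->gpr.sp.qword = (int)a1->gpr.sp.dword`), and the `jr $ra` at the end splits the low bit of `$ra` off into `ISAModeSwitch` before masking it out of the new pc. The Python sketch below models both. The function names and the sample stack pointer are hypothetical; only the masks and the return address `0x801C126C` come from the output above.

```python
MASK64 = (1 << 64) - 1


def sext32_to_u64(value):
    """Sign-extend the low 32 bits of value into a 64-bit register image."""
    value &= 0xFFFFFFFF
    if value & 0x80000000:
        value -= 1 << 32
    return value & MASK64


def jr_target(ra):
    """Model the lifted jr: low bit selects the ISA mode, the rest is the pc."""
    isa_mode_switch = (ra & 1) != 0
    pc = ra & 0xFFFFFFFFFFFFFFFE
    return isa_mode_switch, pc


# Hypothetical stack pointer in the same address range; addiu $sp, -0x18.
print(hex(sext32_to_u64(0x801F0000 - 24)))
print(jr_target(0x801C126C))
```

Because addresses here have bit 31 set, the sign extension fills the upper 32 bits with ones, which is why the 64-bit register view differs from the 32-bit `dword` view.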

m4xw commented 4 months ago

Made some small changes to pass arch down to pcodecfg etc. and implemented the workaround, for now, for the discussed branch-likely issue; an issue documenting it will follow. I also pushed the Lift.cpp changes for now. I guess it's up to you guys whether you want to keep that change in the lifter demo.

pgoodman commented 3 months ago

@m4xw can you provide an example binary that manifests the issue described here: https://github.com/lifting-bits/remill/blob/bbb3fa58017f4aae628619b4153d2b7ed612d94d/lib/BC/PcodeCFG.cpp#L125

Also, it would help if you had a screenshot from ghidra showing the relevant assembly + pcode.

m4xw commented 3 months ago

Actually found some time to provide the info now before going home..

Ghidra PCode:

It affects all "likely" opcodes in MIPS:

:beql RSsrc, RTsrc, Rel16       is $(AMODE) & REL6=0 & prime=0x14 & RSsrc & RTsrc & Rel16 {
    if (!(RSsrc==RTsrc)) goto inst_next; 
    delayslot(1); 
    goto Rel16; 
}

Here is a normal op as ref:

:beq RSsrc, RTsrc, Rel16        is $(AMODE) & prime=4 & RSsrc & RTsrc & Rel16 {
    delayflag:1 = ( RSsrc == RTsrc ); 
    delayslot( 1 ); 
    if delayflag goto Rel16; 
}
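The difference between the two Sleigh definitions above can be sketched in Python as the list of addresses executed after the branch. This is a simplified model, not remill's or Ghidra's implementation: it assumes `inst_next` is the address after the delay slot (pc + 8), and the function name is hypothetical.

```python
def after_branch(pc, target, taken, likely):
    """Addresses executed right after a branch at pc.

    Normal branch (beq): the delay slot always runs, then control goes to
    the target or falls through. Likely branch (beql): the delay slot runs
    only when the branch is taken; otherwise it is skipped entirely.
    """
    delay_slot = pc + 4
    inst_next = pc + 8  # assumption: inst_next already skips the delay slot
    if likely and not taken:
        return [inst_next]
    return [delay_slot, target if taken else inst_next]


PC, TARGET = 0x80011764, 0x80011780  # the beql from the sample posted in this thread
for likely in (False, True):
    for taken in (False, True):
        path = after_branch(PC, TARGET, taken, likely)
        print(likely, taken, [hex(a) for a in path])
```

The only case where the two forms differ is the not-taken path, which is exactly the early `goto inst_next` placed before `delayslot(1)` in the `beql` definition.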

"0x80011700.hex"

10C000250000000030A500FF0005120000A228250005140000A22825308200011040000400803821A08500002487000124C6FFFF2CC20002144000080006188230E20002104000062463FFFFA4E5000024E7000224C6FFFE000618822463FFFF2402FFFF5062000630C20002ACE500002463FFFF1462FFFD24E7000430C200021040000330C20001A4E5000024E7000254400001A0E5000003E0000800801021
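To locate the affected instructions in this sample, the Python sketch below scans the hex dump for the four primary-opcode "likely" branches (beql, bnel, blezl, bgtzl) and computes their targets. It deliberately ignores the REGIMM likely variants, and the function name is hypothetical. IDA prints `bnel` against `$zero` as the pseudo-instruction `bnezl`.

```python
LIKELY = {0x14: "beql", 0x15: "bnel", 0x16: "blezl", 0x17: "bgtzl"}

HEX = (
    "10C000250000000030A500FF0005120000A228250005140000A2282530820001"
    "1040000400803821A08500002487000124C6FFFF2CC200021440000800061882"
    "30E20002104000062463FFFFA4E5000024E7000224C6FFFE000618822463FFFF"
    "2402FFFF5062000630C20002ACE500002463FFFF1462FFFD24E7000430C20002"
    "1040000330C20001A4E5000024E7000254400001A0E5000003E0000800801021"
)


def find_likely(hex_bytes, base):
    """Return (address, mnemonic, target) for each likely branch found."""
    raw = bytes.fromhex(hex_bytes)
    hits = []
    for off in range(0, len(raw) - 3, 4):
        word = int.from_bytes(raw[off:off + 4], "big")
        name = LIKELY.get(word >> 26)
        if name is None:
            continue
        imm = word & 0xFFFF
        if imm & 0x8000:  # sign-extend the 16-bit word offset
            imm -= 0x10000
        pc = base + off
        # Target is relative to the delay slot address (pc + 4).
        hits.append((pc, name, (pc + 4 + (imm << 2)) & 0xFFFFFFFF))
    return hits


for pc, name, target in find_likely(HEX, 0x80011700):
    print(hex(pc), name, hex(target))
```

For this dump the scan reports two hits: `beql` at 0x80011764 targeting 0x80011780, and `bnel` at 0x80011790 targeting 0x80011798.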

Reference usage:

mkdir -p hex/lifted/obj
rm -f hex/lifted/*.ll
rm -f hex/lifted/obj/*.o

for file in hex/0x80011700.hex; do
    # The file name (without .hex) doubles as the load address.
    address=$(basename "$file" .hex)
    echo "Processing $address"
    remill-lift-17 --arch mips --address "$address" --logtostderr --ir_out "hex/lifted/$address.ll" --bytes "$(cat "$file")" 2>> log
done

# Call llc on all files in hex/lifted and output them to hex/lifted/obj
for file in hex/lifted/*.ll; do
    address=$(basename "$file" .ll)
    echo "Compiling $address"
    llc-17 -O0 -march=x86-64 -debugger-tune=lldb -filetype=obj -o "hex/lifted/obj/$address.o" "$file"
done

Here is what it looks like without the hack:

Here is what it looks like with the hack:

m4xw commented 3 months ago

Also IDA disas:

CODE:80011700  # _BYTE *__fastcall sub_80011700(_BYTE *, unsigned __int8, unsigned int)
CODE:80011700 sub_80011700:                            # CODE XREF: sub_800004CC+64↑p
CODE:80011700                                          # sub_80000EDC+48↑p ...
CODE:80011700                 beqz    $a2, locret_80011798
CODE:80011704                 nop
CODE:80011708                 andi    $a1, 0xFF
CODE:8001170C                 sll     $v0, $a1, 8
CODE:80011710                 or      $a1, $v0
CODE:80011714                 sll     $v0, $a1, 16
CODE:80011718                 or      $a1, $v0
CODE:8001171C                 andi    $v0, $a0, 1
CODE:80011720                 beqz    $v0, loc_80011734
CODE:80011724                 move    $a3, $a0
CODE:80011728                 sb      $a1, 0($a0)
CODE:8001172C                 addiu   $a3, $a0, 1
CODE:80011730                 addiu   $a2, -1
CODE:80011734
CODE:80011734 loc_80011734:                            # CODE XREF: sub_80011700+20↑j
CODE:80011734                 sltiu   $v0, $a2, 2
CODE:80011738                 bnez    $v0, loc_8001175C
CODE:8001173C                 srl     $v1, $a2, 2
CODE:80011740                 andi    $v0, $a3, 2
CODE:80011744                 beqz    $v0, loc_80011760
CODE:80011748                 addiu   $v1, -1
CODE:8001174C                 sh      $a1, 0($a3)
CODE:80011750                 addiu   $a3, 2
CODE:80011754                 addiu   $a2, -2
CODE:80011758                 srl     $v1, $a2, 2
CODE:8001175C
CODE:8001175C loc_8001175C:                            # CODE XREF: sub_80011700+38↑j
CODE:8001175C                 addiu   $v1, -1
CODE:80011760
CODE:80011760 loc_80011760:                            # CODE XREF: sub_80011700+44↑j
CODE:80011760                 li      $v0, 0xFFFFFFFF
CODE:80011764                 beql    $v1, $v0, loc_80011780
CODE:80011768                 andi    $v0, $a2, 2
CODE:8001176C
CODE:8001176C loc_8001176C:                            # CODE XREF: sub_80011700+74↓j
CODE:8001176C                 sw      $a1, 0($a3)
CODE:80011770                 addiu   $v1, -1
CODE:80011774                 bne     $v1, $v0, loc_8001176C
CODE:80011778                 addiu   $a3, 4
CODE:8001177C                 andi    $v0, $a2, 2
CODE:80011780
CODE:80011780 loc_80011780:                            # CODE XREF: sub_80011700+64↑j
CODE:80011780                 beqz    $v0, loc_80011790
CODE:80011784                 andi    $v0, $a2, 1
CODE:80011788                 sh      $a1, 0($a3)
CODE:8001178C                 addiu   $a3, 2
CODE:80011790
CODE:80011790 loc_80011790:                            # CODE XREF: sub_80011700:loc_80011780↑j
CODE:80011790                 bnezl   $v0, locret_80011798
CODE:80011794                 sb      $a1, 0($a3)
CODE:80011798
CODE:80011798 locret_80011798:                         # CODE XREF: sub_80011700↑j
CODE:80011798                                          # sub_80011700:loc_80011790↑j
CODE:80011798                 jr      $ra
CODE:8001179C                 move    $v0, $a0
CODE:8001179C  # End of function sub_80011700
