Sleigh Pcode: Read before write in instruction bundle

leyondlee commented 2 years ago

Hi,

I am in the progress of adding TileGx architecture to Ghidra. With reference to #4390

The processor specification states that: (UG401-ISA.pdf Section 1.2)

Because there is explicitly no implied dependency within a bundle, the semantics for this specify that the input operands for all instructions in a bundle are read before any of the output operands are written.

For example:

move r29, sp; addxi r28, r29, 0xf8; st r28, r29

Should roughly translate to:

# Assume r28 = 0x50, r29 = 0x10, sp = 0x20
r29 = sp;                 # r29 = 0x20
r28 = r29 + sext(0xf8);   # r28 = 0x10 <old r29 value> + sext(0xf8)
*[ram] r28 = r29;         # *[0x50 <old r28 value>] = 0x10 <old r29 value>

My current code:

macros.sinc

@define ENDIAN "little"
@define REGSIZE "8"
@define ADDRSIZE "8"

define endian=$(ENDIAN);
define alignment=8;

define space ram type=ram_space size=$(ADDRSIZE) default;
define space register type=register_space size=$(REGSIZE);
define space temp_register type=ram_space size=$(REGSIZE);

define register offset=0 size=$(REGSIZE) [
    r0 r1 r2 r3 r4
    r5 r6 r7 r8 r9
    r10 r11 r12 r13 r14
    r15 r16 r17 r18 r19
    r20 r21 r22 r23 r24
    r25 r26 r27 r28 r29
    r30 r31 r32 r33 r34
    r35 r236 r37 r38 r39
    r40 r41 r42 r43 r44
    r45 r46 r47 r48 r49
    r50 r51 r52 r53 sp
    lr r56 idn0 idn1 udn0
    udn1 udn2 udn3 zero
];

# Temp registers
define temp_register offset=0x0 size=$(REGSIZE) [
    temp_r0 temp_r1 temp_r2 temp_r3 temp_r4
    temp_r5 temp_r6 temp_r7 temp_r8 temp_r9
    temp_r10 temp_r11 temp_r12 temp_r13 temp_r14
    temp_r15 temp_r16 temp_r17 temp_r18 temp_r19
    temp_r20 temp_r21 temp_r22 temp_r23 temp_r24
    temp_r25 temp_r26 temp_r27 temp_r28 temp_r29
    temp_r30 temp_r31 temp_r32 temp_r33 temp_r34
    temp_r35 temp_r36 temp_r37 temp_r38 temp_r39
    temp_r40 temp_r41 temp_r42 temp_r43 temp_r44
    temp_r45 temp_r46 temp_r47 temp_r48 temp_r49
    temp_r50 temp_r51 temp_r52 temp_r53 temp_sp
    temp_lr temp_r56 temp_idn0 temp_idn1 temp_udn0
    temp_udn1 temp_udn2 temp_udn3
];

define token instr_bundle(64)
    mode = (62,63)

    # X0
    Opcode_X0 = (28,30)
    RRROpcodeExtension_X0 = (18,27)
    SrcB_X0 = (12,17)
    SrcA_X0 = (6,11)
    Dest_X0 = (0,5)
    Imm8OpcodeExtension_X0 = (20,27)
    Imm8_X0 = (12,19)
    Imm16_X0 = (12,27)
    UnaryOpcodeExtension_X0 = (12,17)
    ShiftOpcodeExtension_X0 = (18,27)
    ShAmt_X0 = (12,17)
    BFOpcodeExtension_X0 = (24,27)
    BFStart_X0 = (18,23)
    BFEnd_X0 = (12,17)

    # X1
    Opcode_X1 = (59,61)
    RRROpcodeExtension_X1 = (49,58)
    SrcB_X1 = (43,48)
    SrcA_X1 = (37,42)
    Dest_X1 = (31,36)
    Imm8OpcodeExtension_X1 = (51,58)
    Imm8_X1 = (43,50)
    MT_Imm14_X1_13_6 = (43,50)
    MT_Imm14_X1_5_0 = (31,36)
    MF_Imm14_X1 = (37,50)
    Imm16_X1 = (43,58)
    UnaryOpcodeExtension_X1 = (43,48)
    ShiftOpcodeExtension_X1 = (49,58)
    ShAmt_X1 = (43,48)
    BrType_X1 = (54,58)
    BrOff_X1_16_6 = (43,53)
    BrOff_X1_5_0 = (31,36)
    JumpOpcodeExtension_X1 = (58,58)
    JumpOff_X1 = (31,57)

    # Y0
    Opcode_Y0 = (27,30)
    RRROpcodeExtension_Y0 = (18,19)
    SrcB_Y0 = (12,17)
    SrcA_Y0 = (6,11)
    Dest_Y0 = (0,5)
    Imm8_Y0 = (12,19)
    UnaryOpcodeExtension_Y0 = (12,17)
    ShiftOpcodeExtension_Y0 = (18,19)
    ShAmt_Y0 = (12,17)

    # Y1
    Opcode_Y1 = (58,61)
    RRROpcodeExtension_Y1 = (49,50)
    SrcB_Y1 = (43,48)
    SrcA_Y1 = (37,42)
    Dest_Y1 = (31,36)
    Imm8_Y1 = (43,50)
    UnaryOpcodeExtension_Y1 = (43,48)
    ShiftOpcodeExtension_Y1 = (49,50)
    ShAmt_Y1 = (43,48)

    # Y2
    Opcode_Y2_1_1 = (57,57)
    SrcBDest_Y2 = (51,56)
    Opcode_Y2_0_0 = (26,26)
    SrcA_Y2 = (20,25)
;

define register offset=0x1000 size=8 contextreg;
define context contextreg
    bundleType = (0,1)
    phase = (2,4)
;

attach variables [
    SrcA_X0 SrcB_X0 Dest_X0
    SrcA_X1 SrcB_X1 Dest_X1
    SrcA_Y0 SrcB_Y0 Dest_Y0
    SrcA_Y1 SrcB_Y1 Dest_Y1
    SrcA_Y2 SrcBDest_Y2
] [
    r0 r1 r2 r3 r4
    r5 r6 r7 r8 r9
    r10 r11 r12 r13 r14
    r15 r16 r17 r18 r19
    r20 r21 r22 r23 r24
    r25 r26 r27 r28 r29
    r30 r31 r32 r33 r34
    r35 r236 r37 r38 r39
    r40 r41 r42 r43 r44
    r45 r46 r47 r48 r49
    r50 r51 r52 r53 sp
    lr r56 idn0 idn1 udn0
    udn1 udn2 udn3 zero
];

@include "macros.sinc"

:^instruction is phase=0 & bundleType=0 & mode=0x00 & instruction [ bundleType=1; phase=1; ] {}
:^instruction is phase=0 & bundleType=0 & mode!=0x00 & instruction [ bundleType=2; phase=1; ] {}

# sub constructor to move to the next instruction parse phase
EndPacket:"; "^instruction is phase=1 & instruction [ phase=2; ] { build instruction; }
EndPacket:"; "^instruction is phase=2 & instruction [ phase=3; ] { build instruction; }

# X Format
with : bundleType = 1 {
    :X is bundleType {} # Filler
}

# Y Format
with : bundleType = 2 {
    # Opcode_Y0 = (27,30), RRROpcodeExtension_Y0 = (18,19), SrcB_Y0 = (12,17), SrcA_Y0 = (6,11), Dest_Y0 = (0,5)
    :move Dest_Y0, SrcA_Y0 EndPacket is phase=1 & Opcode_Y0=0xA & RRROpcodeExtension_Y0=0x2 & SrcB_Y0=0x3F & SrcA_Y0 & Dest_Y0 & EndPacket {
        initTempRegisters();                 # Copy values to temp register
        writeToTempRegister(Dest_Y0, SrcA_Y0);
        build EndPacket;
    }

    # Opcode_Y1 = (58,61), Imm8_Y1 = (43,50), SrcA_Y1 = (37,42), Dest_Y1 = (31,36)
    :addxi Dest_Y1, SrcA_Y1, Imm8_Y1 EndPacket is phase=2 & Opcode_Y1=0x2 & Imm8_Y1 & SrcA_Y1 & Dest_Y1 & EndPacket {
        local temp:8 = SrcA_Y1 + sext(Imm8_Y1);
        local res:8 = sext(temp:4);
        writeToTempRegister(Dest_Y1, res);   # Store changes to temp register
        build EndPacket;
    }

    # Opcode_Y2_1_1 = (57,57), SrcBDest_Y2 = (51,56), Opcode_Y2_0_0 = (26,26), SrcA_Y2 = (20,25)
    # don't need EndPacket as this is last instruction
    :st SrcA_Y2, SrcBDest_Y2 is phase=3 & Opcode_Y2_1_1=0x1 & SrcBDest_Y2 & Opcode_Y2_0_0=0x1 & SrcA_Y2 {
        *[ram]:8 SrcA_Y2:$(ADDRSIZE) = SrcBDest_Y2;
         saveTempRegisters();                # Save new register values
    }
}

This works but it breaks the pcode display and function signatures.

Image 1

Image 2

Note: r0, r1, ..., r9 are declared as inputs in compiler_spec

I am wondering if there is a more elegant way to implement this?

leyondlee commented 2 years ago

I tried implementing the initialization and saving of temp registers through pcode injection but I came across a "Low-level Error" with the decompiler on some functions. It seems to be originate from flow.cc#L131.

public void emitAssignRegisterToRegister(String dest, String src) {
    Varnode out = findRegister(dest);
    Varnode[] in = new Varnode[1];
    in[0] = findRegister(src);
    PcodeOp op = new PcodeOp(opAddress, seqnum++, PcodeOp.COPY, in, out);
    opList.add(op);
}

initTempRegisters:

public class InjectInitTempRegisters extends InjectPayloadTileGx {
    public InjectInitTempRegisters(String sourceName, SleighLanguage language, long uniqBase) {
        super(sourceName, language, uniqBase);
    }

    @Override
    public PcodeOp[] getPcode(Program program, InjectContext con) {
        PcodeOpEmitter pCode = new PcodeOpEmitter(language, con.baseAddr, uniqueBase);

        for (String regName : RegisterUtil.REGISTER_NAMES) {
            String tempName = RegisterUtil.getTempName(regName);
            pCode.emitAssignRegisterToRegister(tempName, regName); // temp = reg
        }

        return pCode.getPcodeOps();
    }
}

saveTempRegisters:

public class InjectSaveTempRegisters extends InjectPayloadTileGx {
    public InjectSaveTempRegisters(String sourceName, SleighLanguage language, long uniqBase) {
        super(sourceName, language, uniqBase);
    }

    @Override
    public PcodeOp[] getPcode(Program program, InjectContext con) {
        PcodeOpEmitter pCode = new PcodeOpEmitter(language, con.baseAddr, uniqueBase);

        for (String regName : RegisterUtil.REGISTER_NAMES) {
            String tempName = RegisterUtil.getTempName(regName);
            pCode.emitAssignRegisterToRegister(regName, tempName); // reg = temp
        }

        return pCode.getPcodeOps();
    }
}

astrelsky commented 2 years ago

I tried implementing the initialization and saving of temp registers through pcode injection but I came across a "Low-level Error" with the decompiler on some functions. It seems to be originate from flow.cc#L131.

public void emitAssignRegisterToRegister(String dest, String src) {
  Varnode out = findRegister(dest);
  Varnode[] in = new Varnode[1];
  in[0] = findRegister(src);
  PcodeOp op = new PcodeOp(opAddress, seqnum++, PcodeOp.COPY, in, out);
  opList.add(op);
}

initTempRegisters:

public class InjectInitTempRegisters extends InjectPayloadTileGx {
  public InjectInitTempRegisters(String sourceName, SleighLanguage language, long uniqBase) {
      super(sourceName, language, uniqBase);
  }

  @Override
  public PcodeOp[] getPcode(Program program, InjectContext con) {
      PcodeOpEmitter pCode = new PcodeOpEmitter(language, con.baseAddr, uniqueBase);

      for (String regName : RegisterUtil.REGISTER_NAMES) {
          String tempName = RegisterUtil.getTempName(regName);
          pCode.emitAssignRegisterToRegister(tempName, regName); // temp = reg
      }

      return pCode.getPcodeOps();
  }
}

saveTempRegisters:

public class InjectSaveTempRegisters extends InjectPayloadTileGx {
  public InjectSaveTempRegisters(String sourceName, SleighLanguage language, long uniqBase) {
      super(sourceName, language, uniqBase);
  }

  @Override
  public PcodeOp[] getPcode(Program program, InjectContext con) {
      PcodeOpEmitter pCode = new PcodeOpEmitter(language, con.baseAddr, uniqueBase);

      for (String regName : RegisterUtil.REGISTER_NAMES) {
          String tempName = RegisterUtil.getTempName(regName);
          pCode.emitAssignRegisterToRegister(regName, tempName); // reg = temp
      }

      return pCode.getPcodeOps();
  }
}

I'd recommend avoiding pcode injection when you can. While it may seem easier to use for complicated instructions than implementing it in sleigh, I've been burned a few times by breaking changes to that api.

With that said, unfortunately I come offering no alternatives or other suggestions.

leyondlee commented 2 years ago

Thanks @astrelsky for the insight.

I tried simplifying initTempRegisters() and saveTempRegisters() using a loop and accessing the registers through addresses.

define space register type=register_space size=$(REGSIZE);
...
...
...
define register offset=0x1000 size=$(REGSIZE) [
    temp_r0 temp_r1 temp_r2 temp_r3 temp_r4
    temp_r5 temp_r6 temp_r7 temp_r8 temp_r9
    temp_r10 temp_r11 temp_r12 temp_r13 temp_r14
    temp_r15 temp_r16 temp_r17 temp_r18 temp_r19
    temp_r20 temp_r21 temp_r22 temp_r23 temp_r24
    temp_r25 temp_r26 temp_r27 temp_r28 temp_r29
    temp_r30 temp_r31 temp_r32 temp_r33 temp_r34
    temp_r35 temp_r36 temp_r37 temp_r38 temp_r39
    temp_r40 temp_r41 temp_r42 temp_r43 temp_r44
    temp_r45 temp_r46 temp_r47 temp_r48 temp_r49
    temp_r50 temp_r51 temp_r52 temp_r53 temp_sp
    temp_lr temp_r56 temp_idn0 temp_idn1 temp_udn0
    temp_udn1 temp_udn2 temp_udn3
];

macro initTempRegisters() {
    local i:8 = 0;
<loop>
    local regAddr:8 = i * 8;
    local tempAddr:8 = regAddr + 0x1000;
    local val:8 = *[register] regAddr;
    *[register] tempAddr = val;
    i = i + 1;
    if (i < 63) goto <loop>;
}

macro saveTempRegisters() {
    local i:8 = 0;
<loop>
    local regAddr:8 = i * 8;
    local tempAddr:8 = regAddr + 0x1000;
    local val:8 = *[register] tempAddr;
    *[register] regAddr = val;
    i = i + 1;
    if (i < 63) goto <loop>;
}

But it seems to be decompiling into actual code (even though the access is to register_space and not ram_space)

astrelsky commented 2 years ago

You should be able to access registers via the register address space but it can be a bit finicky. I'll usually get warnings about the decompiler ignoring indirect accesses, which is exactly what is wanted anyway (access the variable as if in a register not like a pointer). I think the variable is causing the trouble here because in my case I started with a constant register address and the offset could be resolved to a constant.

It's going to be pretty ugly, but try unrolling the loop and see what happens.

leyondlee commented 2 years ago

The decompiler works fine when the registers are assigned individually.

macro initTempRegisters() {
    temp_r0 = r0;
    temp_r1 = r1;
    temp_r2 = r2;
    temp_r3 = r3;
    temp_r4 = r4;
    ...
    ...
    ...
}

macro saveTempRegisters() {
    r0 = temp_r0;
    r1 = temp_r1;
    r2 = temp_r2;
    r3 = temp_r3;
    r4 = temp_r4;
    ...
    ...
    ...
}

But the pcode display is truncated and does not show the all the pcodes.

ghidracadabra commented 2 years ago

You can control the number of lines in the pcode field via Edit->Tool Options->Listing Field->Pcode Field (from the code browser)

ghidra1 commented 2 years ago

macro saveTempRegisters() { r0 = temp_r0;

I don't understand the need for the bulk saveTempRegisters macro. Assuming the temp registers are used for all reads, why not let the actual instruction writes target the real register(s)? I assume the intent is to ensure that register reads do not reflect any modifications which may occur within the packed instruction.

I would definitely stay clear of any injection approach which should be uneccessary.

leyondlee commented 2 years ago

You can control the number of lines in the pcode field via Edit->Tool Options->Listing Field->Pcode Field (from the code browser)

Thanks 😃

macro saveTempRegisters() { r0 = temp_r0;

I don't understand the need for the bulk saveTempRegisters macro. Assuming the temp registers are used for all reads, why not let the actual instruction writes target the real register(s)? I assume the intent is to ensure that register reads do not reflect any modifications which may occur within the packed instruction.

I would definitely stay clear of any injection approach which should be uneccessary.

Yes, the reads should not be affected by any register modifications within the instruction bundle.

In the current process, the actual registers are used for all reads while any writes to the registers (within the instruction bundle) are first written to the temp register. The (new) values in the temp register will then be written back to the actual registers after the last instruction in the bundle using the saveTempRegisters macro.

ghidra1 commented 2 years ago

In the current process, the actual registers are used for all reads while any writes to the registers (within the instruction bundle) are first written to the temp register.

I would think the reverse may produce better results within the decompiler. Read from temp shadow copies and write to actual registers only when modified by an instruction. If a temp is never read the copy should be ignored by decompiler.

leyondlee commented 2 years ago

I would think the reverse may produce better results within the decompiler. Read from temp shadow copies and write to actual registers only when modified by an instruction. If a temp is never read the copy should be ignored by decompiler.

Thanks, I will give it a try 👍

Another issue I am currently facing is regarding the function signatures. It seems to detect that all input registers (r0-r9) are part of the function.

<compiler_spec>
    ...
    ...
    ...
    <default_proto>
        <prototype name="default" extrapop="0" stackshift="0">
            <input>
                <pentry minsize="1" maxsize="8">
                    <register name="r0"/>
                </pentry>
                <pentry minsize="1" maxsize="8">
                    <register name="r1"/>
                </pentry>
                <pentry minsize="1" maxsize="8">
                    <register name="r2"/>
                </pentry>
                <pentry minsize="1" maxsize="8">
                    <register name="r3"/>
                </pentry>
                <pentry minsize="1" maxsize="8">
                    <register name="r4"/>
                </pentry>
                <pentry minsize="1" maxsize="8">
                    <register name="r5"/>
                </pentry>
                <pentry minsize="1" maxsize="8">
                    <register name="r6"/>
                </pentry>
                <pentry minsize="1" maxsize="8">
                    <register name="r7"/>
                </pentry>
                <pentry minsize="1" maxsize="8">
                    <register name="r8"/>
                </pentry>
                <pentry minsize="1" maxsize="8">
                    <register name="r9"/>
                </pentry>
            </input>
            <output>
                <pentry minsize="1" maxsize="8">
                    <register name="r0"/>
                </pentry>
            </output>
        </prototype>
    </default_proto>
</compiler_spec>

GhidorahRex commented 2 years ago

That's also an issue with using the temp registers. It sees the reads from all the registers and assigns them as operands as it sees them all being read from. It should be fixed by swapping the role of read-from vs write-to temporaries.

leyondlee commented 2 years ago

Thanks, that works 😃

ryanmkurtz commented 2 years ago

@leyondlee can this issue be closed now?

leyondlee commented 2 years ago

@leyondlee can this issue be closed now?

Yes, thanks for everyone's help

dsmithington commented 1 year ago

@leyondlee Are you willing to share your final revision of your sinc file?

leyondlee commented 1 year ago

@leyondlee Are you willing to share your final revision of your sinc file?

Hi, I am unfortunately unable to share the file due to a NDA.

However, the work can be replicated with reference to the first post and the TileGx ISA documentation.

Just note to use shadow copies of the registers in the p-code to maintain read before write. For example:

:move Dest_Y0, SrcA_Y0 EndPacket is phase=1 & Opcode_Y0=0xA & RRROpcodeExtension_Y0=0x2 & SrcB_Y0=0x3F & SrcA_Y0 & Dest_Y0 & EndPacket {
    initTempRegisters();                 # Copy values to temp register. temp_r0=r0, temp_r1=r1, temp_r2=r2...
        Dest_Y0 = <SrcA_Y0's Temp Register>;
}

NationalSecurityAgency / ghidra

Sleigh Pcode: Read before write in instruction bundle #4581