trailofbits / maat

Open-source symbolic execution framework: https://maat.re
Other
612 stars 43 forks source link

Testing PPC 64-bit stw instruction isn't getting emulated properly by maat #173

Closed Katastropic2431 closed 1 year ago

Katastropic2431 commented 1 year ago

Currently, I am working on integrating PowerPC 64-bit support into Maat by following the contributing.md guide. During my testing of the "stw" (store word) instruction, I noticed that it is storing 64-bits instead of the expected 32-bits. As a result, the stored bits are being placed at the wrong offset.

Here is a screenshot from my testing of stw, lwz and std. Screenshot from 2023-07-20 13-47-47 In the screenshot, I am testing both the "store double word" and "store word" instructions. Surprisingly, both of these instructions yield the same intermediate representation (IR) output, even though the "store double word" instruction should store a 64-bit value, while the "store word" instruction should store a 32-bit value.

To investigate the issue further, I decided to examine the sleigh specifications for these two instructions. As you can see below, the two instructions are indeed different: For "Store Double Word" ('std'):

:std S, dsPlusRaOrZeroAddress is $(NOTVLE) & OP=62 & S & dsPlusRaOrZeroAddress & BITS_0_1=0
{
    *:8(dsPlusRaOrZeroAddress) = S;
}

For "Store Word" ('stw'):

:stw S, dPlusRaOrZeroAddress is $(NOTVLE) & OP=36 & S & dPlusRaOrZeroAddress
{
    @ifdef BIT_64
        *:4(dPlusRaOrZeroAddress) = S:4;
    @else
        *:4(dPlusRaOrZeroAddress) = S;
    @endif
}

Additionally, I attempted to emulate the "stw" instruction using Ghidra emulation, and I couldn't reproduce the error that I encountered in Maat.

For my implementation, I am using the ppc_64_be.slaspec and ppc_64.pspec files.

I would greatly appreciate any insights or suggestions as to why this discrepancy and error are occurring in Maat. Thank you!

Katastropic2431 commented 1 year ago

I am still working on this issue, any helpful input will be greatly appreciated.

Boyan-MILANOV commented 1 year ago

Hey @Katastropic2431 thank you for sharing this issue! Is your branch available somewhere? It would greatly help if I could have a look at the code.

Since in your post we see that the generated p-code does indeed store a 64-bits value my intuition is that the problem arises during sleigh -> maat p-code translation. One possible explanation of the issue is that when converting from sleigh registers to "Maat" registers you might have an error were you map a 32-bits register to a 64-bits one, resulting in the p-code you posted. That should be easy for you to check. If not then I can have a deeper look once you push your code somewhere :)

Katastropic2431 commented 1 year ago

Hey @Boyan-MILANOV I'm sure that I have all my registers mapped correctly but I do agree the problem is during sleigh -> maat pcode translation. I have invited you to my private github repo for you to have a deeper look at my code. Let me know if you have yet to receive the email invite :)

Boyan-MILANOV commented 1 year ago

@Katastropic2431 So I've done a bit of debugging on your branch and even the pcode generated by sleigh uses of a 64-bits value instead of a 32-bits one. So the problem seems not to be in the translation pcode -> maat but in the pcode generation step 🤔. One possible cause would be if the slaspec is somehow compiled without BIT_64 being set, resulting in storing the full register instead of extracting the 4 first bytes:

#stw r0,r0,0        0x90 00 00 00
:stw S,dPlusRaOrZeroAddress     is $(NOTVLE) & OP=36 & S & dPlusRaOrZeroAddress
{
@ifdef BIT_64
    *:4(dPlusRaOrZeroAddress) = S:4;
@else
    *:4(dPlusRaOrZeroAddress) = S;
@endif
}

I don't have the time now to look deeper to confirm if this is the problem, let me know if that helps. It could also be useful to check the generated .sla file to see if the generated spec uses a 32 or 64-bits value. I've pushed a bit of the debugging code in your fork under boyan/debug-stwu if that can be useful to you. I'll come back to this later, let me know if you find anything interesting or manage to solve it :)

Katastropic2431 commented 1 year ago

Hello @Boyan-MILANOV image I added print statements to see the initial pcode generation. The pcode above is being correctly generated as you can see STW only uses 4 bytes when storing bytes into memory while the STD uses 8 bytes. So I believe the problem occurs between pcode and intermediate representation.

An important observation I've made is that load word and load double word instructions are functioning correctly. This difference in behavior between load and store instructions could potentially hold a clue to finding a solution. If I can pinpoint the underlying reason why store instructions are not working while load instructions are, it might lead me closer to resolving the issue: image

#lwz    r0,4(0)     0x80 00 00 04
#lwz    r0,4(r2)    0x80 02 00 04
:lwz    D,dPlusRaOrZeroAddress  is $(NOTVLE) & OP=32 & D & dPlusRaOrZeroAddress
{
@ifdef BIT_64
    D = zext(*:4(dPlusRaOrZeroAddress));
@else
    D = *:4(dPlusRaOrZeroAddress);
@endif
}

Also I want to add that the BIT_64 is set and this can be confirmed through maat being able to interpret the 'store double word' instruction. This indicates that its not a problem that stems from the generated .sla file.

@ifdef BIT_64  # **<--- Maat wouldn't be able to interpret std if BIT_64 isn't set** 
:std S,dsPlusRaOrZeroAddress    is $(NOTVLE) & OP=62 & S & dsPlusRaOrZeroAddress & BITS_0_1=0
{
    *:8(dsPlusRaOrZeroAddress) = S;
}

Below is ghidra's pcode generation that I cross-reference to confirm the initial pcode generation is correct: image

Katastropic2431 commented 1 year ago

Hey @Boyan-MILANOV I fixed it! I noticed that maat didn't take into account the size of the p-code when it comes to the registers. Just had to take that into consideration within the translate_pcode_param function :)

maat::ir::Param reg_name_to_maat_reg(maat::Arch::Type arch, const std::string& reg_name, const int reg_size)
{
    if (arch == Arch::Type::X86)
        return sleigh_reg_translate_X86(reg_name);
    else if (arch == Arch::Type::X64)
        return sleigh_reg_translate_X64(reg_name);
    else if (arch == Arch::Type::EVM)
        return sleigh_reg_translate_EVM(reg_name);
    else if (arch == Arch::Type::PPC64){
        maat::ir::Param tmpReg = sleigh_reg_translate_PPC64(reg_name);
        if(reg_size == tmpReg.size())
            return sleigh_reg_translate_PPC64(reg_name);
        else if(reg_size < tmpReg.size())
            return sleigh_masked_reg_translate_PPC64(reg_name, reg_size);
        else
            throw maat::runtime_exception("Register translation from SLEIGH to MAAT, sleigh pcode register size is greater than maat register size!");
    }
    else
        throw maat::runtime_exception("Register translation from SLEIGH to MAAT not implemented for this architecture!");
}

Ill push my changes soon to my private branch if you want to see it