Washi1337 / OldRod

An automated KoiVM disassembler and devirtualisation utility
GNU General Public License v3.0

Add support for disassembling code which had KoiVM's SMC transform applied #63

Closed ElektroKill closed 2 years ago

ElektroKill commented 2 years ago

The open-source release of KoiVM includes a feature, not enabled by default, that aims to make disassembly harder by introducing self-modifying code (SMC). It works by inserting a special prologue at the start of the function body. The prologue is partially encrypted, and the decryption is handled at runtime. Pseudocode of the prologue:

EntryStub:
{
    counter = TrampolineBlockLength;
    pointer = &Trampoline;
    pointer -= 1;
    t1 = *pointer;
    if (t1 == 0) {
        goto Trampoline;
    }
    if (counter != 0) {
        goto LoopBody;
    }
    t3 = &RealEntry;
    goto Trampoline;
}

LoopBody:
{
    t2 = *pointer;
    t2 ^= TrampolineCodeKey;
    *pointer = t2;
    counter -= 1;
    pointer += 1;
    if (counter != 0) {
        goto LoopBody;
    }
    t3 = &RealEntry;
    goto Trampoline;
}

Trampoline:
{
    t3 ^= AddressKey;
    goto RealEntry;
}

RealEntry:
{
    // Some code here
}

The code in the Trampoline block in the pseudocode above is stored encrypted and is decrypted at runtime inside LoopBody.
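To make the decryption step concrete, here is a minimal Python sketch that emulates EntryStub and LoopBody on a plain byte buffer. It follows the pseudocode literally, so the loop starts at the byte before the trampoline and processes `length` bytes from there. The function name, the buffer layout, and all byte values are hypothetical, and the sketch assumes that the byte before the trampoline holds the key (so it becomes 0 after the first run). It is an illustration, not KoiVM's or OldRod's code.

```python
def run_smc_prologue(code: bytearray, trampoline: int, length: int, key: int) -> bool:
    """Emulate EntryStub/LoopBody; return True if the decryption loop ran."""
    counter = length                 # counter = TrampolineBlockLength
    pointer = trampoline - 1         # pointer = &Trampoline; pointer -= 1
    if code[pointer] == 0:           # marker already cleared: skip decryption
        return False
    while counter != 0:              # LoopBody
        code[pointer] ^= key         # *pointer ^= TrampolineCodeKey
        counter -= 1
        pointer += 1
    return True


# Hypothetical layout: 2 stub bytes, 1 marker byte, 5 trampoline bytes, 1 body byte.
KEY = 0x5A
plain_trampoline = bytes([0x01, 0x01, 0x11, 0x22, 0x33])
code = bytearray([0xAA, 0xBB, KEY] + [b ^ KEY for b in plain_trampoline] + [0xCC])

first_run = run_smc_prologue(code, trampoline=3, length=len(plain_trampoline) + 1, key=KEY)
second_run = run_smc_prologue(code, trampoline=3, length=len(plain_trampoline) + 1, key=KEY)
```

In this model the second call returns immediately, because the first run XORed the marker byte with the key and left it at 0. This mirrors the `t1 == 0` check in EntryStub.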

Control flow graph of the raw disassembled code (User code starts at IL_11B1)

After these changes, OldRod is able to fully disassemble code that had the aforementioned transform applied. However, the output of the devirtualization does not run, because the translated SMC prologue is still included, as the image below shows: [image] The user code begins with the call to Application.EnableVisualStyles();.

Washi1337 commented 2 years ago

Thanks for the work.

Would it be possible for you to provide a test binary that has this feature enabled? I would like to test this out locally some more, and see what we can do to remove the prologue from the AST as well.

ElektroKill commented 2 years ago

Sample file protected with KoiVM with the SMC transform enabled: SMC.zip

ElektroKill commented 2 years ago

What is considered a "SMC header block"? Comments indicate it is a block starting with two NOPs, but in the graph visualization posted this is not the header of the function. Furthermore, it seems what is tested is the block after the decryption routine. I could piece some of it together from the format specified in the PR description, but this should be part of the code for future reference as well.

In my latest commit, I added more comments to the code to help clarify what is actually happening. The only block that is currently analyzed by the disassembly logic is the encrypted trampoline block. This is the block that contains the two NOPs, an XOR operation, and an unconditional jump. The new comments hopefully clarify where the magic constants come from.

The biggest "complaint" I'd have is that the proposed SMC detection is based on quite a weak heuristic. The test is done only on some instructions within a single block (the trampoline block), and only checks for a double NOP and a (very weak) heuristic for the XOR pattern, followed by a jump. Effectively, this completely ignores the blocks that actually implement the self-modifying-code part (i.e., the decryption block). This opens the door to many FPs that can easily be crafted, even with pretty normal C#/CIL code and without any big changes to KoiVM itself (just make a block that exactly implements an XOR assignment and a jump in CIL).

I don't necessarily have a problem with weak heuristics. In fact, OldRod already uses some weak heuristics (e.g., IExitKeyResolver and IFrameLayoutDetector). Furthermore, this new one probably works for most SMC-enabled bins as well. However, I feel that heuristics should be made configurable, via an interface or command line argument, since they are not part of the "core" spec of KoiVM's architecture. Especially given the high FPR that this implementation has, I think it should be possible to disable it as well.

While the heuristic is rather weak when it comes to the instruction pattern, that's not the only thing it relies on. The primary purpose of the instruction pattern is to make sure that the magic SMC key read at the beginning of the IsSMCTrampoline method can be used to decode valid instructions. For non-SMC code, the previous byte will be garbage, part of a different instruction, etc., and the chance that using that byte as the key yields the correct instructions to satisfy the pattern should be rather low. As for KoiVM modifications, crafting a block that produces the correct instruction pattern when decrypted with the previous byte as the key is much harder than crafting a block with just those instructions.
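The argument above can be illustrated with a toy Python version of the check. This is only a sketch. The opcode values `NOP`, `XOR`, and `JMP` are made up (KoiVM's real opcode mapping differs per build), every instruction is modelled as a single byte, treating a zero key as "not encrypted" is an assumption, and `looks_like_smc_trampoline` is a hypothetical name, not OldRod's IsSMCTrampoline.

```python
# Hypothetical one-byte opcodes, for illustration only.
NOP, XOR, JMP = 0x01, 0x02, 0x03


def looks_like_smc_trampoline(code: bytes, block_start: int, max_scan: int = 200) -> bool:
    """Decode the block with the preceding byte as key and match NOP NOP ... XOR ... JMP."""
    if block_start < 1 or block_start + 2 > len(code):
        return False
    key = code[block_start - 1]      # candidate key: the byte before the block
    if key == 0:                     # assumption: a zero key means nothing is encrypted
        return False
    # Cheap early exit: most non-SMC blocks fail on the first two decoded bytes.
    if code[block_start] ^ key != NOP or code[block_start + 1] ^ key != NOP:
        return False
    seen_xor = False
    end = min(len(code), block_start + max_scan)
    for offset in range(block_start + 2, end):
        op = code[offset] ^ key
        if op == XOR:
            seen_xor = True
        elif op == JMP:
            return seen_xor          # pattern matches only if an XOR came first
    return False


key = 0x5A
trampoline = [NOP, NOP, 0x10, XOR, JMP]
encrypted_block = bytes([key] + [b ^ key for b in trampoline])
plain_block = bytes([0x77] + trampoline)   # same instructions, arbitrary preceding byte

encrypted_matches = looks_like_smc_trampoline(encrypted_block, 1)
plain_matches = looks_like_smc_trampoline(plain_block, 1)
```

In this toy model, a block that merely contains the NOP/XOR/JMP instructions in the clear is rejected, because decoding it with the preceding byte no longer yields the pattern. The early exit also shows why a full 200-byte scan is uncommon for non-SMC blocks.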

Incorporating the preceding blocks, which actually implement the self-modifying code, into the heuristic is much harder. The heuristic runs during the disassembly stage, where we do not have access to flow graphs or any higher-level representation that would allow analyzing the incoming blocks.

A note on perf: IsSMCTrampoline is invoked for ~every instruction~ every block header within the virtualized body. This is very expensive, especially for non-SMC-enabled methods, given that the heuristic has the potential to read at least 200 bytes ahead every time. It is also probably unnecessary, given that the SMC initialization blocks are always only at the beginning of a virtualized function.

The chance of actually reading the full 200 bytes is low, as that would require the instruction bytes, XORed with the byte before the first instruction of the block, to decode to valid instructions. As for limiting the number of calls to IsSMCTrampoline, I don't really have any ideas on how to implement better logic with the information available during the disassembly stage.

Washi1337 commented 2 years ago

Final nitpicky comment is that the trampoline detector is now unset by default, which can cause a null reference exception if the pipeline project is not initializing it. It currently is, but that may change in the future (and sadly no NRT in the OldRod project for us to point that out). We probably want to add a null-check in the places the detector is used.

Otherwise, LGTM.
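The null-check suggestion, together with the earlier request to make the heuristic optional, could look roughly like the following Python sketch. The `Disassembler` class and the `TrampolineDetector` callable type are hypothetical stand-ins and do not reflect OldRod's actual C# types. The idea is that an unset detector is treated as "SMC detection disabled" instead of causing a failure.

```python
from typing import Callable, Optional

# Hypothetical detector signature: (code bytes, block offset) -> is trampoline?
TrampolineDetector = Callable[[bytes, int], bool]


class Disassembler:
    """Toy stand-in for a disassembler with a pluggable, optional SMC detector."""

    def __init__(self, detector: Optional[TrampolineDetector] = None) -> None:
        self.detector = detector

    def is_smc_trampoline(self, code: bytes, offset: int) -> bool:
        # Null check: with no detector configured, never report a trampoline.
        if self.detector is None:
            return False
        return self.detector(code, offset)


without_detector = Disassembler().is_smc_trampoline(b"\x01\x01", 0)
with_detector = Disassembler(lambda code, offset: True).is_smc_trampoline(b"\x01\x01", 0)
```

Leaving the detector unset then doubles as the "off switch" for the heuristic.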

ElektroKill commented 2 years ago

Done

Washi1337 commented 2 years ago

Thanks a lot! And sorry for the relatively slow responses 😅