Closed ElektroKill closed 2 years ago
Thanks for the work.
Would it be possible for you to provide a test binary that has this feature enabled? I would like to test this out locally some more, and see what we can do to remove the prologue from the AST as well.
Sample file protected with KoiVM with the SMC transform enabled: SMC.zip
What is considered a "SMC header block"? Comments indicate it is a block starting with two NOPs, but in the graph visualization posted this is not the header of the function. Furthermore, it seems what is tested is the block after the decryption routine. I could piece some of it together from the format specified in the PR description, but this should be part of the code for future reference as well.
In my latest commit, I added more comments to the code to help clarify what is actually happening. The only block that is currently analyzed by the disassembly logic is the encrypted trampoline block. This is the block that contains the two NOPs, an XOR operation, and an unconditional jump. The new comments hopefully clarify where the magic constants come from.
The biggest "complaint" I'd have is that the proposed SMC detection is based on quite a weak heuristic. The test is done only on some instructions within a single block (the trampoline block), and is only checking for double NOP, a (very weak) heuristic for the XOR pattern, followed by a jump. Effectively, this completely ignores the blocks that actually implement the self-modifying-code part (i.e., the decryption block). This opens up for many FPs that can easily be crafted, even with pretty normal C#/CIL code without any big changes to KoiVM itself (just make a block that exactly implements an xor assignment and jump in CIL).
I don't necessarily have a problem with weak heuristics. In fact, OldRod already uses some weak heuristics (e.g., IExitKeyResolver and IFrameLayoutDetector). Furthermore, this new one probably works for most SMC-enabled bins as well. However, I feel that heuristics should be made configurable, via an interface or command line argument, since they are not part of the "core" spec of KoiVM's architecture. Especially given the high FPR that this implementation has, I think it should be possible to disable it as well.
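A minimal sketch of what "configurable via an interface" could look like, written in Python for brevity rather than OldRod's C#. All names here (SmcDetector, DisabledSmcDetector, HeuristicSmcDetector, make_detector, the enable_smc flag) are hypothetical and do not come from the OldRod code base:

```python
from dataclasses import dataclass
from typing import Callable, Protocol


class SmcDetector(Protocol):
    """Hypothetical interface: decides whether a block is an SMC trampoline."""

    def is_trampoline(self, code: bytes, block_start: int) -> bool: ...


class DisabledSmcDetector:
    """Never reports a trampoline; selecting it switches the heuristic off."""

    def is_trampoline(self, code: bytes, block_start: int) -> bool:
        return False


@dataclass
class HeuristicSmcDetector:
    """Delegates to whatever heuristic check the caller supplies."""

    check: Callable[[bytes, int], bool]

    def is_trampoline(self, code: bytes, block_start: int) -> bool:
        return self.check(code, block_start)


def make_detector(enable_smc: bool, check: Callable[[bytes, int], bool]) -> SmcDetector:
    """Pick a detector from a (hypothetical) command line flag."""
    return HeuristicSmcDetector(check) if enable_smc else DisabledSmcDetector()
```

With this shape, a user who hits false positives can pass the equivalent of enable_smc=False and the disassembler never consults the heuristic.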
While the heuristic is rather weak when it comes to the instruction pattern, that's not the only thing it relies on. The primary reason for the instruction patterns is to make sure that the magic SMC key that was read at the beginning of the IsSMCTrampoline method can be used to decode valid instructions. For non-SMC code, the previous byte will be garbage, part of a different instruction, etc., and the chances that using that byte as the key results in reading the correct instructions to satisfy the pattern should be rather low. As for KoiVM modifications, crafting a block that, when decrypted with the previous byte used as the key, results in the correct instruction pattern is much harder than crafting a block with just those instructions.
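The idea can be sketched roughly as follows, in Python rather than OldRod's C#. This is a simplification under stated assumptions: the opcode values are made up (KoiVM builds assign their own), every instruction is treated as a single byte with no operands, and looks_like_trampoline is a hypothetical name, not the actual IsSMCTrampoline implementation:

```python
# Hypothetical one-byte opcodes; real KoiVM builds use their own values
# and real instructions carry operands, which this sketch ignores.
NOP, XOR, JMP = 0x01, 0x02, 0x03
PATTERN = (NOP, NOP, XOR, JMP)


def looks_like_trampoline(code: bytes, block_start: int) -> bool:
    """Check whether the block decodes to PATTERN when XOR'ed with the preceding byte."""
    if block_start < 1 or block_start + len(PATTERN) > len(code):
        return False
    key = code[block_start - 1]  # candidate SMC key: the byte just before the block
    for offset, expected in enumerate(PATTERN):
        if code[block_start + offset] ^ key != expected:
            return False  # wrong key or not a trampoline: stop at the first mismatch
    return True


key = 0x5A
encrypted = bytes([key]) + bytes(op ^ key for op in PATTERN)
plaintext = bytes([key]) + bytes(PATTERN)
print(looks_like_trampoline(encrypted, 1))  # True: decodes to the pattern
print(looks_like_trampoline(plaintext, 1))  # False: unencrypted pattern fails under the key
```

The second call illustrates the point made above: a block that merely contains the plain instruction pattern does not match unless the preceding byte also happens to decrypt it correctly.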
Incorporating the previous blocks, which actually implement the self-modifying code, into the heuristic is much harder: the heuristic runs during the disassembly stage, where we do not have access to flow graphs or any higher-level representation that would allow analyzing the incoming blocks.
A note on perf: IsSMCTrampoline is invoked for ~every instruction~ every block header within the virtualized body. This is very expensive, especially for non-SMC-enabled methods, given that the heuristic has the potential to read at least 200 bytes ahead every time. It is also probably unnecessary given that the SMC initialization blocks are always only at the beginning of a virtualized function.
Actually reading the full 200 bytes is unlikely, as that would require the instruction bytes XOR'ed with the byte before the first instruction of the block to result in valid instructions. As for limiting the number of calls to IsSMCTrampoline, I don't really have any ideas on how to implement better logic with the information available during the disassembly stage.
Final nitpicky comment is that the trampoline detector is now unset by default, which can cause a null reference exception if the pipeline project is not initializing it. It currently is, but that may change in the future (and sadly no NRT in the OldRod project for us to point that out). We probably want to add a null-check in the places the detector is used.
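A guard of the kind suggested here might look like the following Python sketch (OldRod itself is C#, where this would be a plain null-check). The Detector alias and is_smc_block are hypothetical names chosen for illustration:

```python
from typing import Callable, Optional

# Hypothetical detector signature: (code bytes, block start offset) -> bool
Detector = Callable[[bytes, int], bool]


def is_smc_block(detector: Optional[Detector], code: bytes, block_start: int) -> bool:
    """Treat a missing detector as 'SMC detection disabled' instead of crashing."""
    if detector is None:
        return False
    return detector(code, block_start)
```

Falling back to "not a trampoline" when no detector is set keeps the disassembler working the way it did before SMC detection existed.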
Otherwise, LGTM.
Done
Thanks a lot! And sorry for the relatively slow responses 😅
The open source release of KoiVM includes a feature, not enabled by default, that aims to make disassembly harder. It does this by introducing self-modifying code: a special prologue is inserted into the function body. The prologue is partially encrypted, and the decryption is handled at runtime. Pseudocode of the prologue code:
The code in the Trampoline block in the above pseudocode is decrypted at runtime inside LoopBody.
Control flow graph of the raw disassembled code (user code starts at IL_11B1)
After these changes, OldRod is able to fully disassemble code that had the aforementioned transform applied. However, the output of the devirtualization does not run due to the inclusion of the translated SMC prologue, as can be seen in the image below. The user code begins with the call to Application.EnableVisualStyles();