DynamoRIO / dynamorio

Dynamic Instrumentation Tool Platform

[APP CRASH] Execution terminates due to unmet condition in core\vmareas.c #6567

Open NicoBottura opened 8 months ago

NicoBottura commented 8 months ago

Describe the bug

Running either of the two attached programs under drrun.exe ends prematurely, likely due to an engine crash, since a client cannot intercept the termination as an exit event or an exception. In one program the error is deterministic, whereas in the other (with “nd” in the executable name) it is non-deterministic.

Running with -debug reveals an assertion failure in core\vmareas.c:10857: ASSERT(!info.overlap || (f != NULL && TEST(FRAG_IS_TRACE, f->flags)));
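Read as an implication, the asserted condition says that whenever info.overlap is set, f must be non-NULL and flagged as a trace. The assertion therefore fires only when an overlap is reported and f is either NULL or not a trace. Below is a minimal standalone sketch of that logic. The struct layouts and the flag value are hypothetical stand-ins for illustration, not DynamoRIO's actual definitions; only the identifiers that appear in the assertion text are taken from the report.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-ins: the real types and the real flag value live in
 * DynamoRIO's core sources and are not reproduced here. */
#define FRAG_IS_TRACE 0x1u
#define TEST(mask, var) (((mask) & (var)) != 0)

typedef struct {
    unsigned int flags;
} fragment_t;

typedef struct {
    bool overlap;
} overlap_info_t;

/* Returns true when the asserted condition holds (the ASSERT would pass),
 * false when it is violated (the ASSERT would fire). */
static bool
assert_condition_holds(overlap_info_t info, const fragment_t *f)
{
    return !info.overlap || (f != NULL && TEST(FRAG_IS_TRACE, f->flags));
}
```

Under this reading, the failing cases are limited to overlap with a NULL fragment and overlap with a fragment that is not a trace, so those are the two situations to look for in the logs.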

image11

Without the -debug option either program exits silently:

image9

Although an executable protector (Themida) was used, we manually dissected its actions and found no query or instruction that should be problematic for DynamoRIO. Additional information is provided below. We believe instead that this may be an engine bug of general interest to DynamoRIO users.

To Reproduce

The two executables are benign and contain a "Hello World" application that spawns a message box. The original code is as follows:

#include <Windows.h>
#pragma comment(lib, "user32.lib")
int main() {
    MessageBox(NULL, "Hello World!", ":)", MB_OK);
    return 0;
}

To run the program we used the empty client provided in the samples\bin32\ folder of every DR release, optionally with the -debug option enabled: .\bin32\drrun.exe -debug -c .\samples\bin32\empty.dll -- HelloWorld.exe

PoCs.zip (note: Windows Defender may flag them as trojan due to Themida)

Expected behavior

Either program should display the message box and exit normally.

image10

Versions

DynamoRIO v10.0.0. 32-bit application protected with Themida v2.3.7.0 on Windows 10 Home version 22H2 x64 with Intel(R) Core(TM) i7-8750H CPU @ 2.20GHz. The issue seems to affect earlier versions too (we tested a few since 6.x).

Additional context

The executables are protected with Themida, but according to our analysis the crash seems unrelated to explicit anti-analysis actions. To build the two programs we used Themida v2.3.7.0: the program leading to a deterministic error contains anti-debug protections whereas the other does not. Both programs run fine under other DBI engines (we tested a recent version of Pin).

The problem we report seems related to issue #2294 but in our case the bug occurs both with and without supplying -debug; furthermore, we do not observe the other assertion violations that the report mentions.

We verified that Themida’s anti-debug tricks (e.g., int 1, hiding a thread from a debugger with NtSetInformationThread and a couple of other classics) are handled gracefully by DynamoRIO and no client-side fixing should be needed. We are quite confident about this after extensive testing and debugging prior to this report. We tested different Themida versions and protections, with the reported crash occurring even when none is active (also, Themida would show a popup when some artifact is found).

dcdelia commented 2 months ago

Hi @derekbruening do you have suggestions on how to approach this?

I'm facing similar issues in an ongoing research project. If the cause is self-modifying code as hinted in #2294, I'm not sure it is only about race conditions, because I experience this assertion failure deterministically both with and without debug mode on multiple DynamoRIO versions. Also, with other executable protectors that use SMC, normally DynamoRIO does fine.

Any clues on what may be causing the assertion failure in vmareas.c, or on how to work around it with some settings? Thank you :-)

derekbruening commented 2 months ago

Probably this will require getting into the weeds and reconstructing the detailed sequence of events from the log files or maybe additional added diagnostics. The base mechanisms in DR have not changed in a long time and should still match the cache consistency paper (there is an experimental branch speeding codemod handling up but it was never merged).

Runtime options: there are some that can turn off the code modification handling: -no_sandbox_writes, -no_hw_cache_consistency. If correctness requires the modified code, the app will not run correctly with them, but they could serve as a sanity check that DR doesn't fall over. There are also some triggers for switching between page protection and sandbox instrumentation: -ro2sandbox_threshold, -sandbox2ro_threshold.

I would suggest starting with debug logs to try to understand what is happening. The assert is 586K fragments in, so you could use -log_at_fragment_count to stay at loglevel 1 until closer to the assert point and then go up to 4 (or 3 might be enough). Is there only one thread? Can this be boiled down to a very short sequence of app code?