Open secrary opened 5 years ago
There is not enough information here. Please provide the code for a minimal client and minimal application. Details on exactly what this client is doing are needed. "The client crashes": where? Is it registered for the exception event? Is it failing to tell DR to pass the exception on to the app? Without details I would guess that this is a bug in the client.
My mistake: the crash reason is EXCEPTION_SINGLE_STEP, not EXCEPTION_ACCESS_VIOLATION (the PAGE_NOACCESS trick is not related to the crash).
The malware I was talking about also sets the Trap Flag, and the resulting EXCEPTION_SINGLE_STEP is handled by the registered handler.
I re-implemented the Trap Flag part of the code:
#include <windows.h>
#include <stdio.h>
#include <tchar.h>

LONG WINAPI
handler(
    struct _EXCEPTION_POINTERS* ExceptionInfo
)
{
    // print the exception code and resume the interrupted code
    printf("err: 0x%lx\n", ExceptionInfo->ExceptionRecord->ExceptionCode);
    return EXCEPTION_CONTINUE_EXECUTION;
}

int __cdecl _tmain(int argc, TCHAR **argv)
{
    // register a handler
    AddVectoredExceptionHandler(1, handler);
    // set TF (MSVC x86 inline assembly)
    __asm
    {
        pushfd
        or dword ptr[esp], 0x100 // set the Trap Flag
        popfd                    // Load the value into EFLAGS register
        nop
        mov eax, eax             // trigger EXCEPTION_SINGLE_STEP
    }
    return 3;
}
However, this one runs without crashing the client, and I don't know why. Is there a way to print EIP at the moment of a client crash? (The crash window does not show the EIP register.)
Which version of DR are you using? There was a problem with single-step which was solved in 47b56c854bd8d4137dad54e5cc124ab983ecb370 in PR #2295 for #2144. A test was added. If you have that patch in place, please examine the test and determine what is needed to be added to reproduce this problem.
I'm using the latest version of DR. The reproducer I wrote works fine, although I expected it to fail. If you have a separate Windows VM in which to run the malware under a DR client, you will get the same error I'm getting (every DR client I've tested crashes).
SHA256 of the malware: 7AA84B4CE4FBF937632D3008981C3EF8FF63E1FF846FDBB55060F3973D2507A9 It's inside ...artifacts.zip file: https://www.malware-traffic-analysis.net/2019/07/22/index.html
As the debugging tips at https://github.com/DynamoRIO/dynamorio/wiki/Debugging#debugging-tips-1 suggest, please run without any client, and please run a debug build.
So it sounds like you have an attempt at a minimal reproducer which actually works fine under DR, and we have a regression test of an app using a single step exception which also works fine. So there is something more to it here. I would suggest using debug build logs to narrow down what is happening.
"Is there a way to print EIP at the moment of a client crash (on the crash window there is no EIP register)?":
The EIP value is printed in a DR crash report as the crashing PC address, like this: internal crash at PC 0x00007fbadb094c36.
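To pull that address out of a saved crash report automatically, a small helper along these lines could be used. This is only a sketch: the function name parse_crash_pc is hypothetical, and it assumes nothing about the report beyond the "internal crash at PC 0x..." text quoted above (a real report contains more fields around it).

```c
#include <assert.h>
#include <inttypes.h>
#include <stdint.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical helper: looks for "internal crash at PC 0x<hex>" in one
 * line of a crash report. Returns 1 and stores the address in *pc when
 * the marker is found and followed by hex digits, otherwise returns 0. */
int parse_crash_pc(const char *line, uint64_t *pc)
{
    static const char marker[] = "internal crash at PC 0x";
    const char *hit;

    assert(line != NULL && pc != NULL);
    hit = strstr(line, marker);
    if (hit == NULL)
        return 0;
    /* Skip the marker (including "0x") and read the hex digits. */
    return sscanf(hit + sizeof(marker) - 1, "%" SCNx64, pc) == 1;
}
```

Calling it on each line of the report and printing the first match gives the PC to look up in a debugger or a map file.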
You keep talking about a "client crash": as suggested, please run without any client. Does that work? I.e., is this something in core DR, or something either related to client interface support inside DR or something caused by the client's code itself?
Without any client, I get a very similar crash report:
With a -debug build I get a bit more information:
Once it's established that the problem occurs with no client, continue running with no client (i.e., in the -debug run): no reason to introduce complexity back. The goal is to minimize.
I would suggest running with -loglevel 3 or -loglevel 4 in a debug build and looking at the precise sequence of events to figure out what is going on: how is this different from our regression test or your simple code?
The run times out with -loglevel 3.
@derekbruening "Any help you could provide...": While I have read your thesis and done some other research, I've only just started playing around with DR. I managed to create my own debug build, but that's about it right now. Would not mind trying to help, just don't know how...
Let me start with this: I also have a different approach for setting the trap flag (TF) and causing a single-step exception (one that brings down DR, with 2 dump files this time). No popf is used, and thus inserting a nop to keep the exception in the cache (as mentioned in #2144) may not be an option (and something still doesn't seem to work correctly, or this bug would not exist). Should I describe the approach in more detail here or file a separate ticket?
"Should I describe the approach in more detail here or file a separate ticket?":
Either way seems fine to me: making this an umbrella case for missing trap flag support, or separating.
I'll stick to this ticket for now. The x86 code to cause this 'other' issue with the trap flag goes something like this:
#include <windows.h>
#include <stdio.h>

int ExceptionFilter(_In_ EXCEPTION_POINTERS * pEx)
{
    if (STATUS_PRIVILEGED_INSTRUCTION == pEx->ExceptionRecord->ExceptionCode)
    {
        ++pEx->ContextRecord->Eip; // Skip HLT instruction.
        pEx->ContextRecord->EFlags |= 0x100; // Single step next instruction.
        return EXCEPTION_CONTINUE_EXECUTION;
    }
    else if (STATUS_SINGLE_STEP == pEx->ExceptionRecord->ExceptionCode)
    {
        return EXCEPTION_CONTINUE_EXECUTION; // Simply absorb.
    }
    return EXCEPTION_CONTINUE_SEARCH;
}

int main()
{
    __try
    {
        __asm hlt // Privileged in user mode: raises STATUS_PRIVILEGED_INSTRUCTION.
        printf("Success!\n");
    }
    __except (ExceptionFilter(GetExceptionInformation()))
    {
        printf("Failure!\n");
    }
    return 0;
}
Interestingly enough, DR does produce a fatal dump file in this case before the process goes down. Sample log and dump files generated during a 32-bit run are attached here: logs and dumps.zip. The exact sources (VS project) are here: FilterSingleStep.zip
Some malware changes a page's protection to PAGE_NOACCESS and then executes code from that page, which triggers an exception handler. Usually the handler is registered beforehand; it simply changes the protection back to RX (or RWX) and continues execution.
When a process runs under a DR client, the client crashes with an EXCEPTION_ACCESS_VIOLATION error instead of passing the exception to the process and continuing execution.
*More about the trick: https://secrary.com/Random/anti_re_simple/
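The mechanism can be illustrated without the sample. The sketch below is a POSIX analog, not the malware's code and not the Windows API: it uses mprotect and a SIGSEGV/SIGBUS handler where the original uses VirtualProtect and an exception handler, and it faults on a data read instead of an instruction fetch. All names (run_noaccess_demo, on_fault, guarded_page) are hypothetical. Calling mprotect from a signal handler is not guaranteed async-signal-safe by POSIX, although it works on common systems.

```c
#define _DEFAULT_SOURCE /* for MAP_ANONYMOUS on glibc */
#include <assert.h>
#include <signal.h>
#include <stdint.h>
#include <string.h>
#include <sys/mman.h>
#include <unistd.h>

static void *guarded_page;  /* page that is made inaccessible */
static size_t guarded_size;
static volatile sig_atomic_t faults_handled;

/* Fault handler: restore access and return, so the faulting access is
 * retried (the analog of changing the protection back and returning
 * EXCEPTION_CONTINUE_EXECUTION). */
static void on_fault(int sig, siginfo_t *info, void *uctx)
{
    uintptr_t addr = (uintptr_t)info->si_addr;
    uintptr_t base = (uintptr_t)guarded_page;
    (void)sig;
    (void)uctx;
    if (addr < base || addr >= base + guarded_size)
        _exit(1); /* fault outside our page: a real bug, do not loop */
    if (mprotect(guarded_page, guarded_size, PROT_READ | PROT_WRITE) != 0)
        _exit(2);
    faults_handled++;
}

/* Returns the number of faults handled (expected: 1), or -1 on failure. */
int run_noaccess_demo(void)
{
    struct sigaction sa, old_segv, old_bus;
    volatile unsigned char *p;
    unsigned char value;
    long page = sysconf(_SC_PAGESIZE);

    assert(page > 0);
    guarded_size = (size_t)page;
    guarded_page = mmap(NULL, guarded_size, PROT_READ | PROT_WRITE,
                        MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
    if (guarded_page == MAP_FAILED)
        return -1;
    p = guarded_page;
    p[0] = 42;

    /* Register the handler before removing access, as the trick does. */
    memset(&sa, 0, sizeof sa);
    sa.sa_sigaction = on_fault;
    sa.sa_flags = SA_SIGINFO;
    sigemptyset(&sa.sa_mask);
    sigaction(SIGSEGV, &sa, &old_segv);
    sigaction(SIGBUS, &sa, &old_bus); /* some systems report SIGBUS */

    faults_handled = 0;
    value = 0;
    if (mprotect(guarded_page, guarded_size, PROT_NONE) == 0)
        value = p[0]; /* faults once, then succeeds after the handler */

    sigaction(SIGSEGV, &old_segv, NULL);
    sigaction(SIGBUS, &old_bus, NULL);
    munmap(guarded_page, guarded_size);
    return value == 42 ? (int)faults_handled : -1;
}
```

A tool that sits between the application and the OS has to deliver such a fault to the application's handler and then resume at the same address, which is the behavior described above as failing under a DR client.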