[FEATURE] AntiDebug Software Breakpoints detection

cecio commented 1 year ago

I added a new detection for the AntiDebug feature, and in particular the Softfware Breakpoints detection.

Basically I tried to go through https://anti-debug.checkpoint.com/techniques/process-memory.html#breakpoints and noticed that most of the detection can be summarized in the checks for the presence of byte 0xCC somewhere. So I implemented a check for this byte for comparison statements (it should work for immediate, register and referenced memory compare).

I put this under AntiDebug lelvel 2 (Deep), since it could slow down and may be lead to some false positives.

Let me know if you think that any rework is needed.

Thanks a lot!

hasherezade commented 1 year ago

hi @cecio ! Sorry for the late response. I finally found some time to test your feature. At first I was worried that it's gonna introduce a significant slowdown in tracing, but it's actually not that bad. However, the feature itself doesn't seem to work as expected - it is very over-reactive. I tried to trace Al-Khasher with it, and it labeled multiple lines as "anti-stepover", that were irrelevant. For example:

overreactive

I can understand some false positives in case if accidentally, a comparison with 0xCC byte is done for other purposes than anti-debug, but here the byte labeled wasn't even 0xCC, which suggests that something is wrong with the implementation itself. Please have a closer look at why is it happening. I will also be testing it deeper, and let you know.

cecio commented 1 year ago

hey @hasherezade

No worries at all, no hurry on this :-)

Mmmmmhh....yeah, I have a couple of ideas about what is going on and may be this need to be adjusted. The comparison you pointed out could trigger the alarm if in RAX we have 0xCC, but it in this case it's pointless because comparing with an immediate, so it's for sure a false positive. But may be it's also something else.

Let me run some more tests and come back with more details and a more robust solution. Thanks in the meantime :-)

hasherezade commented 1 year ago

It seems to me that anyways, we should just check if the second argument, to which the first argument is compared, is 0xCC, rather than checking all the arguments. This is how the compiled snippet from the Checkpoint's example looks like in the compiled version:

compare_imm

So the whole checking function can be shorten to:


VOID AntiDbg::WatchCompareSoftBrk(const CONTEXT* ctxt, ADDRINT Address, ADDRINT insArg)
{
    PinLocker locker;

    const WatchedType wType = isWatchedAddress(Address);
    if (wType == WatchedType::NOT_WATCHED) return;

    INS ins;
    ins.q_set(insArg);

    if (INS_OperandCount(ins) < 2) {
        return;
    }

    bool isSet = false;
    const UINT32 opIdx = 1;

    if (INS_OperandIsImmediate(ins, opIdx))
    {
        UINT8 val = 0;
        if ((val = (INS_OperandImmediate(ins, opIdx) & 0xFF)) == 0xCC)
        {
            isSet = true;
        }
    }

    if (isSet) {
        LogAntiDbg(wType, Address, "Software Breakpoint comparison",
            "https://anti-debug.checkpoint.com/techniques/process-memory.html#anti-step-over");
    }
}

Please try to trace my my demo application antianalysis_demos (I added that testcase: https://github.com/hasherezade/antianalysis_demos/blob/master/main.cpp#L62), and have a look at the output.

What also distinguish such cases, is that, the same comparison is called over and over, at the same offset (because usually the breakpoints are checked not at a single address, but through the whole function/code block). So we can also use the repetition as an indicator. And log it only once, i.e. at third attempt, in order not to slow down too much and spam the tracelog with repeated lines.

cecio commented 1 year ago

You are right.

I did also some tests on it, and actually the compiler always translate these kind of comparisons as immediate value (even if you set the 0xCC as a stack variable).

In order to test the other cases I actually had to trick the compiler in producing some different code (like referenced values), with some ugly hacks. So, we can be reasonably sure that if the code is produced by a compiler, it will use an immediate value. But I initially thought to cover also "manual checks" (may be done inlining assembly code) done in a different way. We can decide if we want to be more conservative here (ignoring different cases) or more aggressive, with the risk of producing more false positives. Do you think it's better to go with the "immediate value" check only?

Thanks for your feedback!

hasherezade commented 1 year ago

Yes, I think it is better to go with the immediate value, because other cases are too rare to be worth it. But in any case, only operand 1 should be checked.

BTW, still I see some problem, that even if I specifically check if the immediate value is being used, it is not always reliable. In the case below, the comparison with RBP register was done, and Pin for some reason returned INS_OperandIsImmediate as true (this was the version of the function from the snippet I presented above)... I need to dig deeper to see why it is happening.

not_imm

Edit: I filtered out such cases by checking the operand size. Still it doesn't answer the question why in the first place Pin identified it as Immediate, while it was a register. Yet, the check for the operand size is a reasonable constraint here. This is how I check it now:

if (INS_OperandIsImmediate(ins, opIdx) && INS_OperandSize(ins, opIdx) == sizeof(UINT8))

And, just a sidenote - having this antidebug technique covered in TinyTracer isn't a very high priority. Due to the fact that Pin doesn't set breakpoints during tracing, this technique with 0xCC checks isn't able to disrupt Pin anyways. Listing them is only to provide an information about what problems may occur during debugging of such application.

hasherezade commented 1 year ago

@cecio - and this is my reworked version, please have a look:


/* ==================================================================== */
// Callback function to be executed when a compare is executed
/* ==================================================================== */

std::map<ADDRINT, size_t> cmpOccurrences;
VOID AntiDbg::WatchCompareSoftBrk(const CONTEXT* ctxt, ADDRINT Address, ADDRINT insArg)
{
    PinLocker locker;
    const WatchedType wType = isWatchedAddress(Address);
    if (wType == WatchedType::NOT_WATCHED) return;

    INS ins;
    ins.q_set(insArg);
    if (!ins.is_valid() || INS_OperandCount(ins) < 2) {
        return;
    }

    bool isSet = false;
    const UINT32 opIdx = 1;

    if (INS_OperandIsImmediate(ins, opIdx) && INS_OperandSize(ins, opIdx) == sizeof(UINT8))
    {
        UINT8 val = 0;
        if ((val = (INS_OperandImmediate(ins, opIdx) & 0xFF)) == 0xCC)
        {
            cmpOccurrences[Address]++;
            if (cmpOccurrences[Address] == 3) isSet = true;
        }
    }

    if (isSet) {
        LogAntiDbg(wType, Address, "Software Breakpoint comparison",
            "https://anti-debug.checkpoint.com/techniques/process-memory.html#anti-step-over");
    }
}

From what I checked it works fine. It tags what it's supposed to, and only once, not generating too much noise.

tag_once

Fragment:

30bd0;kernelbase.FlsGetValue
e04b;kernel32.SetLastError
1413c;ntdll.RtlLeaveCriticalSection
9f20;[ANTIDEBUG] --> Software Breakpoint comparison;https://anti-debug.checkpoint.com/techniques/process-memory.html#anti-step-over
14130;ntdll.RtlEnterCriticalSection
14130;ntdll.RtlEnterCriticalSection

I need to doublecheck it with the latest version of Al-Khaser now...

cecio commented 1 year ago

Looks great!

I'll run some tests with Al-Kasher as well (I'll do it this evening, now I need to leave the keyboard for a while :-)).

I have a doubt about the incorrect immediate check done by PIN you noticed: may be this has something to do with the INS passed as parameter: I did it to avoid to clutter the InstrumentInstruction in TinyTracer.cpp, but I'd like to check if it is actually the issue here.

hasherezade commented 1 year ago

@cecio - I checked with Al-Khaser, and at first I thought is was a bug, because although the implementation of this technique is there: https://github.com/LordNoteworthy/al-khaser/blob/master/al-khaser/AntiDebug/SoftwareBreakpoints.cpp - the tracer wasn't detecting it. However, upon a closer look I realized that this check is in fact a dead code, and never executed, so there is nothing to detect! I added a log line:

BOOL SoftwareBreakpoints()
{
    //NOTE this check might not work on x64 because of alignment 0xCC bytes
    size_t sSizeToCheck = (size_t)(Myfunction_Adresss_Next)-(size_t)(My_Critical_Function);
    PUCHAR Critical_Function = (PUCHAR)My_Critical_Function;
    printf("Checking breakpoints in the area of size: %x\n", sSizeToCheck);
    for (size_t i = 0; i < sSizeToCheck; i++) {
        if (Critical_Function[i] == 0xCC) // Adding another level of indirection : 0xCC xor 0x55 = 0x99
            return TRUE;
    }
    return FALSE;
}

And I see that the function to be checked is too short, and due to inlining or some optimization, the area that was supposed to be check is of size 0:

al-khaserchecks

I made a modification in Al-Khaser to ensure that the check will not be optimized out:

https://gist.github.com/hasherezade/38377585dc6f0b1bbdb80bd47e046261

And now the check is detected:

detected

At the same time, the earlier false positives are not detected, so all seems good.

hasherezade commented 1 year ago

@cecio - if you can give me a permission, I will push my reworked branch to your repository. Then you can test it, and if all is good, we will merge to the main repo. (I think this will be cleaner than me creating additional forks).

cecio commented 1 year ago

@hasherezade sure! I added you as collaborator on the repo.

hasherezade commented 1 year ago

thank you @cecio ! I pushed my branch as: https://github.com/cecio/tiny_tracer/tree/cc_fixed

hasherezade commented 1 year ago

BTW, the function that I changed in Al-Khaser will cause it to report Software Breakpoints being set ([BAD]), but this is because the 0xCC byte is a part of the code:

fp1

...and anyways, there is the INT3 padding added between this function and the next function that is used as the area terminator, and the implementation of this technique does not account for this, checking the area from the beginning of the first function, till the beginning of the other, not stopping on RET:

size_t sSizeToCheck = (size_t)(Myfunction_Adresss_Next)-(size_t)(My_Critical_Function);

func_end

But anyways, this is more about Al-Khaser, and not directly related to TinyTracer. The point is, TinyTracer detected the comparison where it was supposed to detect it.

cecio commented 1 year ago

@hasherezade your implementation works great.

All the test I ran worked fine. I tried to see if I was able to understand the issue of the misinterpreted INS_OperandIsImmediate, but no luck yet. Anyway your workaround is perfect. But I'll try to look again into it.

I merged your branch on the master. Thanks a lot!

hasherezade commented 1 year ago

@cecio - I am happy to hear! thank you for testing and for the merge. The issue with INS_OperandIsImmediate is a bit weird, and I will keep an eye on this if it doesn't cause any issues in the future. Maybe I will make some more tests when I get time. But from what I saw, it doesn't look like it is causing major problems. So, I am gonna merge it as is. Again, thank you for your contribution! :)

cecio commented 1 year ago

Thanks to you @hasherezade !

hasherezade commented 1 year ago

@cecio - I am still checking what caused that INS_OperandIsImmediate, and for the test, I disabled the counter, and also, now I print the disasembly of the instruction:

    if (isSet) {
        std::stringstream ss;
        ss << INS_Disassemble(ins);
        ss << " : Software Breakpoint comparison";
        LogAntiDbg(wType, Address, ss.str().c_str(),
            "https://anti-debug.checkpoint.com/techniques/process-memory.html#anti-step-over");
    }

The output for the incorrectly identified instruction is indeed incorrect and weird:

29070;[ANTIDEBUG] --> mov rdx, 0x7ff63d4636cc : Software Breakpoint comparison;https://anti-debug.checkpoint.com/techniques/process-memory.html#anti-step-over

disasm2

The disasm for the correct check looks better, but also the register is different SIL instead of CL:

9f20;[ANTIDEBUG] --> cmp sil, 0xcc : Software Breakpoint comparison;https://anti-debug.checkpoint.com/techniques/process-memory.html#anti-step-over

cmp1

So it seems that way of passing the Instruction (via ins.q()) didn't work as expected... I will see what can be done about it.

cecio commented 1 year ago

@hasherezade Yes, that was what I was suspecting as well. I was trying to reproduce the problem.

TBH we can avoid it: I used this trick to avoid to put too many code in InstrumentInstruction. But now that the checks are simplified, we can also put them in the InstrumentInstruction.

hasherezade commented 1 year ago

@cecio - what do you propose to use instead of ins.q()? BTW, I continue my tests, I see that in case of that invalid instruction, the address retrieved by INS_Address(ins) is also invalid (-1):

260ea;[ANTIDEBUG] --> VA = ffffffffffffffff : RVA = ffffffffffffffff : mov rcx, 0x7ff63d477ecc : MOV : Software Breakpoint comparison;https://anti-debug.checkpoint.com/techniques/process-memory.html#anti-step-over

while in case of the line where the correct check was made, the address is as the actual:

9f20;[ANTIDEBUG] --> VA = 7ff63d469f20 : RVA = 9f20 : cmp sil, 0xcc : CMP : Software Breakpoint comparison;https://anti-debug.checkpoint.com/techniques/process-memory.html#anti-step-over

Used code:

    if (isSet) {
        ADDRINT addr = INS_Address(ins);
        ADDRINT rva = addr_to_rva(addr);
        std::stringstream ss;
        ss << "VA = " << std::hex << addr;
        ss << " : ";
        ss << "RVA = " << std::hex << rva;
        ss << " : ";
        ss << INS_Disassemble(ins);
        ss << " : ";
        ss << INS_Mnemonic(ins);
        ss << " : Software Breakpoint comparison";
        LogAntiDbg(wType, Address, ss.str().c_str(),
            "https://anti-debug.checkpoint.com/techniques/process-memory.html#anti-step-over");
    }

So checking for those incorrect addressed can be used to filter out those invalid instructions. But still, it is not a perfect solution. And the correct code for those offsets is missing.

hasherezade commented 1 year ago

@cecio - I tried doing it another ways round, and it seems to work:

        if (m_Settings.antidebug >= ANTIDEBUG_DEEP) {
            // Check all comparison for 0xCC byte (anti stepinto/stepover checks)
            if (INS_Opcode(ins) == XED_ICLASS_CMP && INS_OperandCount(ins) >= 2 && INS_OperandIsImmediate(ins, 1)) {
                INS_InsertCall(
                    ins,
                    IPOINT_BEFORE, (AFUNPTR)AntiDbg::WatchCompareSoftBrk,
                    IARG_FAST_ANALYSIS_CALL,
                    IARG_INST_PTR,
                    IARG_ADDRINT, INS_OperandImmediate(ins, 1),
                    IARG_END);
            }
        }

Is it what you meant?

cecio commented 1 year ago

@hasherezade yes exactly.

Even because I think it could be a bit complicated. Looking at this post

https://stackoverflow.com/questions/55382659/intel-pin-analysis-routine-detects-ah-register-instead-of-rsp-reg-stack-ptr

looks like that INS at instrumentation time is not guaranteed to be kept at analysis time, especially if other instrumentation routines are called (which explain why I have different results in different tests I'm doing) The post says that the only way is to copy the actual bytes of the instructions in a buffer and then analyze them.

But, at this point I don't know if it make sense. I think that the checks at Instrumentation level are clean enough (and effective as well).

hasherezade commented 1 year ago

@cecio - this is what I did: https://github.com/hasherezade/tiny_tracer/commit/05a820c58fe79f90fe10e1a5d097087908be1220 It should be fine now, that invalid instruction no longer appears. It passed my quick tests. Please have a look too, if it passes all your tests.

And yeah, I am agree with you that we should avoid passing instruction to the instrumentation function.

cecio commented 1 year ago

@hasherezade

I saw that you left the size check in the instrumentation function

&& INS_OperandSize(ins, opIdx) == sizeof(UINT8)

If I try to run your demo (64 bit compiled) antianalysis_demo.exe I see the following:

In this case the immediate is 0xCCCCCCCC, the size check is going to fail. I think that this check can be removed at this point, the INS_OperandIsImmediate should be reliable now.

Did you see cases where this check is still required?

hasherezade commented 1 year ago

@cecio - I was thinking about removing this check, but then I decided to leave it, because anyways searching for breakpoints usually happens byte by byte. We cannot expect that breakpoints will be set as multiple in a row. The example you showed - are you sure that this is really related to searching for a breakpoint? Because it doesn't look so, maybe it is a different function... This is how the function that searches for breakpoints looks like (64-bit):

anti_check64

This is the binary that I used for the test: https://ci.appveyor.com/project/hasherezade/antianalysis-demos/build/job/1341hhwduqgffy0u/artifacts I have another one build locally, on different version of Visual Studio, and the function looks similarly.

64bit_check

Can you share with me the binary that you used? I will have a look what this check really is. You can drop it here...

cecio commented 1 year ago

@hasherezade my bad, you are right!!! That's actually another thing sorry!

I need to sleep, it was pretty clear it couldn't be that check also looking at the instruction

So, I think it works like a charm then.

Sorry....

hasherezade commented 1 year ago

@cecio - no worries! It is good you asked and we clarified this. and yeah, we worked till pretty late on this one. hope you had a good sleep. thanks!

cecio commented 1 year ago

@hasherezade yeah great! :-) Hope you as well! Thanks a lot!!

hasherezade / tiny_tracer

[FEATURE] AntiDebug Software Breakpoints detection #43