unicorn-engine / unicorn

Unicorn CPU emulator framework (ARM, AArch64, M68K, Mips, Sparc, PowerPC, RiscV, S390x, TriCore, X86)
http://www.unicorn-engine.org
GNU General Public License v2.0

Segfault while mem_writing inside a mem read hook. #1865

Open futhewo opened 11 months ago

futhewo commented 11 months ago

Hello,

First, thanks for the great tool.

I wanted to implement a kind of man-in-the-middle on memory accesses. However, under certain conditions, calling uc_mem_write() inside a memory read hook leads to a segfault inside Unicorn's own code. I am using Unicorn at commit 7e4754ad008f0dac6d990a7a997a764062b35d04.

Please find below a minimal example that reproduces it:

#include <unicorn/unicorn.h>

#define ENTRYPOINT 0x10000001
#define SEGM0_ADDR 0x10000000
#define SEGM1_ADDR 0x10001000

char arm_code[] = { 209, 248, 0, 2, 10, 70 }; // ldr r0, [r1, #0x200]; mov r2, r1

static int _hook_mem_read(uc_engine* p_uc, uc_mem_type type, uint64_t addr, uint32_t size, int64_t value, void* p_user_data) {
    uint32_t data = 0xDEADBEEF;
    uc_mem_write(p_uc, addr, &data, size);
    return 1;
}

int main() {
    uc_engine* p_uc;
    uc_open(UC_ARCH_ARM, UC_MODE_THUMB | UC_MODE_LITTLE_ENDIAN, &p_uc);

    uc_mem_map(p_uc, SEGM0_ADDR, 0x1000, UC_PROT_EXEC | UC_PROT_READ);
    uc_mem_map(p_uc, SEGM1_ADDR, 0x1000, UC_PROT_READ);
    uc_mem_write(p_uc, SEGM0_ADDR, arm_code, sizeof(arm_code));

    uc_hook hook_mem_read;
    uc_hook_add(p_uc, &hook_mem_read, UC_HOOK_MEM_READ, _hook_mem_read, 0, 1, 0);

    uint32_t r1 = SEGM0_ADDR + 0x100;
    uc_reg_write(p_uc, UC_ARM_REG_R1, &r1);

    uc_emu_start(p_uc, ENTRYPOINT, 0, 0, 0);
    uc_close(p_uc);

    return 0;
}

In that example, if r1 points into SEGM1, the bug disappears. However, in some much more complicated cases, the bug also occurs when loading from a non-executable segment far away from the instruction currently being executed.

The crash is a segfault on the following instruction inside helper_le_ldul_mmu_arm, after the hook has been called:

mov eax, DWORD PTR [rax]

Oddly, rax holds the address 'addr' received by the hook, minus 1.

Maybe I am not using Unicorn the way it is meant to be used. If anyone has a better idea of how to do this MITM, I would greatly appreciate it.

Thank you all.

wtdcode commented 11 months ago

I have a feeling this is related to #1804, but I have no clue at the moment. It needs more investigation.

futhewo commented 11 months ago

@wtdcode thanks!

To anyone having this problem: I worked around it by patching the signature of Unicorn's UC_HOOK_MEM_READ_AFTER callback so that it receives a pointer to the value read (res) instead of the value itself. That way, I can modify the value in the callback. It does not update the memory, but that's manageable.

@wtdcode I can provide you a patch or something if you are interested.

wtdcode commented 11 months ago

This is not the expected way to use the API, though it could work for a while. Therefore, unfortunately, we won't accept your patch. But it's a bit weird that _AFTER doesn't crash, and I will have a look this week.

futhewo commented 11 months ago

It does not crash because, thanks to the aforementioned patch, I do not need to use uc_mem_write anymore. My hook looks like:

static int _hook_mem_read_after(uc_engine* p_uc, uc_mem_type type, uint64_t addr, uint32_t size, int64_t* p_value, void* p_user_data) {
    *p_value = 0xDEADBEEF;
    return 1;
}

And the mitm is done.

duyntk2000 commented 1 month ago

Hi @wtdcode, I believe the problem is that when the memory is not writable (it does not have UC_PROT_WRITE), uc_mem_write() triggers uc->readonly_mem() to mark the region writable.

This leads to the following call chain:

memory_region_set_readonly
-> memory_region_transaction_commit
-> tcg_commit
-> tlb_flush

The TLB flush invalidates the entry that load_helper() has already fetched, so the calculation of haddr becomes incorrect and the load accesses the wrong address:

//In .../unicornafl/unicorn/qemu/accel/tcg/cputlb.c
load_helper(CPUArchState *env, target_ulong addr, TCGMemOpIdx oi, uintptr_t retaddr, MemOp op, bool code_read, FullLoadHelper *full_load) {
    ...
    CPUTLBEntry *entry = tlb_entry(env, mmu_idx, addr);
    ...
    // The HOOK_MEM_READ callback runs here; uc_mem_write() inside it flushes the TLB
            ((uc_cb_hookmem_t)hook->callback)(env->uc, UC_MEM_READ, addr, size, 0, hook->user_data);
    ...
    haddr = (void *)((uintptr_t)addr + entry->addend); // entry was flushed, so haddr is computed from a stale addend
    res = load_memop(haddr, op); // load_memop() then reads from the wrong address
    ...
}
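The ordering problem described above can be illustrated with a small toy model. This is only a sketch under stated assumptions: every name in it (toy_cpu, toy_tlb_entry, toy_load_stale, toy_load_recheck, flushing_hook) is hypothetical and none of it is Unicorn or QEMU code. It models a single-entry TLB whose flush resets the addend to all ones, and compares consuming the addend after the hook without rechecking against re-validating the entry once the hook has returned. Whether re-validation is the right fix for Unicorn itself is not settled in this thread.

```c
#include <stdint.h>

/* Hypothetical toy model of the pattern; not Unicorn or QEMU code. */
typedef struct {
    uint64_t guest_page; /* page-aligned guest address, UINT64_MAX if invalid */
    uint64_t addend;     /* added to a guest address to get a host address */
} toy_tlb_entry;

typedef struct {
    toy_tlb_entry entry;     /* a single-entry "TLB" */
    uint64_t mapping_addend; /* authoritative translation used for refills */
} toy_cpu;

typedef void (*toy_hook)(toy_cpu *cpu);

/* Invalidate the entry; the addend becomes all ones, as in the trace. */
void toy_tlb_flush(toy_cpu *cpu) {
    cpu->entry.guest_page = UINT64_MAX;
    cpu->entry.addend = UINT64_MAX;
}

/* Refill the entry for the page containing addr. */
void toy_tlb_fill(toy_cpu *cpu, uint64_t addr) {
    cpu->entry.guest_page = addr & ~0xfffULL;
    cpu->entry.addend = cpu->mapping_addend;
}

int toy_tlb_hit(const toy_cpu *cpu, uint64_t addr) {
    return cpu->entry.guest_page == (addr & ~0xfffULL);
}

/* Problematic order: the entry is looked up before the hook and its
 * addend is used after the hook without checking that it is still valid. */
uint64_t toy_load_stale(toy_cpu *cpu, uint64_t addr, toy_hook hook) {
    if (!toy_tlb_hit(cpu, addr)) toy_tlb_fill(cpu, addr);
    toy_tlb_entry *entry = &cpu->entry;
    if (hook) hook(cpu);         /* may flush the TLB */
    return addr + entry->addend; /* wraps to addr - 1 after a flush */
}

/* Defensive order: re-validate (and refill) the entry after the hook. */
uint64_t toy_load_recheck(toy_cpu *cpu, uint64_t addr, toy_hook hook) {
    if (!toy_tlb_hit(cpu, addr)) toy_tlb_fill(cpu, addr);
    if (hook) hook(cpu);
    if (!toy_tlb_hit(cpu, addr)) toy_tlb_fill(cpu, addr);
    return addr + cpu->entry.addend;
}

/* A hook that flushes, standing in for a uc_mem_write() to read-only memory. */
void flushing_hook(toy_cpu *cpu) { toy_tlb_flush(cpu); }
```

With mapping_addend set to 0x7f1e29200000 and a load at 0x10000300, toy_load_stale returns the guest address minus 1, while toy_load_recheck returns the properly translated host address.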

Example:

//Before the hook (uc_mem_write())
addend:    0x7f1e29200000
addr:      0x100003fe
haddr:     0x7f1e392003fe

>>> Hook data read at 0x10000300

//After the hook (uc_mem_write())
addend:    0xffffffffffffffff
addr:      0x10000300
haddr:     0x100002ff

Segmentation fault
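
The numbers in this trace can be checked with plain unsigned 64-bit arithmetic. The sketch below assumes a 64-bit host and uses a hypothetical helper, host_addr, that mirrors the addr + entry->addend expression from load_helper(). An addend of 0xffffffffffffffff acts as -1 under wraparound, which matches the earlier observation that rax equals the hooked address minus 1.

```c
#include <stdint.h>

/* Hypothetical helper mirroring "addr + entry->addend" on a 64-bit host.
 * Unsigned addition wraps modulo 2^64, so an all-ones addend subtracts 1. */
uint64_t host_addr(uint64_t guest_addr, uint64_t addend) {
    return guest_addr + addend;
}
```

Feeding in the values from the trace, host_addr(0x100003fe, 0x7f1e29200000) gives 0x7f1e392003fe, and host_addr(0x10000300, 0xffffffffffffffff) gives 0x100002ff, which is not a valid host pointer and so the load faults.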

I would appreciate your insights on how to fix this, @wtdcode.