panda-re / panda

Platform for Architecture-Neutral Dynamic Analysis
https://panda.re

PyPANDA `hook_symbol` not working occasionally #912

Closed fengjian closed 3 years ago

fengjian commented 3 years ago
@panda.hook_symbol("libc", None, procname="cat")
def hook_symbols(env, tb, h):
    procname = lookup_name(panda.current_asid(env), env)
    libname = ffi.string(h.sym.section).decode("utf-8", 'ignore')
    symname = ffi.string(h.sym.name).decode("utf-8", 'ignore')

    print(f"{procname} {libname} {symname}")

@blocking
def run_cmd():
    # First revert to root snapshot, then type a command via serial
    panda.revert_sync("root")
    print(panda.run_serial_cmd("uname -a"))

    print("Finding cat in cat's memory map:")
    maps = panda.run_serial_cmd("cat /proc/self/maps")
    for line in maps.split("\n"):
        if "cat" in line:
            print(line)
    panda.end_analysis()

Sometimes it works fine and the output is: cat libc malloc ...... cat libc _Exit

Other times it does not work and nothing is printed.

lacraig2 commented 3 years ago

Hi,

I think this might be related to PR #903. It may be worth pulling the docker container again.

The following script worked for me (slight modifications from your script):

from pandare import Panda, blocking
panda = Panda(generic="x86_64")

@panda.hook_symbol("libc", None, procname="cat")
def hook_symbols(env, tb, h):
    procname = panda.get_process_name(env)
    libname = panda.ffi.string(h.sym.section).decode("utf-8", 'ignore')
    symname = panda.ffi.string(h.sym.name).decode("utf-8", 'ignore')

    print(f"{procname} {libname} {symname}")

@blocking
def run_cmd():
    # First revert to root snapshot, then type a command via serial
    panda.revert_sync("root")
    print(panda.run_serial_cmd("uname -a"))

    print("Finding cat in cat's memory map:")
    maps = panda.run_serial_cmd("cat /proc/self/maps")
    for line in maps.split("\n"):
        if "cat" in line:
            print(line)
    panda.end_analysis()

panda.queue_async(run_cmd)
panda.run()

https://asciinema.org/a/yA409wwMUOuqlguiJenfcyzqW

Please try and see if pulling the latest image fixes this.

An earlier version of your post said that it was inconsistent. I've tried this code a dozen times and cannot reproduce that behavior.
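
For anyone trying to reproduce an intermittent failure like this, it may help to rerun the script in a loop and count how often the hook produced any output. The following is a minimal sketch, not part of PANDA: the helper names (`run_script`, `hook_fired`, `count_successes`) and the script path `test.py` are assumptions, and the check relies only on the `procname libname symname` line format printed by the callback above.

```python
import subprocess
import sys


def run_script(path):
    """Run a Python script once and return its combined stdout/stderr as text."""
    result = subprocess.run(
        [sys.executable, path],
        stdout=subprocess.PIPE,
        stderr=subprocess.STDOUT,
        text=True,
    )
    return result.stdout


def hook_fired(output, procname="cat", lib_prefix="libc"):
    """Return True if any line looks like '<procname> <libc...> <symbol>'."""
    for line in output.splitlines():
        parts = line.split()
        if len(parts) == 3 and parts[0] == procname and parts[1].startswith(lib_prefix):
            return True
    return False


def count_successes(run_once, attempts):
    """Call run_once() `attempts` times and count runs where the hook fired."""
    return sum(1 for _ in range(attempts) if hook_fired(run_once()))


# Hypothetical usage (assumes the script above is saved as test.py):
# successes = count_successes(lambda: run_script("test.py"), attempts=12)
# print(f"hook fired in {successes} of 12 runs")
```

Passing the runner in as a callable keeps the counting logic independent of how the guest is launched, so the same check can be applied to saved logs.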

fengjian commented 3 years ago

Hi

docker pull pandare/pandadev

docker image ls
REPOSITORY                   TAG                 IMAGE ID            CREATED             SIZE
pandare/pandadev             latest              ccb46043d811        25 hours ago        4.57GB

docker run -d  -it pandare/pandadev  /bin/bash
docker exec -it xxxxx  /bin/bash

/panda/panda/plugins/hooks/hooks.cpp in pandadev:

#define LOOP_ASID_CHECK(NAME, EXPR, COMPARATOR_TO_BLOCK)\
    hook_container.asid = asid; \
    it = NAME ## _hooks[asid].lower_bound(hook_container); \
    while(it != NAME ## _hooks[asid].end() && it->addr COMPARATOR_TO_BLOCK){ \
        auto h = (hook*)&(*it); \
        if (likely(h->enabled)){ \
            if (h->asid == asid){ \
                if (h->km == MODE_ANY || (in_kernel && h->km == MODE_KERNEL_ONLY) || (!in_kernel && h->km == MODE_USER_ONLY)){ \
                    EXPR \
                    if (!h->enabled){ \
                        it = NAME ## _hooks[asid].erase(it); \
                        continue; \
                    } \
                    /*memcpy((void*)&(*it), (void*)&h, sizeof(struct hook));*/ \
                } \
            } \
        } \
        ++it; \
    }

test.py

from pandare import Panda, blocking
panda = Panda(generic="x86_64")

@panda.hook_symbol("libc", None, procname="cat")
def hook_symbols(env, tb, h):
    procname = panda.get_process_name(env)
    libname = panda.ffi.string(h.sym.section).decode("utf-8", 'ignore')
    symname = panda.ffi.string(h.sym.name).decode("utf-8", 'ignore')

    print(f"{procname} {libname} {symname}")

@blocking
def run_cmd():
    # First revert to root snapshot, then type a command via serial
    panda.revert_sync("root")
    print(panda.run_serial_cmd("uname -a"))

    print("Finding cat in cat's memory map:")
    maps = panda.run_serial_cmd("cat /proc/self/maps")
    for line in maps.split("\n"):
        if "cat" in line:
            print(line)
    panda.end_analysis()

panda.queue_async(run_cmd)
panda.run()

Works fine:

cat libc-2.27.so fclose
cat libc-2.27.so _IO_un_link
cat libc-2.27.so _IO_file_close_it
cat libc-2.27.so _IO_unsave_markers
cat libc-2.27.so _IO_file_close
cat libc-2.27.so __close_nocancel
cat libc-2.27.so __close_nocancel
cat libc-2.27.so __close_nocancel
cat libc-2.27.so _IO_setb
cat libc-2.27.so _IO_un_link
cat libc-2.27.so _IO_file_finish
cat libc-2.27.so _IO_default_finish
cat libc-2.27.so __cxa_finalize
cat libc-2.27.so _exit

The hook works, but why is _exit reported 5 times?

cat libc-2.27.so _IO_default_finish
cat libc-2.27.so _IO_default_finish
cat libc-2.27.so _IO_default_finish
cat libc-2.27.so _IO_default_finish
cat libc-2.27.so __cxa_finalize
cat libc-2.27.so __cxa_finalize
cat libc-2.27.so __cxa_finalize
cat libc-2.27.so _exit
cat libc-2.27.so _exit
cat libc-2.27.so _exit
cat libc-2.27.so _exit
cat libc-2.27.so _exit
555555554000-55555555c000 r-xp 00000000 08:01 19                         /bin/cat
55555575b000-55555575c000 r--p 00007000 08:01 19                         /bin/cat
55555575c000-55555575d000 rw-p 00008000 08:01 19                         /bin/cat

Not working (no hook output):

root@5b9d41859096:/home# python test.py
using generic x86_64
Loading libpanda from /panda/build
PANDA[core]:os_familyno=2 bits=64 os_details=ubuntu:4.15.0-72-generic-noaslr-nokaslr
[PYPANDA] Panda args: [/panda/build/x86_64-softmmu/libpanda-x86_64.so -L /panda/build/pc-bios /root/.panda/bionic-server-cloudimg-amd64-noaslr-nokaslr.qcow2 -display none -m 1024 -serial unix:/tmp/pypanda_sin2ma8af,server,nowait -monitor unix:/tmp/pypanda_mx71wf42m,server,nowait]
PANDA[core]:loading required plugin hooks
PANDA[core]:initializing hooks
PANDA[core]:loading required plugin dynamic_symbols
PANDA[core]:initializing dynamic_symbols
PANDA[core]:loading required plugin osi
PANDA[core]:initializing osi
PANDA[core]:loading required plugin osi_linux
PANDA[core]:initializing osi_linux
PANDA[osi_linux]:W> kernelinfo bytes [20-23] not read
PANDA[core]:loading required plugin syscalls2
PANDA[core]:initializing syscalls2
PANDA[syscalls2]:using profile for linux x64 64-bit
PANDA[core]:loading required plugin syscalls2
PANDA[core]:/panda/build//x86_64-softmmu/panda/plugins//panda_syscalls2.so already loaded
Linux ubuntu 4.15.0-72-generic #81-Ubuntu SMP Tue Nov 26 12:20:02 UTC 2019 x86_64 x86_64 x86_64 GNU/Linux
Finding cat in cat's memory map:
555555554000-55555555c000 r-xp 00000000 08:01 19                         /bin/cat
55555575b000-55555575c000 r--p 00007000 08:01 19                         /bin/cat
55555575c000-55555575d000 rw-p 00008000 08:01 19                         /bin/cat

fengjian commented 3 years ago

test3.py

from pandare import Panda, blocking
panda = Panda(generic="x86_64")

@panda.hook_symbol("libc", None, procname="cat", name="hook_symbols")
def hook_symbols(env, tb, h):
    procname = panda.get_process_name(env)
    libname = panda.ffi.string(h.sym.section).decode("utf-8", 'ignore')
    symname = panda.ffi.string(h.sym.name).decode("utf-8", 'ignore')

    print(f"{procname} {libname} {symname}")

@panda.cb_asid_changed
def asid_changed(cpu, old, new):
    symbol_to_hook = "malloc"
    symbol = panda.ffi.new("char[]",bytes(symbol_to_hook,"utf8"))
    section_name = panda.ffi.NULL
    obj = panda.plugins["dynamic_symbols"].resolve_symbol(cpu, new, section_name, symbol)
    if obj.address != 0:
        print(f"asid changed: {panda.get_process_name(cpu)} {panda.ffi.string(obj.name)} {panda.ffi.string(obj.section)} 0x{obj.address:x}")
    else:
        print(f"asid changed: {panda.get_process_name(cpu)} be zero")

    return 0

@blocking
def run_cmd():
    # First revert to root snapshot, then type a command via serial
    panda.revert_sync("root")
    print(panda.run_serial_cmd("uname -a"))

    print("Finding cat in cat's memory map:")
    maps = panda.run_serial_cmd("cat /proc/self/maps")
    for line in maps.split("\n"):
        if "cat" in line:
            print(line)
    panda.end_analysis()

panda.queue_async(run_cmd)
panda.run()

Not working:

asid changed: rcu_sched b'malloc' b'ld-2.27.so' 0x7ffff7df04b0
asid changed: rcu_sched b'malloc' b'ld-2.27.so' 0x7ffff7df04b0
asid changed: systemd-journal b'malloc' b'ld-2.27.so' 0x7ffff7df04b0
asid changed: cat b'malloc' b'ld-2.27.so' 0x7ffff7df04b0
asid changed: rs:main Q:Reg b'malloc' b'ld-2.27.so' 0x7ffff7df04b0
asid changed: systemd-journal b'malloc' b'ld-2.27.so' 0x7ffff7df04b0
asid changed: gmain b'malloc' b'ld-2.27.so' 0x7ffff7df04b0
asid changed: cat b'malloc' b'ld-2.27.so' 0x7ffff7df04b0
asid changed: systemd-journal b'malloc' b'ld-2.27.so' 0x7ffff7df04b0
asid changed: cat be zero
asid changed: cat b'malloc' b'ld-2.27.so' 0x7ffff7df04b0
asid changed: bash b'malloc' b'ld-2.27.so' 0x7ffff7df04b0
asid changed: rcu_sched b'malloc' b'ld-2.27.so' 0x7ffff7df04b0
asid changed: jbd2/sda1-8 b'malloc' b'ld-2.27.so' 0x7ffff7df04b0
asid changed: jbd2/sda1-8 b'malloc' b'ld-2.27.so' 0x7ffff7df04b0
asid changed: swapper/0 b'malloc' b'ld-2.27.so' 0x7ffff7df04b0
555555554000-55555555c000 r-xp 00000000 08:01 19                         /bin/cat
55555575b000-55555575c000 r--p 00007000 08:01 19                         /bin/cat
55555575c000-55555575d000 rw-p 00008000 08:01 19                         /bin/cat

Hook working, but with some repeated calls:

asid changed: rcu_sched b'malloc' b'ld-2.27.so' 0x7ffff7df04b0
asid changed: rcu_sched b'malloc' b'ld-2.27.so' 0x7ffff7df04b0
cat libc-2.27.so _IO_un_link
cat libc-2.27.so _IO_file_close_it
cat libc-2.27.so _IO_unsave_markers
cat libc-2.27.so _IO_file_close
cat libc-2.27.so __close_nocancel
cat libc-2.27.so _IO_setb
cat libc-2.27.so _IO_setb
asid changed: rcu_sched b'malloc' b'ld-2.27.so' 0x7ffff7df04b0
asid changed: rcu_sched b'malloc' b'ld-2.27.so' 0x7ffff7df04b0
cat libc-2.27.so _IO_setb
cat libc-2.27.so _IO_un_link
cat libc-2.27.so _IO_file_finish
cat libc-2.27.so _IO_default_finish
asid changed: rcu_sched b'malloc' b'ld-2.27.so' 0x7ffff7df04b0
asid changed: rcu_sched b'malloc' b'ld-2.27.so' 0x7ffff7df04b0
cat libc-2.27.so __cxa_finalize
cat libc-2.27.so _exit
cat libc-2.27.so _exit
asid changed: rcu_sched b'malloc' b'ld-2.27.so' 0x7ffff7df04b0
asid changed: rcu_sched b'malloc' b'ld-2.27.so' 0x7ffff7df04b0
cat libc-2.27.so _exit
asid changed: cat b'malloc' b'ld-2.27.so' 0x7ffff7df04b0
asid changed: cat b'malloc' b'ld-2.27.so' 0x7ffff7df04b0
asid changed: rcu_sched b'malloc' b'ld-2.27.so' 0x7ffff7df04b0
asid changed: systemd-journal b'malloc' b'ld-2.27.so' 0x7ffff7df04b0
555555554000-55555555c000 r-xp 00000000 08:01 19                         /bin/cat
55555575b000-55555575c000 r--p 00007000 08:01 19                         /bin/cat
55555575c000-55555575d000 rw-p 00008000 08:01 19                         /bin/cat

lacraig2 commented 3 years ago

I have a theory: my catch after sys_execve-type calls is not correctly getting the auxiliary vector. If that fails, symbols won't be loaded until the next asid_changed callback triggers. A process as short-lived as cat is liable to finish before there's another ASID change.
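
One rough way to check this against the test3.py output above is to scan the log for ASID changes where `malloc` had not been resolved yet. The sketch below is an illustration only: the function name `unresolved_processes` is hypothetical, and it assumes the exact `asid changed: <process> be zero` message printed by test3.py. A match only shows that `resolve_symbol` returned address 0 at that ASID change; it does not by itself confirm the auxiliary vector theory.

```python
import re

# Matches the message printed by test3.py when resolve_symbol returns address 0.
# Process names may contain spaces (e.g. "rs:main Q:Reg"), hence the greedy group.
UNRESOLVED = re.compile(r"^asid changed: (.+) be zero$")


def unresolved_processes(log):
    """Return process names, in order, for which the symbol address was zero."""
    names = []
    for line in log.splitlines():
        match = UNRESOLVED.match(line.strip())
        if match:
            names.append(match.group(1))
    return names


sample = """\
asid changed: systemd-journal b'malloc' b'ld-2.27.so' 0x7ffff7df04b0
asid changed: cat be zero
asid changed: cat b'malloc' b'ld-2.27.so' 0x7ffff7df04b0
"""
print(unresolved_processes(sample))
```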

I don't have a solid theory about the repeated calls, though I have changed the before_block_exec call to a before_tcg_codegen hook. I know that it's possible for before_block_exec (bbe) to see repeated calls.
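
To quantify the repeats in a captured run, the hook output can be tallied per symbol. This is a minimal sketch that works on the printed text only: `symbol_hit_counts` and `repeated_symbols` are hypothetical helper names, and the parsing assumes the `procname libname symname` format printed by the callback. Note that a count above 1 is not proof of a duplicate callback, since a symbol can legitimately be called more than once.

```python
from collections import Counter


def symbol_hit_counts(output, procname="cat"):
    """Count how many times each symbol was reported for the given process."""
    counts = Counter()
    for line in output.splitlines():
        parts = line.split()
        # Hook lines have exactly three fields: process, library, symbol.
        if len(parts) == 3 and parts[0] == procname:
            counts[parts[2]] += 1
    return counts


def repeated_symbols(output, procname="cat"):
    """Return only the symbols reported more than once, with their counts."""
    return {sym: n for sym, n in symbol_hit_counts(output, procname).items() if n > 1}


sample = "\n".join(
    ["cat libc-2.27.so _IO_default_finish"] * 4
    + ["cat libc-2.27.so __cxa_finalize"] * 3
    + ["cat libc-2.27.so _exit"] * 5
)
print(repeated_symbols(sample))
```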

I've issued a PR #914 and have an example for it:

from pandare import Panda, blocking
panda = Panda(generic="x86_64")

@panda.hook_symbol("libc", None, procname="cat")
def hook_symbols(env, tb, h):
    procname = panda.get_process_name(env)
    libname = panda.ffi.string(h.sym.section).decode("utf-8", 'ignore')
    symname = panda.ffi.string(h.sym.name).decode("utf-8", 'ignore')

    print(f"{procname} {libname} {symname}")

@panda.ppp("proc_start_linux", "on_rec_auxv")
def rec_auxv(cpu, tb, auxv):
    print("got to proc_start_linux rec auxv")

@blocking
def run_cmd():
    # First revert to root snapshot, then type a command via serial
    panda.revert_sync("root")
    print(panda.run_serial_cmd("uname -a"))

    print("Finding cat in cat's memory map:")
    maps = panda.run_serial_cmd("cat /proc/self/maps")
    for line in maps.split("\n"):
        if "cat" in line:
            print(line)
    panda.end_analysis()

panda.queue_async(run_cmd)
panda.run()

The idea is that if rec_auxv runs, then hook_symbol should work normally. And with the changes in the new proc_start_linux plugin, it should be more reliable.

fengjian commented 3 years ago

Hi, very nice!

I've tried PR #914 and the latest image several times and cannot reproduce that behavior, but I think it needs more testing...

REPOSITORY                   TAG                 IMAGE ID            CREATED             SIZE
pandare/pandadev             latest              46c555952c59        11 hours ago        4.57GB
pandare/panda                latest              f4595489ec4e        11 hours ago        1.8GB

lacraig2 commented 3 years ago

Ok. With #914 merged I'm going to close this issue. If you see more inconsistency with the new proc_start_linux plugin I'd love to hear about it. Thanks.