danielplohmann / apiscout

This project aims at simplifying Windows API import recovery on arbitrary memory dumps
BSD 2-Clause "Simplified" License

Add output support for the call/jmp referencing addresses in JSON_FILE mode #37

Closed renzhexigua closed 2 years ago

renzhexigua commented 2 years ago

Hi Daniel,

I use this tool to patch and rebuild the IAT of a mapped PE dump file. It works very well, except that I have to find all call/jmp xxxx addresses manually in order to patch the operand address via the LIEF library.
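The manual search described above could be scripted roughly as follows. This is only a sketch, not ApiScout's implementation: it assumes a 32-bit image in its mapped layout (so buffer offsets equal RVAs), and the function name `find_iat_references` is made up for illustration. A raw byte-pattern scan can also hit data that merely looks like an instruction, so results would still need a sanity check.

```python
import struct
from typing import List

# Opcode + ModRM pairs for 32-bit indirect CALL and JMP through an absolute address.
INDIRECT_OPCODES = (b"\xff\x15", b"\xff\x25")


def find_iat_references(image: bytes, imagebase: int, iat_slot_rva: int) -> List[int]:
    """Return buffer offsets of `call/jmp dword ptr [slot]` candidates (hypothetical helper)."""
    # The operand of such an instruction is the absolute address of the IAT slot.
    target = struct.pack("<I", imagebase + iat_slot_rva)
    hits = []
    for opcode in INDIRECT_OPCODES:
        pattern = opcode + target
        pos = image.find(pattern)
        while pos != -1:
            hits.append(pos)
            pos = image.find(pattern, pos + 1)
    return sorted(hits)
```

For a mapped dump, the returned offsets can be used directly as RVAs of the referencing instructions.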

The core logic code is like:

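import lief

# Placeholder path: point this at the dumped PE image.
binary = lief.parse("/path/to/binary")

dInfos = [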
    {
        "offset": 25312,
        "apiAddress": 1993408960,
        "dll": "advapi32.dll(32 bit)",
        "api": "RegSetValueExA",
        "references": [ # !!! apiscout lacks this information, so I need to find these addresses manually or by script
            9257,
            9294
        ]
    },
]

# Rebuild IAT entries
for info in dInfos: 
    lib_name = info['dll'].split('(')[0]
    entry_name = info['api']
    lib = binary.get_import(lib_name)
    # create new library if not exist
    if not lib:
        lib = binary.add_library(lib_name)
    entry = lib.get_entry(entry_name)
    # add new function entry if not exist
    if not entry:
        entry = lib.add_entry(entry_name)

# Fix IAT-related call reference
imagebase = binary.optional_header.imagebase
for info in dInfos:
    lib_name = info['dll'].split('(')[0]
    entry_name = info['api']
    # iat_addr_va: LIEF reassigned address
    iat_addr_va = imagebase + binary.predict_function_rva(lib_name, entry_name)
    for insn_va in info['references']:
        # print(f'{lib_name}!{entry_name} called by 0x{insn_va:x}: (insn) call 0x{iat_addr_va:x}')
        # Actually, the mnemonics of insn calling IAT include: CALL, JMP, etc.
        # Ref: https://github.com/volatilityfoundation/volatility/blob/master/volatility/plugins/malware/impscan.py#L197-L219
        #
        # FF 15 dd cc bb aa    CALL DWORD PTR [0xaabbccdd]
        # FF 25 dd cc bb aa    JMP  DWORD PTR [0xaabbccdd]
        binary.patch_address(insn_va + 2, iat_addr_va, 4, binary.VA_TYPES.RVA)
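Since the referencing instruction can be either a CALL or a JMP, it may be worth checking the opcode bytes before overwriting the operand. Here is a minimal sketch that works on a plain `bytearray` instead of a LIEF binary; the helper name `patch_iat_reference` is hypothetical, and only the two 32-bit encodings `FF 15` and `FF 25` are assumed.

```python
import struct

# ModRM bytes that follow opcode 0xFF for 32-bit absolute indirect transfers.
MODRM_CALL = 0x15  # FF 15 imm32 -> call dword ptr [imm32]
MODRM_JMP = 0x25   # FF 25 imm32 -> jmp  dword ptr [imm32]


def patch_iat_reference(image: bytearray, insn_offset: int, new_slot_va: int) -> bool:
    """Rewrite the 4-byte operand of an indirect CALL/JMP in place.

    Returns False and leaves the buffer untouched if the bytes at
    insn_offset are not one of the two expected encodings.
    """
    if insn_offset < 0 or insn_offset + 6 > len(image):
        return False
    if image[insn_offset] != 0xFF or image[insn_offset + 1] not in (MODRM_CALL, MODRM_JMP):
        return False
    # The operand starts two bytes into the instruction (after FF 15 / FF 25).
    struct.pack_into("<I", image, insn_offset + 2, new_slot_va)
    return True
```

Returning a flag rather than raising makes it easy to log and skip any reference that does not decode as expected.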

Since ApiScout already supplies the basic API-related context info, why not add the cross-reference info as well, to make it more convenient and easier to automate for such a scenario?

So I changed it a little to fit my needs. This PR only affects the layout of the JSON output, i.e., it adds a field named references (following your convention, these are RVAs) that holds the addresses of all call/jmp instructions.

python scout.py /path/to/binary /path/to/db -o iats.json

[
    {
        "offset": 25312,
        "apiAddress": 1993408960,
        "dll": "advapi32.dll(32 bit)",
        "api": "RegSetValueExA",
        "references": [
            9257,
            9294
        ]
    },
    {
        "offset": 25316,
        "apiAddress": 1993402960,
        "dll": "advapi32.dll(32 bit)",
        "api": "RegQueryValueExA",
        "references": [
            9350,
            9428
        ]
    },
   ...snip...
]

The default console layout and the rendered result stay the same as before.

After that, we can import this JSON file directly and patch the binary data.
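Importing the JSON could look roughly like the sketch below. It assumes the layout shown above (a list of objects with `dll`, `api` and `references` keys); the helper name `load_references` and the inline `SAMPLE` string are illustrative only.

```python
import json

# Two entries copied from the example output above.
SAMPLE = """
[
    {"offset": 25312, "apiAddress": 1993408960, "dll": "advapi32.dll(32 bit)",
     "api": "RegSetValueExA", "references": [9257, 9294]},
    {"offset": 25316, "apiAddress": 1993402960, "dll": "advapi32.dll(32 bit)",
     "api": "RegQueryValueExA", "references": [9350, 9428]}
]
"""


def load_references(json_text: str) -> dict:
    """Map (dll, api) to the sorted, de-duplicated list of referencing RVAs."""
    refs = {}
    for entry in json.loads(json_text):
        dll = entry["dll"].split("(")[0]  # strip the "(32 bit)" suffix
        key = (dll, entry["api"])
        # Entries without a "references" key are treated as having none.
        refs.setdefault(key, []).extend(entry.get("references", []))
    return {key: sorted(set(rvas)) for key, rvas in refs.items()}


if __name__ == "__main__":
    for (dll, api), rvas in load_references(SAMPLE).items():
        print(dll, api, [hex(rva) for rva in rvas])
```

With a real output file, the same function can be fed the contents of iats.json read from disk.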

danielplohmann commented 2 years ago

Hey!

Thanks a lot for the PR - It's a good idea to expose this data!

I just had to fix and extend the tests, which the original PR had broken, so that they reflect this change.

Note that this change is also potentially breaking, as the ApiScout.crawl() result now contains an additional field. That only matters if someone assigns variables directly from the tuple, which I would assume is unlikely. I just wanted to ensure that information about this change is also properly reflected in the README.
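To illustrate the kind of breakage meant here: fixed-arity tuple unpacking fails as soon as a field is appended, while star-unpacking tolerates it. The record shapes below are hypothetical stand-ins, not the actual layout of the ApiScout.crawl() result.

```python
# Hypothetical record shapes; the real tuple layout of ApiScout.crawl() may differ.
old_record = (25312, 1993408960, "advapi32.dll(32 bit)", "RegSetValueExA")
new_record = old_record + ([9257, 9294],)  # same record with a references field appended


def unpack_strict(record):
    """Fixed-arity unpacking: raises ValueError once a field is appended."""
    offset, api_address, dll, api = record
    return offset, api


def unpack_tolerant(record):
    """Star-unpacking keeps working when trailing fields are added."""
    offset, api_address, dll, api, *extra = record
    return offset, api, extra
```

Callers that index into the tuple or use the star form should be unaffected by the extra field; only code written like `unpack_strict` would need adjusting.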