ronaldoussoren / macholib

macholib can be used to analyze and edit Mach-O headers, the executable format used by Mac OS X. It's typically used as a dependency analysis tool, and also to rewrite dylib references in Mach-O headers to be @executable_path relative. Though this tool targets a platform-specific file format, it is pure Python code that is platform and endian independent.
MIT License

Add registers to macholib.mach_o.thread_command #12

Open ronaldoussoren opened 10 years ago

ronaldoussoren commented 10 years ago

Original report by Asger Hautop Drewsen (Bitbucket: tyilo, GitHub: tyilo).


Currently you have to parse the payload of macholib.mach_o.thread_command yourself.

If the registers were exposed as attributes of the command, then instead of doing:

#!python

rip_offset = 2 * 4 + 16 * 8
rip_value = struct.unpack('<Q', thread_command[2][rip_offset:rip_offset+8])[0]

you could do:

#!python

rip_value = thread_command[1].rip
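
For readers wondering where `2 * 4 + 16 * 8` comes from, here is a minimal sketch of the arithmetic. It assumes the raw payload still starts with the two 32-bit `flavor` and `count` fields, that the file is little-endian, and that the registers follow in the order of the 64-bit x86 thread state. The helper `register_offset` and the synthetic payload are made up for illustration and are not part of macholib.

```python
import struct

# Register order of the 64-bit x86 thread state: 21 slots of 8 bytes each.
X86_64_REGISTERS = [
    'rax', 'rbx', 'rcx', 'rdx', 'rdi', 'rsi', 'rbp', 'rsp',
    'r8', 'r9', 'r10', 'r11', 'r12', 'r13', 'r14', 'r15',
    'rip', 'rflags', 'cs', 'fs', 'gs',
]

HEADER_SIZE = 2 * 4  # flavor and count, both 32-bit


def register_offset(name):
    """Byte offset of a register inside the payload (hypothetical helper)."""
    return HEADER_SIZE + X86_64_REGISTERS.index(name) * 8


# Synthetic payload: flavor 4, count 42, then every register holds its own index.
payload = struct.pack('<II', 4, 42) + struct.pack('<21Q', *range(21))

rip_offset = register_offset('rip')
rip_value = struct.unpack_from('<Q', payload, rip_offset)[0]
```

Since `rip` is the 17th register (index 16), the offset works out to the same expression as in the snippet above.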

ronaldoussoren commented 10 years ago

Original comment by Ronald Oussoren (Bitbucket: ronaldoussoren, GitHub: ronaldoussoren).


Do you have a patch for this?

And if not, do you have a reference to documentation where the structure of thread_command is described?

ronaldoussoren commented 10 years ago

Original comment by Asger Hautop Drewsen (Bitbucket: tyilo, GitHub: tyilo).


I don't have a patch, but you could take a look at MachOView's implementation: https://github.com/gdbinit/MachOView/blob/9863d6ef744a73a54f05eaa0323adc8707c10bf5/LoadCommands.mm#L885

ronaldoussoren commented 10 years ago

Original comment by Asger Hautop Drewsen (Bitbucket: tyilo, GitHub: tyilo).


The problem with implementing this is that the set of registers depends on the flavor field of the load command, so you can't just add them to the _fields_ tuple.

Here is some code that extracts the registers:

#!python

# Change this in mach_o.py
class thread_command(Structure):
    _fields_ = (
        ('flavor', p_uint32),
        ('count', p_uint32)
    )

#!python
import macholib.MachO
import macholib.ptypes  # used below for p_uint32, p_uint64 and sizeof

def get_registers(cmd_tuple):
    lc, cmd, data = cmd_tuple

    x86_THREAD_STATE32 = 0x1
    x86_THREAD_STATE64 = 0x4

    flavor = int(cmd.flavor)

    if flavor == x86_THREAD_STATE32:
        register_names = ['eax', 'ebx', 'ecx', 'edx', 'edi', 'esi', 'ebp', 'esp', 'ss', 'eflags', 'eip', 'cs', 'ds', 'es', 'fs', 'gs']
        register_type = macholib.ptypes.p_uint32
    elif flavor == x86_THREAD_STATE64:
        register_names = ['rax', 'rbx', 'rcx', 'rdx', 'rdi', 'rsi', 'rbp', 'rsp', 'r8', 'r9', 'r10', 'r11', 'r12', 'r13', 'r14', 'r15', 'rip', 'rflags', 'cs', 'fs', 'gs']
        register_type = macholib.ptypes.p_uint64
    else:
        return None

    register_size = macholib.ptypes.sizeof(register_type)
    expected_data_size = register_size * len(register_names)

    # count is expressed in 32-bit words, hence the factor of 4
    assert int(cmd.count) * 4 == expected_data_size, "Flavor doesn't match count"
    assert len(data) == expected_data_size, "Count doesn't match data length"

    registers = {}

    for offset, name in zip(range(0, len(data), register_size), register_names):
        registers[name] = register_type.from_str(data[offset:offset + register_size]) # Doesn't work currently: _endian_=cmd._endian_

    return registers
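
As a point of comparison, the same flavor-based dispatch can be sketched with only the standard `struct` module, which sidesteps the `_endian_` problem noted in the comment above by taking the byte order as an explicit argument. This is an illustrative sketch, not macholib code: the names `LAYOUTS` and `unpack_registers` are made up here, and the caller is assumed to pass `flavor`, `count` and the remaining payload bytes separately.

```python
import struct

X86_THREAD_STATE32 = 0x1
X86_THREAD_STATE64 = 0x4

# flavor -> (struct format character, register names in on-disk order)
LAYOUTS = {
    X86_THREAD_STATE32: ('I', [
        'eax', 'ebx', 'ecx', 'edx', 'edi', 'esi', 'ebp', 'esp',
        'ss', 'eflags', 'eip', 'cs', 'ds', 'es', 'fs', 'gs',
    ]),
    X86_THREAD_STATE64: ('Q', [
        'rax', 'rbx', 'rcx', 'rdx', 'rdi', 'rsi', 'rbp', 'rsp',
        'r8', 'r9', 'r10', 'r11', 'r12', 'r13', 'r14', 'r15',
        'rip', 'rflags', 'cs', 'fs', 'gs',
    ]),
}


def unpack_registers(flavor, count, data, endian='<'):
    """Return {register name: value}, or None for an unknown flavor."""
    layout = LAYOUTS.get(flavor)
    if layout is None:
        return None
    code, names = layout
    fmt = endian + str(len(names)) + code
    size = struct.calcsize(fmt)
    # count is expressed in 32-bit words
    if count * 4 != size:
        raise ValueError("flavor doesn't match count")
    if len(data) != size:
        raise ValueError("count doesn't match data length")
    return dict(zip(names, struct.unpack(fmt, data)))
```

Passing `endian='>'` would cover big-endian files, and unknown flavors still yield `None` as in the function above.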

ronaldoussoren commented 10 years ago

Original comment by Asger Hautop Drewsen (Bitbucket: tyilo, GitHub: tyilo).


The reference for the fields in the thread command is struct x86_state_hdr from <mach/i386/thread_status.h>.

The lists of registers can be found in <mach/i386/_structs.h> as _STRUCT_X86_THREAD_STATE32 and _STRUCT_X86_THREAD_STATE64.
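
To make the mapping to those C structs concrete, here is a rough `ctypes` mirror of the 64-bit layout. It is only a sketch: the class name is invented, the field names drop the leading underscores that the header may use, and little-endian data is assumed.

```python
import ctypes
import struct


class X86ThreadState64(ctypes.LittleEndianStructure):
    """Illustrative mirror of the 64-bit thread state: 21 unsigned 64-bit fields."""
    _fields_ = [(name, ctypes.c_uint64) for name in (
        'rax', 'rbx', 'rcx', 'rdx', 'rdi', 'rsi', 'rbp', 'rsp',
        'r8', 'r9', 'r10', 'r11', 'r12', 'r13', 'r14', 'r15',
        'rip', 'rflags', 'cs', 'fs', 'gs',
    )]


# Synthetic register data: every register holds its own index.
raw = struct.pack('<21Q', *range(21))
state = X86ThreadState64.from_buffer_copy(raw)
state_size = ctypes.sizeof(X86ThreadState64)
```

The structure is 168 bytes, which is 42 32-bit words, so a `count` of 42 is what the 64-bit flavor is expected to carry.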