m4b / goblin

An impish, cross-platform binary parsing crate, written in Rust

Help obtaining bytes of a function in .TEXT #388

Closed vrmiguel closed 2 months ago

vrmiguel commented 8 months ago

Hey there @m4b @philipc !

First off, congrats on the awesome crate, it's super impressive. This issue is not a bug report or a feature request; I'd just like a bit of help with the following:

Given an ELF64 .so of a Postgres extension, I wish to find out which Postgres version the extension was built for. I know that every extension has a function called Pg_magic_func that returns a pointer to a struct (in .rodata, I assume) that contains this information.

In Rust, a Pg_magic_func would look something like:

        pub extern "C" fn Pg_magic_func() -> &'static Pg_magic_struct {
            const MY_MAGIC: Pg_magic_struct = Pg_magic_struct {
                len: size_of::<Pg_magic_struct>() as i32,
                version: 1500,
                ..,
            };

            &MY_MAGIC
        }
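As a rough illustration of what reading that struct from raw bytes could look like, here is a minimal sketch. It assumes that `len` and `version` are the first two fields, that both are 32-bit integers, and that the target is little-endian; `read_len_and_version` is a hypothetical helper name, not part of goblin or Postgres.

```rust
/// Hypothetical helper: decode the first two fields of a `Pg_magic_struct`
/// from raw bytes. Assumes `len` and `version` are consecutive 32-bit
/// little-endian integers at the start of the struct.
fn read_len_and_version(bytes: &[u8]) -> Option<(i32, i32)> {
    // Bail out with `None` if fewer than 8 bytes are available.
    let b = bytes.get(0..8)?;
    let len = i32::from_le_bytes([b[0], b[1], b[2], b[3]]);
    let version = i32::from_le_bytes([b[4], b[5], b[6], b[7]]);
    Some((len, version))
}
```

The slice passed in would be the part of the file buffer that starts at the struct's file offset.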

Using goblin, here's what I have to parse the ELF and obtain the relevant symbol:

fn find_postgres_version<P: AsRef<Path>>(library_path: &P) -> anyhow::Result<()> {
    let buffer = std::fs::read(library_path)?;
    let elf = match Object::parse(&buffer)? {
        Object::Elf(elf) => elf,
        Object::Unknown(magic) => {
            bail!("unknown magic: {:#x}", magic)
        }
        other => {
            bail!("Unsupported object type {other:?}, expected ELF");
        }
    };

    let magic_func = elf
        .dynsyms
        .into_iter()
        .find(|symbol| elf.dynstrtab.get_at(symbol.st_name) == Some("Pg_magic_func"))
        .with_context(|| "Failed to find Pg_magic_func")?;

    assert!(magic_func.is_function());

    let section_headers = elf.shdr_strtab.to_vec();
    let _ = dbg!(section_headers);

    let text_section = elf
        .section_headers
        .iter()
        .find(|sh| elf.shdr_strtab.get_at(sh.sh_name) == Some(".text"))
        .with_context(|| ".text section not found")?;

    let start = text_section.sh_offset + magic_func.st_value; // value is the offset in .TEXT
    let end = start + magic_func.st_size;

    let function_bytes = &buffer[start as usize..end as usize];

    print_bytes(function_bytes);

    let data = unsafe {
        jump_to_arbitrary(function_bytes)?
    };

    dbg!(data);

    Ok(())
}

However, I'm struggling to locate the function within .text (I figure it's in .text since that's what objdump tells me): the bytes I get don't mean anything useful once I disassemble them. Is my text_section being defined correctly?

This is my code to execute those bytes:

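unsafe fn jump_to_arbitrary(function_bytes: &[u8]) -> anyhow::Result<*const Pg_magic_struct> {
    use libc::{mmap, MAP_ANONYMOUS, MAP_PRIVATE, PROT_EXEC, PROT_READ, PROT_WRITE};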

    let ptr = unsafe {
        mmap(
            // Let kernel decide where to place it in virtual address space
            core::ptr::null_mut(),
            // How much memory to be allocated, in bytes
            function_bytes.len(),
            // Allocate readable, writable, and executable memory
            PROT_READ | PROT_WRITE | PROT_EXEC,
            // Memory is zero-filled
            // Memory is used only by this process
            MAP_ANONYMOUS | MAP_PRIVATE,
            // The following two arguments are not used for anonymous memory
            -1,
            0,
        )
    };

    let exec_ptr = match ptr {
        libc::MAP_FAILED => bail!("mmap failed"),
        valid_ptr => valid_ptr,
    };

    unsafe {
        libc::memcpy(exec_ptr, function_bytes.as_ptr() as * const _, function_bytes.len());
    }

    let func: extern "C" fn() -> *const Pg_magic_struct = std::mem::transmute(exec_ptr);
    Ok(func())
}

The code above works if I pass in shellcode that writes "Hello, world" to stdout, for example, so I'm assuming there's nothing wrong there.

Also, since this function would return me a pointer, I'm assuming I'd also have to fetch the pointer's content in the .so as well, is that correct? I haven't started doing this yet

Any help would be useful, and thanks a lot for the crate!

philipc commented 8 months ago

let start = text_section.sh_offset + magic_func.st_value;

This should probably be let start = text_section.sh_offset + (magic_func.st_value - text_section.sh_addr);.

Print out the bytes you are getting, and compare with the objdump -d output.
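The reason for the correction above is that, in a shared object, a symbol's st_value is a virtual address rather than an offset into its section. Subtracting the section's sh_addr gives the offset inside the section, and adding sh_offset turns that into a file offset. A minimal sketch of that arithmetic follows; `Section` and `symbol_file_offset` are hypothetical names used only for illustration, with fields that mirror the ELF section header.

```rust
/// Minimal stand-in for the three section header fields used here.
/// The field names mirror the ELF section header.
struct Section {
    sh_addr: u64,   // virtual address of the section
    sh_offset: u64, // offset of the section within the file
    sh_size: u64,   // size of the section in bytes
}

/// Translate a symbol's virtual address (`st_value`) into a file offset.
/// Returns `None` if the address does not fall inside this section.
fn symbol_file_offset(section: &Section, st_value: u64) -> Option<u64> {
    // Offset of the symbol relative to the start of the section.
    let delta = st_value.checked_sub(section.sh_addr)?;
    if delta >= section.sh_size {
        return None;
    }
    section.sh_offset.checked_add(delta)
}
```

With goblin, the three fields would come from the .text section header and st_value from the symbol. Returning `None` for an out-of-range address also catches the case where the symbol lives in a different section than expected.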

Also, since this function would return me a pointer, I'm assuming I'd also have to fetch the pointer's content in the .so as well, is that correct? I haven't started doing this yet

Yes, and you'd have to adjust the pointer by magic_func.st_value - exec_ptr because it'll be getting a pointer to the data using relative addressing and you're not running it at the expected address.
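To make the adjustment concrete: if the copied code computes its result relative to the instruction pointer, the value it returns is off by the distance between where it runs (exec_ptr) and where it was linked to run (st_value). The sketch below shows only that arithmetic, assuming the copied bytes start exactly at the function's first byte; `rebase_returned_pointer` is a hypothetical name.

```rust
/// Map a pointer returned by the relocated copy of the function back to
/// the virtual address it refers to inside the original shared object.
///
/// `returned` is the raw pointer value produced by the copy,
/// `exec_ptr` is the address the copy was executed at, and
/// `st_value` is the function's virtual address in the ELF file.
fn rebase_returned_pointer(returned: u64, exec_ptr: u64, st_value: u64) -> u64 {
    // Wrapping arithmetic, because the target may sit before the function.
    returned.wrapping_sub(exec_ptr).wrapping_add(st_value)
}
```

The result is still a virtual address, so it needs the same address-to-file-offset translation as the function itself, using whichever section contains it.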

Personally, I would disassemble the code instead of executing it. You wouldn't even need a full disassembler, just hard code the instructions that you expect to see.
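A minimal sketch of that idea for x86-64 could look like the following. It assumes the compiler emitted exactly `lea rax, [rip + disp32]` followed by `ret`, optionally preceded by `endbr64`; other compilers, optimization levels, or architectures may produce different code, in which case the function simply returns `None`. The name `magic_struct_vaddr` is hypothetical.

```rust
/// `endbr64`, which some toolchains emit at function entry.
const ENDBR64: [u8; 4] = [0xF3, 0x0F, 0x1E, 0xFA];
/// Opcode bytes of `lea rax, [rip + disp32]`; a 4-byte displacement follows.
const LEA_RAX_RIP: [u8; 3] = [0x48, 0x8D, 0x05];
/// `ret`
const RET: u8 = 0xC3;

/// Given the function's bytes and its virtual address, return the virtual
/// address loaded into `rax`, or `None` if the bytes do not match the
/// expected instruction sequence.
fn magic_struct_vaddr(code: &[u8], func_vaddr: u64) -> Option<u64> {
    // Skip an optional leading `endbr64`.
    let skipped = if code.starts_with(&ENDBR64) { ENDBR64.len() } else { 0 };
    let insn = code.get(skipped..)?;
    // Expect the 7-byte `lea` immediately followed by `ret`.
    if !insn.starts_with(&LEA_RAX_RIP) || insn.get(7) != Some(&RET) {
        return None;
    }
    let disp = i32::from_le_bytes([insn[3], insn[4], insn[5], insn[6]]);
    // RIP-relative operands are relative to the address of the next instruction.
    let next_insn = func_vaddr + skipped as u64 + 7;
    // Sign-extend the displacement before adding it.
    Some(next_insn.wrapping_add(disp as i64 as u64))
}
```

Because nothing is executed, the result is already a virtual address in the original image, so no adjustment for exec_ptr is needed.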