mahaloz / decomp2dbg

A plugin to introduce interactive symbols into your debugger from your decompiler
BSD 2-Clause "Simplified" License

Error "Unable to find the text segment base addr" if space present in binary path #69

Closed Angelo942 closed 11 months ago

Angelo942 commented 1 year ago

Hi, I have been chasing this error for the past two days and pinpointed it: it occurs when the local binary passed to gdb contains a space in its absolute path.

This happens in native gdb and with pwndbg too, but not with GEF. Unfortunately GEF has a similar problem when connecting to a gdbserver (https://github.com/hugsy/gef/issues/901#issuecomment-1357912965), so I have to work around it anyway.

I know that gdb has always had trouble with paths containing spaces, but this is the first time in a year of working with this setup that I have noticed it, so I don't know whether the problem lies in gdb or in this library.

mahaloz commented 1 year ago

Hi @Angelo942, to clarify, you are running a local binary (not remote), and you are getting this error if a space is in the binary path?

mahaloz commented 1 year ago

My immediate hunch is that this line is the cause of this issue. If you wanted to debug it before I can get to it (which will likely be this weekend), you can set a breakpoint here and see what's going on: https://github.com/mahaloz/decomp2dbg/blob/1073f2234d8317b86846b364884ed6e3c32d2006/decomp2dbg/clients/gdb/utils.py#L83

This is if you are running locally.

Angelo942 commented 1 year ago

I tested it both debugging a local process and a remote one. The bug is present every time we pass a copy of the binary to gdb.

So gdb ./challenge will never work, even if we then connect to a gdbserver, while for remote processes launching plain gdb (without the binary argument) works fine.

I took a look at the code and we have a problem here, where the path_name used is truncated: https://github.com/mahaloz/decomp2dbg/blob/1073f2234d8317b86846b364884ed6e3c32d2006/decomp2dbg/clients/gdb/utils.py#L60

I tried changing the parsing to:

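import re

import gdb  # only available inside gdb's embedded Python interpreter


def vmmap_base_addrs():
    """Map each file-backed mapping path to its lowest start address."""
    addr_maps = {}
    mappings = gdb.execute("info proc mappings", to_string=True).split("\n")
    for mapping in mappings:
        try:
            # the first hex number on the line is the mapping's start address
            addr = int(re.findall(r"0x[0-9a-fA-F]+", mapping)[0], 16)
            # keep everything from the first "/" so spaces in the path survive
            path = "/" + "/".join(mapping.split("/")[1:])
        except IndexError:
            continue

        # always use the lowest addr
        if path in addr_maps or path == "/":
            continue

        if addr and path:
            addr_maps[path] = addr

    return addr_maps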

This solves the problem for local processes, but not for remote ones. (I focus on that part because gdb ./challenge is how pwntools' gdb.debug launches gdb before connecting to the gdbserver)
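The parsing idea above can be checked outside gdb. The following is a minimal sketch, assuming a typical layout for the output of info proc mappings; the sample text, the paths, and the function names parse_base_addrs and naive_path are hypothetical and are not taken from decomp2dbg. It contrasts a whitespace-based split, which loses part of a path containing a space, with slicing from the first "/".

```python
import re

# Hypothetical sample of `info proc mappings` output; one path contains a space.
SAMPLE = """\
process 4242
Mapped address spaces:

          Start Addr           End Addr       Size     Offset  Perms  objfile
      0x555555554000     0x555555555000     0x1000        0x0  r--p   /home/user/my ctf/challenge
      0x555555555000     0x555555556000     0x1000     0x1000  r-xp   /home/user/my ctf/challenge
      0x7ffff7dd0000     0x7ffff7df2000    0x22000        0x0  r--p   /usr/lib/libc.so.6
      0x7ffffffde000     0x7ffffffff000    0x21000        0x0  rw-p   [stack]
"""


def naive_path(line):
    """Illustrative only: taking the last whitespace-separated field truncates spaced paths."""
    return line.split()[-1]


def parse_base_addrs(text):
    """Return {path: first start address seen} for file-backed mappings."""
    addr_maps = {}
    for line in text.splitlines():
        match = re.search(r"0x[0-9a-fA-F]+", line)
        slash = line.find("/")
        if match is None or slash == -1:
            # header lines and pseudo-mappings such as [stack] are skipped
            continue
        # everything from the first "/" to the end of the line is the path
        path = line[slash:].rstrip()
        # rows are listed in ascending order, so the first hit is the lowest address
        addr_maps.setdefault(path, int(match.group(0), 16))
    return addr_maps


if __name__ == "__main__":
    for path, addr in parse_base_addrs(SAMPLE).items():
        print(hex(addr), path)
```

Slicing from the first "/" relies on the path being the last column and on the earlier columns never containing a slash, which holds for the sample layout but is an assumption about gdb's output in general.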

https://github.com/mahaloz/decomp2dbg/blob/1073f2234d8317b86846b364884ed6e3c32d2006/decomp2dbg/clients/gdb/utils.py#L91

Now I'm trying to figure out why the hash is right with gdb, but not with gdb ./challenge. The only thing I have noticed so far is that in the first case the hash changes every time, while in the second both hashes [other_file_hash and binary_hash] are constant but different. I also tried a path that doesn't contain any spaces, and in that case the hashes are all constant.
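One way to narrow this down is to hash the two candidate files directly, outside gdb, and see whether the digests agree when the full path (spaces included) is passed intact. This is only a sketch: it assumes SHA-256 over the whole file contents, which may not match the hashing that decomp2dbg actually performs, and the helper names file_sha256 and same_file_contents are hypothetical.

```python
import hashlib


def file_sha256(path, chunk_size=65536):
    """Return the hex SHA-256 digest of the file at `path`, read in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as f:
        # iter() with a sentinel stops once read() returns an empty bytes object
        for chunk in iter(lambda: f.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def same_file_contents(path_a, path_b):
    """True if both files have identical contents according to SHA-256."""
    return file_sha256(path_a) == file_sha256(path_b)
```

If the digests match here but differ inside gdb, the difference more likely comes from which path or file each hash is computed over than from the file contents themselves.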

mahaloz commented 1 year ago

Linking this GEF issue which may fix some stuff for us: https://github.com/hugsy/gef/pull/999

Grazfather commented 1 year ago

This one too https://github.com/hugsy/gef/pull/998