qilingframework / qiling

A True Instrumentable Binary Emulation Framework
https://qiling.io
GNU General Public License v2.0
5.07k stars 739 forks source link

Restoring a state with MMIO causes an exception #1136

Open LukeSerne opened 2 years ago

LukeSerne commented 2 years ago

Describe the bug Mapping an MMIO region, then saving the qiling state and then restoring the qiling state causes an exception. See the script below.

Sample Code

import qiling
from qiling.const import QL_ARCH

ql = qiling.Qiling(code=bytes(0x1000), archtype=QL_ARCH.ARM, ostype='linux')
ql.mem.map_mmio(addr=0, size=0x1000, read_cb=lambda *args: 0, write_cb=lambda *args: None)
qiling_state = ql.save()

# ...

ql.restore(qiling_state)

Actual behavior

$ python main.py 
Traceback (most recent call last):
  File "main.py", line 15, in <module>
    ql.restore(qiling_state)
  File "~/.local/lib/python3.8/site-packages/qiling/core.py", line 783, in restore
    self.mem.restore(saved_states["mem"])
  File "~/.local/lib/python3.8/site-packages/qiling/os/memory.py", line 275, in restore
    self.map_mmio(lbound, ubound - lbound, read_cb, write_cb, info=label)
  File "~/.local/lib/python3.8/site-packages/qiling/os/memory.py", line 547, in map_mmio
    self.ql.uc.mmio_map(addr, size, __mmio_read, read_cb, __mmio_write, write_cb)
  File "~/.local/lib/python3.8/site-packages/unicorn/unicorn.py", line 527, in mmio_map
    raise UcError(status)
unicorn.unicorn.UcError: Invalid memory mapping (UC_ERR_MAP)

Expected behavior No exception.

Additional context An exception occurs in unicorn because the MMIO memory region is mapped, while it is already mapped. I think the best way to handle this is to first unmap the existing memory region, and then map the MMIO region. I don't think simply ignoring the MMIO region if it overlaps with a region that is already mapped is possible, since the read/write handler might be different.

https://github.com/qilingframework/qiling/blob/f3e66ec290b8c7a0ee60bc2f2715ddc6e9389216/qiling/os/memory.py#L280-L299

Proposed patch

    def restore(self, mem_dict):
        """Restore saved memory content.
        """
        # clear existing memory map
        self.unmap_all()

        # restore RAM
        for lbound, ubound, perms, label, data in mem_dict['ram']:
            self.ql.log.debug(f'restoring memory range: {lbound:#08x} {ubound:#08x} {label}')

            size = ubound - lbound
            if self.is_available(lbound, size):
                self.ql.log.debug(f'mapping {lbound:#08x} {ubound:#08x}, mapsize = {size:#x}')
                self.map(lbound, size, perms, label)

            self.ql.log.debug(f'writing {len(data):#x} bytes at {lbound:#08x}')
            self.write(lbound, data)

        # restore MMIO
        for lbound, ubound, perms, label, read_cb, write_cb in mem_dict['mmio']:
            self.ql.log.debug(f"restoring mmio range: {lbound:#08x} {ubound:#08x} {label}")

            #TODO: Handle overlapped MMIO?
            self.map_mmio(lbound, ubound - lbound, read_cb, write_cb, info=label)
elicn commented 2 years ago

@wtdcode thoughts?

wtdcode commented 2 years ago

As the comment above shows, I just leave the situation to users, i.e. users are responsible for making sure there are no MMIO regions overlapped. The reason is exactly what he mentions: the handler might be different so unmapping an existing MMIO region might cause confusion, unexpected handlers and bugs hard to locate.

I personally prefer Linux ways, something like MAP_FIXED. In this case, we are "authorized" by users to unmap any existing regions, however current design doesn't have a good place for this argument and thus I left a TODO. ;p

LukeSerne commented 2 years ago

As the comment above shows, I just leave the situation to users, i.e. users are responsible for making sure there are no MMIO regions overlapped.

Not clearing the memory and leaving it to users is not a bad idea in and of itself, but if this is the approach that is taken, it should at least be documented somewhere that the user is responsible for clearing the memory before restoring saved states. Right now, users get errors that are also very hard to interpret (as I showed in my first message): unicorn.unicorn.UcError: Invalid memory mapping (UC_ERR_MAP) is very different from the cause of the error: "there still is memory mapped here".


I agree that it might be unexpected to unmap parts of the memory, but this applies to regular (not MMIO) memory as well. There, the restore function happily overwrites existing data. In fact, it will even write into unmapped memory if the user changed the memory mappings in a specific way after the save call (see script 1 below).

In addition, I think the real issue might be a bit deeper. I think the restore function should restore the memory state to exactly the way it was when save was called. With the current implementation, this is not the case. Script 2 (below) shows that the map_info is not the same at the time of save and the time of restore. In fact, any changes to memory regions that are mapped after the save (regardless of whether MMIO or not), are unaffected by the restore operation.

The proper fix (in my opinion) is to first clear the entire memory in the restore function, and only then re-map the previously mapped regions (and write the data to them). This avoids the exceptions in both scripts below, as well as the original exception in this issue. Completely resetting the memory on a restore call seems reasonable and expected from and end-user's perspective, since that way, the state after restore is exactly the same as the state at the save.


script 1:

import qiling
from qiling.const import QL_ARCH

ql = qiling.Qiling(code=bytes(0x1000), archtype=QL_ARCH.ARM, ostype='linux')
ql.mem.map(0, 0x2000)
qiling_state = ql.save()
ql.mem.unmap(0, 0x1000)
ql.restore(qiling_state)  # unmapped write error here

script 2:

import qiling
from qiling.const import QL_ARCH

ql = qiling.Qiling(code=bytes(0x1000), archtype=QL_ARCH.ARM, ostype='linux')
ql.mem.map(0, 0x1000)
qiling_state = ql.save()
ql.mem.map(0x3000, 0x1000)
ql.restore(qiling_state)
assert not ql.mem.is_mapped(0x3000, 0x1000)  # this assertion fails
wtdcode commented 2 years ago

The proper fix (in my opinion) is to first clear the entire memory in the restore function, and only then re-map the previously mapped regions (and write the data to them)

This sounds reasonable indeed, however, I can't remember the exact reason why I didn't do this at the time of writing some relevant code. I would have a look ASAP.

xwings commented 1 year ago

Close for now and this is related to #1137

LukeSerne commented 1 year ago

I don't get why this issue is closed. It is not fixed, and it's not a duplicate of #1137. #1137 is only related in the sense that both issues deal with MMIO regions. The root cause of the issues is quite different. This issue is caused by the fact that restoring a state does not clear the current state. Another symptom of this issue is demonstrated by the script below - the written value "Hello World" persists after the state restore.

import qiling
from qiling.const import QL_ARCH

ql = qiling.Qiling(code=b"", archtype=QL_ARCH.ARM, ostype='linux')
qiling_state = ql.save()

# Write "Hello World" at address 0x2000
ql.mem.map(0x2000, 0x1000)
ql.mem.write(0x2000, b"Hello World")

# Restore the state from before we wrote "Hello World"
ql.restore(qiling_state)

# We would expect the line below to raise a UC_ERR_READ_UNMAPPED error because
# in the state that we saved and just restored, the address 0x2000 is not mapped.
# Instead, the line below prints "Hello World", showing that the mapped memory
# still exists and is not restored to the qiling_state properly.
print(ql.mem.read(0x2000, 11))  # prints "bytearray(b'Hello World')"

In contrast, the root cause of #1137 is that the mmio_cbs field of ql.mem is not updated correctly when unmapping part of an MMIO region.

As such, I would reopen this issue and fix it. I even submitted a proposed patch in the initial post that fixes the root cause, so it would be great if that could be merged. As for #1137, it is a different issue that also requires a fix.

wtdcode commented 1 year ago

@LukeSerne Thanks for you report. I remember this has been fixed by @chinggg somewhere? @chinggg , could you confirm?

chinggg commented 1 year ago

@wtdcode Just tested my latest code, I find the restore error still exists since I never touched the implementation of restore.

Find another small error for ql = qiling.Qiling(code=b"", archtype=QL_ARCH.ARM, ostype='linux'), since code can be b"", all if ql.code in the loaders should be changed to if ql.code is not None