volatilityfoundation / volatility3

Volatility 3.0 development
http://volatilityfoundation.org/
Other
2.65k stars 453 forks source link

Non-canonical address masking #699

Closed paulkermann closed 1 year ago

paulkermann commented 2 years ago

Currently volatility3 returns non-canonical results when de-referencing addresses (like pointers). It means by default pointers are masked with their layer's address_mask. This makes pointer comparison harder, as it is inconsistent with the way the actual OS and hardware handles addresses.

I believe we should use complete addresses (non-masked addresses) as the conventional address format, so as not to require additional logic in address handling. It will make pointer comparison more consistent with the way it is implemented in the OS.

I'm willing to implement this solution myself as needed, looking forward to hearing your thoughts.

ikelos commented 2 years ago

Could you please expand on "so as not to require additional logic in address handling"? So far there isn't a clear concrete reason for this change and it alters very deep parts of volatility across all intel spaces (32-bit, pae and 64-bit). There needs to be extremely good reason for that, and at the moment it's not clear what additional logic is needed and in what circumstances. Please provide some documentation and a clear, concrete example of the problem this is designed to solve.

paulkermann commented 2 years ago

There are several reasons I think we should consider using canonical addresses instead: 1) More often than not, Volatility plugins need the complete canonical addresses. When the crash dump writer has to write the KDBG address, for example, the code says dump_header.KdDebuggerDataBlock.write(kdbg.vol.offset | (0 if not is_64_bit else 0xFFFF000000000000)), because a dump file with a masked pointer in the header would be invalid. That's additional code you had to add because addresses are not canonical by default. Another relevant example is in the KDBG decryption: swap_xor = datablockencoded.vol.offset | 0xFFFF000000000000, and the _decode_pointer function in handles.py, where a commented assertion of a canonical exists to or it if needed. 1) It's how the hardware works: in AMD64 translating a non-canonical address causes a GP fault. Masked addresses are incompatible with the way the hardware uses addresses. Additionally, asserting that addresses are canonical could also improve structures found via scanning which should include valid pointers. 1) To be honest, I actually can't think of a reason a plugin would rather have a masked address and not the actual canonical one. Essentially every pointer in an Intel architecture is now potentially an invalid address forced into the address space of the current layer. Masking every pointer and every address is unnecessary logic which makes using the addresses later a pain and sometimes incorrect.

I think keeping addresses the same, and instead verifying their canonisation is just the way Intel layer should work.

paulkermann commented 2 years ago

As the #702 PR was closed, let's continue the discussion here (linking some related issues: #731 #739)

I'd like to offer another use case in which non-canonical addresses are problematic: emulation using Unicorn. A kernel function I emulated retrieves function pointers relative to the current RIP. The addresses that were retrieved became non-canonical too, which made the emulated verification of those addresses (via CFG bitmap) fail - the addresses were invalid. I had to implement additional logic to make sure addresses were in their canonical form when using emulation.

I think that fix should be framework-wide, and not implemented on a plugin-by-plugin basis. Some additional reasons why working with canonical addresses is the right move: 1) Other memory forensics and debugging tools work with canonical addresses (linux's crash, windbg and more). 1) Verifying addresses canonization could reveal bugs that went "under the hood" because that information. 1) Some programs store information in the "canonical bits", because those bits aren't used in address translation. By masking those out we could be filtering out extra information.

IridiumXOR commented 1 year ago

This problem affect also the use of Volatility 3 in tools like PANDA.

ikelos commented 1 year ago

There's now a pull request to with functions to support converting between canonical addresses, and the contiguous address spaces that volatility layers inhabit. This is #905 and contains two function for converting canonicalizing and decanonicalizing addresses. Hopefully this is a reasonable compromise, and it doesn't preclude us altering the layer in the future, but it also doesn't change the current mechanism drastically, but just adds support for specific use-cases.

paulkermann commented 1 year ago

I think that linking #905 to this issue is not the right thing (IE don't make it close this issue). It's true that this is a move in the right direction. However the main problem still remains: we will still work with invalid addresses.

702 yielded 2 bugfixes in the codebase and, if merged could drastically improve the way we scan or verify structs found in memory.

ikelos commented 1 year ago

905 provides helpers that improve the situation significantly and makes the likelihood of deep changes within the intel layer even less likely. #702 has been closed, see this comment for the reasons why.

This issue is about pointer comparisons, and returning canonicalized addresses for comparison. Volatility (version 2 if not version 3) has been worked on since 64-bit systems were just being introduced, and along the way we encountered bugs in our plugins where pointer comparisons failed because some addresses were canonical and others were not. The design decision was therefore made to include pointer masking to ensure all pointers fit within their appropriate address spaces. Pointer masking has been present in volatility 3 since roughly 2013.

Canonicalization was later standardized across intel architectures, but the volatility pointer masking mechanism is suitable for the issue at hand. Canonicalization requires dealing with one way or another whenever there are cases where numbers are taken and used as pointers. Some of those will be canonical, some will not, you therefore always need to apply some process to numbers that are being compared to pointers. The expectation is that a pointer object will be constructed which will take care of this issue. In those instances where constructing a pointer would be overkill #905 provides helper functions to convert back and forth. It still requires you to know whether the addresses are canonical or not.

This is how volatility operates, and whether you agree or disagree with the solution, changing it will impact other people who've come to rely on it. We therefore devised #905 to attempt to help with the situation without impacting existing users.

Please accept that without quantifiable evidence that your solution is significantly better (at least performance-wise) it will not be accepted. Also be aware that every time you raise the issue and require another hour to be drained into replying to your request, in order to explain to you that this issue is not one we're intending to pursue, you reduce the chances of convincing us to contribute even more of our time to alter the operation of the tool to one that you find more acceptable but have been unable to convince us is necessary.

I do think #905 will offer a sufficient solution to this issue, and am linking it to this issue again. When it is merged, this issue will also be resolved.