stevemk14ebr / PolyHook_2_0

C++20, x86/x64 Hooking Libary v2.0
MIT License
1.6k stars 226 forks source link

MemoryProtector on pointers created by new() #206

Open enginelesscc opened 3 months ago

enginelesscc commented 3 months ago

Hi, this screams error:

https://github.com/stevemk14ebr/PolyHook_2_0/blob/f4dbf7b79762e80d464f377654fdb708ebdb9ea1/sources/x64Detour.cpp#L721 https://github.com/stevemk14ebr/PolyHook_2_0/blob/f4dbf7b79762e80d464f377654fdb708ebdb9ea1/sources/x64Detour.cpp#L776

Similarly on x86:

https://github.com/stevemk14ebr/PolyHook_2_0/blob/f4dbf7b79762e80d464f377654fdb708ebdb9ea1/sources/x86Detour.cpp#L124 https://github.com/stevemk14ebr/PolyHook_2_0/blob/f4dbf7b79762e80d464f377654fdb708ebdb9ea1/sources/x86Detour.cpp#L132

We are changing protection on memory we dont own. This has unwanted side effects because it shares pages with other allocs, and may race with other threads or other detours that are done at the same time.

Solution to this is to replace any memory new/delete that has its protection mutated with VirtualAlloc/VirtualFree, given those require a full page at minimum, giving us absolute ownership. At that point we dont need MemoryProtector there anymore, which also fixes the trampoline losing execute permission at the end of scope

And given these are full pages, they should be cached/reused if possible so not every trampoline requires its own page.

stevemk14ebr commented 3 months ago

Trampolines will not lose permissions at end of scope as the last parameter to mem protector is set false. You are right though, we should not be stomping permissions of memory we don't own. Using virtualalloc as you suggest would commit one page per hook which is too large. We have a block allocator I think that x64 detour uses already somewhere, we can use this.

Thanks for pointing this out, it's a good fix

enginelesscc commented 3 months ago

Speaking of, why aren't we using asmjit's buffers? one static asmjit::JitRuntime asmjit_runtime; used for all trampolines

Rough examples:

https://github.com/EngineLessCC/polyhook2/blob/master/polyhook2/Detour/ADetour.hpp#L28 https://github.com/EngineLessCC/polyhook2/blob/master/sources/x86Detour.cpp#L136 https://github.com/EngineLessCC/polyhook2/blob/master/sources/x86Detour.cpp#L144 https://github.com/EngineLessCC/polyhook2/blob/master/sources/x64Detour.cpp#L245

But, if possible, why not replace the manual assembly with asmjit directly? :)

stevemk14ebr commented 3 months ago

This code was written before asmjit was a dependency (it was added for ILCallback). The majority of the trampoline is overwritten code copied to the trampoline from the original function. In this case we read existing instructions into an instruction object, relocate them, then write these directly to the trampoline. We could have done this with asmjit but I wanted to change the instructions as little as possible for reliability and correctness reasons and just do the relocation gymnastics. Asmjit would have involved more re-writing.

I'd prefer fixing this by sticking to the allocation issue. new is not good enough here, I agree with your original comment. We can use the FBAllocator. Reserve a large block with virtualalloc and set permissions X then the FBAllocator can split that up into smaller trampolines.