hasherezade / pe_to_shellcode

Converts PE into a shellcode
https://www.youtube.com/watch?v=WQCiM0X11TA
BSD 2-Clause "Simplified" License

A little improvements (DCP) #19

Open AndyWatterman opened 3 years ago

AndyWatterman commented 3 years ago

Thanks for your amazing repo. I'm not an expert, but I would like to suggest a couple of small improvements.

  1. Your shellcode calls VirtualAlloc. Some Windows processes have the "Dynamic code prohibited" (DCP) option enabled, so this call fails there. This case can be handled if the memory is allocated by an external process: the external process allocates memory for shellcode + image size, then the shellcode checks whether its own allocation was unsuccessful and, if so, points RAX to the end of the shellcode, where the memory for the image is located.

  2. Before calling the EP you flush the instruction cache. Again, you can't do this in processes with DCP. You probably don't need this call: the newly mapped instructions have never been executed before, so they are not in the processor cache, and this is not "self-modifying" code. Moreover, because this code targets Windows platforms, it only runs on a limited set of processors, and in most cases (always?) the flushing is done automatically.

With these two changes it is possible to run this shellcode in DCP processes.
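The fallback from point 1 can be modelled in a few lines. This is only a sketch of the decision logic, not the stub's actual code: the function names are hypothetical, and rounding the shellcode size up to a page boundary is an assumption made here so that the image starts page-aligned.

```python
def align_up(value: int, alignment: int) -> int:
    """Round value up to the next multiple of alignment."""
    return (value + alignment - 1) // alignment * alignment


def injector_allocation_size(shellcode_size: int, image_size: int,
                             page_size: int = 0x1000) -> int:
    """Bytes the external (injecting) process reserves: stub + room for the image.

    Assumption: the stub is padded to a page boundary before the image area.
    """
    return align_up(shellcode_size, page_size) + image_size


def choose_image_base(alloc_result: int, shellcode_base: int,
                      shellcode_size: int, page_size: int = 0x1000) -> int:
    """Pick where the image gets mapped.

    alloc_result models the value returned by the stub's own allocation
    attempt; 0 means the allocation failed (for example under DCP).
    """
    if alloc_result:
        return alloc_result
    # Fall back to the memory the injector placed right after the shellcode.
    return shellcode_base + align_up(shellcode_size, page_size)
```

With a stub of 0x1234 bytes injected at 0x10000, a failed allocation would make the image start at 0x12000 in this model.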

hasherezade commented 3 years ago

hi! thanks for drawing my attention to this problem. I will try to solve it whenever I get some free time to work on pe_to_shellcode again.

  1. In fact, remapping the PE from raw into virtual format in memory is somewhat redundant - the converter could just do section remapping in such a way that raw format would be the same as virtual. Example:

[image: remapped_sections]

This way, the original PE could still be run like a normal executable, and yet loading it in memory would not require allocating additional space for remapping. With this change, the injecting process allocates the memory once, implanting the shellcodified PE there, and all the stub has to do is apply relocations and fill the imports.

  2. Yes, you are right: although flushing the cache is recommended as good design, it can just as well be avoided.

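The idea from point 1 (lay the file out so that raw offsets equal virtual addresses, then only patch relocations in place) can be illustrated with a small model. This is a hedged sketch, not the converter's implementation: the `Section` class and both function names are hypothetical, the section table in the real headers would also need updating, and import resolution is left out. The relocation entry layout (type in the high 4 bits, page offset in the low 12) and the type constants follow the PE format.

```python
import struct
from dataclasses import dataclass

IMAGE_REL_BASED_ABSOLUTE = 0   # padding entry, nothing to patch
IMAGE_REL_BASED_HIGHLOW = 3    # 32-bit absolute address
IMAGE_REL_BASED_DIR64 = 10     # 64-bit absolute address


@dataclass
class Section:
    name: str
    virtual_address: int  # RVA of the section once loaded
    raw_offset: int       # offset of the section data in the file
    raw_data: bytes


def remap_raw_to_virtual(headers, sections, size_of_image):
    """Build a file image in which every section sits at its RVA."""
    image = bytearray(size_of_image)
    image[:len(headers)] = headers
    remapped = []
    for sec in sections:
        start = sec.virtual_address
        # Assumes the raw data fits inside the section's virtual range.
        image[start:start + len(sec.raw_data)] = sec.raw_data
        remapped.append(Section(sec.name, start, start, sec.raw_data))
    return image, remapped


def apply_relocations(image, reloc_rva, reloc_size, delta):
    """Patch absolute addresses in place; delta = new base - preferred base."""
    pos, end = reloc_rva, reloc_rva + reloc_size
    while pos + 8 <= end:
        page_rva, block_size = struct.unpack_from("<II", image, pos)
        if block_size < 8:
            break  # malformed block, stop instead of looping forever
        for entry_pos in range(pos + 8, pos + block_size, 2):
            (entry,) = struct.unpack_from("<H", image, entry_pos)
            kind, target = entry >> 12, page_rva + (entry & 0x0FFF)
            if kind == IMAGE_REL_BASED_HIGHLOW:
                (value,) = struct.unpack_from("<I", image, target)
                struct.pack_into("<I", image, target,
                                 (value + delta) & 0xFFFFFFFF)
            elif kind == IMAGE_REL_BASED_DIR64:
                (value,) = struct.unpack_from("<Q", image, target)
                struct.pack_into("<Q", image, target,
                                 (value + delta) & 0xFFFFFFFFFFFFFFFF)
        pos += block_size
```

Because the remapped buffer already has the virtual layout, RVAs can be used directly as offsets into it, which is what removes the need for a second allocation.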
AndyWatterman commented 3 years ago

Yeah, your idea of making the raw layout equal to the virtual one seems better, as long as the size of the resulting image does not matter.

Since the previous ideas were not rejected, I will suggest one more. Today I tried to use pe_to_shellcode in a WOW64 environment. It also failed, for a couple of reasons. First, you are using GetProcAddress from Kernel32, which may not be loaded at all. Hence, a case like injecting x64 shellcode into a WOW64 process also fails. For the x64 version you could probably use Ntdll and then load all the dependencies. That way it will work fine in most cases.

hasherezade commented 3 years ago

Are you sure that it failed because Kernel32 was not loaded? It seems odd... In a normal scenario, a 32 bit version of Kernel32 will be loaded in a WoW64 process... We will have 2 versions of NTDLL, 32 and 64 bit, and 1 of Kernel32.

[image: sample_list]

I need more data to reproduce your specific scenario.

AndyWatterman commented 3 years ago

Hm... It crashes at the "crc_outer" label while trying to xor. I guess something is wrong with getting the kernel32 base.

Are we talking about injecting x64 shellcode into a WOW64 process? If yes, there is no x64 version of kernel32 by default, so there is no GetProcAddress, and it is not possible to parse the x64 kernel32 exports... You would have to load it manually (your picture shows the same: only the x32 version).

Moreover, you should check whether kernel32 is present anyway, since you're using GetProcAddress.
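The presence check suggested here can be sketched as follows. This is a model only: the dictionary of loaded modules stands in for the loader's module list, the function names are hypothetical, and `zlib.crc32` is a stand-in checksum, since the thread does not show which algorithm the stub uses at the "crc_outer" label.

```python
import zlib


def name_hash(name: str) -> int:
    """Stand-in checksum (assumption: the stub's real algorithm may differ)."""
    return zlib.crc32(name.encode("ascii"))


def find_module(loaded_modules: dict, wanted_hash: int):
    """Return the export table of the module whose name matches, else None."""
    for module_name, exports in loaded_modules.items():
        # Module names are compared case-insensitively.
        if name_hash(module_name.lower()) == wanted_hash:
            return exports
    return None


def resolve(loaded_modules: dict, module_hash: int, function_hash: int):
    """Look up a function address, returning None instead of crashing."""
    exports = find_module(loaded_modules, module_hash)
    if exports is None:
        return None  # module (for example kernel32) is not loaded
    for function_name, address in exports.items():
        if name_hash(function_name) == function_hash:
            return address
    return None
```

A caller that receives `None` can then bail out or try another route, rather than dereferencing an invalid module base.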

hasherezade commented 3 years ago

No, I was talking about the injection of 32 bit code into a WoW64 process. This tool is intended for simple injections: 32 -> 32, and 64 -> 64, not for Heaven's Gate. Implementing the preparation of the full environment for loading 64 bit shellcodes from a 32 bit process is much more elaborate, and out of scope of this small tool.
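An injector could guard against the unsupported combinations before injecting. The sketch below is an illustration, not part of pe_to_shellcode: the function names are hypothetical, and the target's bitness is assumed to be known by the caller. It reads the `Machine` field from the PE file header, whose offsets and values come from the PE format.

```python
import struct

IMAGE_FILE_MACHINE_I386 = 0x014C
IMAGE_FILE_MACHINE_AMD64 = 0x8664


def payload_bitness(pe: bytes) -> int:
    """Return 32 or 64 based on the Machine field of the PE file header."""
    if pe[:2] != b"MZ":
        raise ValueError("not an MZ executable")
    (e_lfanew,) = struct.unpack_from("<I", pe, 0x3C)
    if pe[e_lfanew:e_lfanew + 4] != b"PE\0\0":
        raise ValueError("PE signature not found")
    (machine,) = struct.unpack_from("<H", pe, e_lfanew + 4)
    if machine == IMAGE_FILE_MACHINE_I386:
        return 32
    if machine == IMAGE_FILE_MACHINE_AMD64:
        return 64
    raise ValueError(f"unsupported machine type: {machine:#06x}")


def is_supported(payload_bits: int, target_bits: int) -> bool:
    """Only same-bitness injections (32 -> 32, 64 -> 64) are in scope."""
    return payload_bits == target_bits
```

For a WoW64 target the caller would pass `target_bits=32`, so a 64 bit payload is rejected up front.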

AndyWatterman commented 3 years ago

Shouldn't it be enough for x64 to use LdrLoadDll+LdrGetProcedureAddress instead of LoadLibraryA+GetProcAddress? Okay, sometimes you need to unmap the kernel32 and user32 regions, but I do not see other pitfalls.

hasherezade commented 3 years ago

Can you share with me the code of the injector that you are using? If you don't want to share here, you can send me an e-mail (hasherezade-at-pm.me). I will first try to reproduce the exact scenario that you are trying to execute, and think about the best way of dealing with it. (BTW - I am currently working on something else, so pe_to_shellcode has to wait... )

hasherezade commented 3 years ago

@crowman2 - I made some refactoring that allows injecting the code into processes with DCP enabled. Please check it out and let me know if everything works fine. Soon I will take care of the other issue (injection of x64 shellcode into a WOW64 process).