injdrv

injdrv is a proof-of-concept Windows driver for injecting a DLL into user-mode processes using APCs.

Motivation

Even though APCs are largely undocumented, the technique of using them to inject a DLL into a user-mode process is not new and has been discussed many times. Such an APC can be queued from a regular user-mode process (as seen in Cuckoo) as well as from a kernel-mode driver (as seen in Blackbone).

Despite the technique's popularity, small, easy-to-understand and actually working projects that demonstrate it are hard to find. This project tries to fill that gap.

Features

Compilation

Because the DetoursNT project is attached as a git submodule, which itself carries the Detours git submodule, you must not forget to fetch them:

git clone --recurse-submodules git@github.com:wbenny/injdrv.git

After that, compile this project using Visual Studio 2017. A solution file is included. The only required dependency is the WDK.

Implementation

When the driver is loaded, it'll register two callbacks:

When a new process is created, the driver allocates a small structure that holds information relevant to the process injection, such as:

The start of a new Windows process is followed by mapping ntdll.dll into its address space and then by loading the DLLs from the process's import table. In the case of Wow64 processes on Windows x64, the following libraries are loaded immediately after the native ntdll.dll: wow64.dll, wow64cpu.dll, wow64win.dll and a second (Wow64) ntdll.dll. The driver is notified about the load of each of these DLLs and records this information.

Once these DLLs are loaded, it is safe for the driver to queue the user-mode APC to the process, which will load our DLL into the process.
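The bookkeeping described above can be modelled with a small bitmask. The following is a minimal, portable sketch and not the driver's actual code: the flag names, the INJ_PROCESS_STATE structure and the helper functions are all hypothetical, and a real driver would update this state from its image load notification callback.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical flags, one per system DLL the driver waits for. */
enum {
    INJ_NTDLL_NATIVE = 1u << 0,  /* native ntdll.dll           */
    INJ_WOW64        = 1u << 1,  /* wow64.dll                  */
    INJ_WOW64CPU     = 1u << 2,  /* wow64cpu.dll               */
    INJ_WOW64WIN     = 1u << 3,  /* wow64win.dll               */
    INJ_NTDLL_WOW64  = 1u << 4   /* second (Wow64) ntdll.dll   */
};

/* Hypothetical per-process record kept by the driver. */
typedef struct {
    bool     is_wow64;   /* is this a Wow64 process?            */
    uint32_t loaded;     /* bitmask of system DLLs seen so far  */
    bool     injected;   /* has the APC already been queued?    */
} INJ_PROCESS_STATE;

/* Which DLLs must be loaded before queuing the APC is safe. */
uint32_t inj_required(const INJ_PROCESS_STATE *state)
{
    uint32_t required = INJ_NTDLL_NATIVE;
    if (state->is_wow64) {
        required |= INJ_WOW64 | INJ_WOW64CPU | INJ_WOW64WIN | INJ_NTDLL_WOW64;
    }
    return required;
}

/* Would be called when a load of one of the tracked DLLs is reported. */
void inj_mark_loaded(INJ_PROCESS_STATE *state, uint32_t flag)
{
    state->loaded |= flag;
}

/* True once every required DLL has been seen and nothing was injected yet. */
bool inj_can_inject(const INJ_PROCESS_STATE *state)
{
    uint32_t required = inj_required(state);
    return !state->injected && (state->loaded & required) == required;
}
```

With this model, a native process becomes eligible as soon as INJ_NTDLL_NATIVE is set, while a Wow64 process stays ineligible until all five flags are set.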

Challenges

Although such a project might seem trivial to implement, there are some obstacles you might face along the way. Here I will try to summarize some of them:

"Thunk"-method

This method injects a DLL of the same architecture as the process. It is available on all architectures.

Injection of a DLL requires a small allocation inside the user-mode address space. This allocation holds the path to the DLL to be injected and a small thunk (shellcode), which basically calls LdrLoadDll with the DLL path as a parameter. This memory obviously requires PAGE_EXECUTE_READ protection, but the driver has to fill it somehow, and PAGE_EXECUTE_READWRITE is an unacceptable security risk.

It might be tempting to use ZwAllocateVirtualMemory and ZwProtectVirtualMemory, but unfortunately the second function is exported only since Windows 8.1.

The solution used in this driver is to create a section (ZwCreateSection), map it (ZwMapViewOfSection) with PAGE_READWRITE protection, write the data, unmap it (ZwUnmapViewOfSection) and then map it again with PAGE_EXECUTE_READ protection.

The use of sections raises another problem. Since this driver performs injection from the image load notification callback - which is often called from the NtMapViewOfSection function - we'd be calling MapViewOfSection recursively. This wouldn't be a problem if mapping the section didn't acquire EPROCESS->AddressCreationLock. Because it does, we would end up in a deadlock.

The solution used in this driver is to queue a kernel-mode APC first, from which ZwMapViewOfSection is called. This kernel-mode APC is delivered right before the kernel-to-user-mode transition, so the internal NtMapViewOfSection call is no longer on the call stack (and therefore AddressCreationLock is released).

Injection of our DLL is triggered on the first DLL load that happens after all the important system DLLs (mentioned above) are already loaded.

In the case of native processes, the code flow is as follows:

In the case of Wow64 processes, the code flow is as follows:


NOTE: The load of kernel32.dll was used as an example. In fact, the load of any DLL will trigger the injection. In practice, however, kernel32.dll is loaded into every Windows process, even if:

  • it has no import table
  • it doesn't depend on kernel32.dll
  • it depends only on ntdll.dll (covered by the previous point, I just wanted to make that crystal-clear)
  • it is a console application

Also note that the order of loaded DLLs mentioned above might not reflect the exact order in which the OS loads them.

The only processes that won't be injected by this method are:

Injection of these processes is not in the scope of this project.

NOTE: On Windows 7, Wow64 loads kernel32.dll and user32.dll (both native and Wow64) into the process. Unfortunately, this load is performed during the initialization of Wow64 (by wow64!ProcessInit), therefore on Windows 7 we have to wait until these DLLs are loaded as well before injecting a Wow64 process.

The queued user-mode APC is then force-delivered by calling KeTestAlertThread(UserMode). This call internally checks whether any user-mode APCs are queued and, if so, sets the Thread->ApcState.UserApcPending variable to TRUE. Because of this, the kernel delivers the user-mode APC (via KiDeliverApc) on the next transition from kernel mode to user mode.

If we did not force the delivery of the APC, it would be delivered once the thread entered the alertable state. (There are two alertable states per thread, one for kernel mode and one for user mode; this paragraph is talking about Thread->Alerted[UserMode] == TRUE.) Luckily, this happens when the Windows loader in ntdll.dll finishes its job and gives control to the application - specifically, by calling NtAlertThread in the LdrpInitialize (or _LdrpInitialize) function. So even if we did not force the APC, our DLL would still be injected before the main execution takes place.

NOTE: This means that if we didn't force delivery of the APC ourselves, the APC would be delivered BEFORE main/WinMain is executed, but AFTER all TLS callbacks are executed. This is because TLS callbacks are also executed in the early process initialization stage, within the LdrpInitialize function.

This behavior is configurable in this project via the ForceUserApc variable (TRUE by default).
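To make the difference between a merely queued APC and a pending one more concrete, here is a toy model in portable C. It is only an illustration of the behavior described above: TOY_THREAD and the toy_* functions are hypothetical and do not correspond to any real kernel structure or API.

```c
#include <stdbool.h>

/* Toy stand-in for the per-thread APC state; not a real kernel structure. */
typedef struct {
    int  queued_user_apcs;  /* user-mode APCs waiting in the queue        */
    bool user_apc_pending;  /* models Thread->ApcState.UserApcPending     */
} TOY_THREAD;

/* Models queuing a user-mode APC: it only joins the queue. */
void toy_queue_user_apc(TOY_THREAD *thread)
{
    thread->queued_user_apcs++;
}

/* Models KeTestAlertThread(UserMode) as described above:
 * if anything is queued, mark user APCs as pending. */
void toy_test_alert(TOY_THREAD *thread)
{
    if (thread->queued_user_apcs > 0) {
        thread->user_apc_pending = true;
    }
}

/* Models the kernel-to-user-mode transition.
 * Returns how many user-mode APCs were delivered. */
int toy_return_to_user_mode(TOY_THREAD *thread)
{
    int delivered;

    if (!thread->user_apc_pending) {
        return 0;  /* queued APCs keep waiting for an alertable state */
    }

    delivered = thread->queued_user_apcs;  /* everything queued is delivered */
    thread->queued_user_apcs = 0;
    thread->user_apc_pending = false;
    return delivered;
}
```

In this model, two queued APCs are not delivered by toy_return_to_user_mode until toy_test_alert has been called, and then both are delivered at once.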

NOTE: Some badly written drivers try to inject a DLL into processes by queuing an APC at the wrong time. For example:

  • Queuing an APC that injects a DLL with dependencies other than ntdll.dll right when ntdll.dll is mapped
  • Queuing an APC that injects a DLL depending on kernel32.dll right when kernel32.dll is mapped (but not loaded!)

Such an injection works as long as nobody forces the delivery of user-mode APCs. Because this driver triggers immediate delivery of user-mode APCs (all of them; you can't pick which ones are delivered), it might happen that another driver's APC is triggered as well. If such an APC called, say, LoadLibraryA from kernel32.dll while kernel32.dll was not yet fully loaded (just mapped), the APC would fail. Because this injection happens in the early process initialization stage, the error would be considered critical and the process start would fail. And because basically every process is being injected, the start of every process would fail, which would make the system practically unusable.

The reason why our DLL is not injected immediately from the ntdll.dll image load callback is simple: the image load callback is called when the DLL is mapped into the process, and at this stage the DLL is not yet initialized. The initialization takes place after this callback (in user mode, obviously). If we queued the LdrLoadDll call before ntdll.dll is initialized, the call would fail somewhere inside that function, because some variable it relies on would not be initialized yet.

Injection of Wow64 processes is handled via the PsWrapApcWow64Thread(&NormalContext, &NormalRoutine) call. This function essentially alters the provided arguments in a way (not covered here) that lets KiUserApcDispatcher in the native ntdll.dll recognize such APCs and handle them differently. Internally, such APCs are handled by calling Wow64ApcRoutine (from wow64.dll). This function then emulates queuing of a "32-bit APC" and resumes execution in KiUserApcDispatcher in the Wow64 ntdll.dll.

"Thunkless"-method

This method injects an x64 DLL into both x64 (native) and x86 (Wow64) processes. It is available only on Windows x64.

Injection of an x64 DLL into Wow64 processes is tricky on its own, and SentinelOne wrote an excellent 3-part blog post series on how to achieve that:

In short, if you try to use the same approach as the "thunk"-method to inject an x64 DLL into a Wow64 process, you will run into problems with Control Flow Guard on Windows 10.

The solution outlined in the SentinelOne blog post is to call LdrLoadDll of the x64 ntdll.dll directly from the user APC dispatcher, effectively making NormalRoutine point to the address of LdrLoadDll. The issue here is that PKNORMAL_ROUTINE takes only 3 parameters, while LdrLoadDll takes 4.

typedef
VOID
(NTAPI *PKNORMAL_ROUTINE) (
  _In_ PVOID NormalContext,
  _In_ PVOID SystemArgument1,
  _In_ PVOID SystemArgument2
  );

NTSTATUS
NTAPI
LdrLoadDll (
  _In_opt_ PWSTR SearchPath,
  _In_opt_ PULONG DllCharacteristics,
  _In_ PUNICODE_STRING DllName,
  _Out_ PVOID *BaseAddress
  );

Note that the 4th parameter of LdrLoadDll must point to some valid address, where the BaseAddress will be stored. The devil is always in the details - the solution takes advantage of a "couple of lucky coincidences":

NOTE: Not all function calls from the x86 NTDLL end up in the x64 NTDLL, because some functions are fully implemented in both the x86 and the x64 NTDLL. This applies mainly to functions that do not require any syscall, i.e. Rtl* functions. For example, if you wanted to hook RtlDecompressBuffer in a Wow64 process, hooking that function in the x64 NTDLL would have no effect and the hook would never be called.

NOTE: Because of differences in the APC-dispatching mechanism, this method cannot be used on x86 or ARM64 Windows.

"wow64log.dll reparse"-method

This method injects a native DLL into all processes. It is available on all architectures.

When a Wow64 process is starting, wow64.dll tries to load wow64log.dll. This DLL is never present in a regular Windows installation (it's probably used internally by Microsoft for debugging of the Wow64 subsystem). Therefore, the load of this DLL normally fails. This isn't a problem, though, because no critical functionality of the Wow64 subsystem depends on it. If the load actually succeeds, wow64.dll tries to find the following exported functions in the DLL:

If any of these functions is not exported by the DLL, the DLL is immediately unloaded.

If we drop a custom wow64log.dll (which exports the functions mentioned above) into the %SystemRoot%\System32 directory, it gets loaded into every Wow64 process.

For more details, this method is described in great detail by Walied Assar.

The actual injection of Wow64 processes by injdrv is handled by redirecting the wow64log.dll path to the path of our native DLL. This redirection is done by a filter driver, which registers an IRP_MJ_CREATE pre-callback. When this pre-callback detects that the wow64log.dll file is being opened, it replaces the path in the FILE_OBJECT using the IoReplaceFileObjectName function and returns STATUS_REPARSE in the IO_STATUS_BLOCK. The code of the filter driver is entirely based on the SimRep example found in Microsoft's WDK examples.
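The detection step in the pre-callback boils down to a case-insensitive check of the last path component. Below is a minimal, portable sketch of such a check. It is not the code used by injdrv: both helper functions are hypothetical, they work on plain NUL-terminated wide strings instead of UNICODE_STRING, and a real filter driver would use the kernel's own string routines instead of the C library.

```c
#include <stdbool.h>
#include <stddef.h>
#include <wchar.h>
#include <wctype.h>

/* Hypothetical helper: does `path` end with `suffix`, ignoring case? */
bool path_ends_with_icase(const wchar_t *path, const wchar_t *suffix)
{
    size_t path_len   = wcslen(path);
    size_t suffix_len = wcslen(suffix);
    const wchar_t *tail;
    size_t i;

    if (suffix_len > path_len) {
        return false;
    }

    /* Compare the last suffix_len characters of the path. */
    tail = path + (path_len - suffix_len);
    for (i = 0; i < suffix_len; i++) {
        if (towlower((wint_t)tail[i]) != towlower((wint_t)suffix[i])) {
            return false;
        }
    }
    return true;
}

/* Hypothetical helper: is the file being opened wow64log.dll?
 * The leading backslash makes sure the whole file name matches,
 * so "mywow64log.dll" is not redirected by mistake. */
bool is_wow64log_open(const wchar_t *path)
{
    return path_ends_with_icase(path, L"\\wow64log.dll");
}
```

Only when such a check succeeds would the pre-callback go on to replace the file name and return STATUS_REPARSE; every other open request is passed through untouched.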

NOTE: Because native processes do not load wow64.dll, injdrv injects them using the "thunk"-method when the "wow64log.dll reparse"-method is selected.

NOTE: Because wow64.dll itself is compiled for the native architecture, wow64log.dll must also be native.

Protected processes

Injection of protected processes is simply skipped, as it triggers code-integrity errors. Such processes are detected by the PsIsProtectedProcess function. If you're curious about a workaround for this issue (temporarily unprotecting these processes), you can peek into the Blackbone source code. Keep in mind that unprotecting protected processes requires manipulating undocumented structures, which change dramatically between Windows versions.

ETW logging

Finally, as mentioned in the beginning, the injected DLL logs calls of hooked functions with ETW. Because functions such as EventRegister, EventWriteString, ... are located in advapi32.dll, we can't use them from our DLL, which depends only on NTDLL. Luckily, ETW support is built into ntdll.dll too. In fact, most of the Event* functions in advapi32.dll are simply redirected to the EtwEvent* functions in ntdll.dll without any change to the arguments! Therefore, we can simply mock the Event* functions and just include the <evntprov.h> header:

//
// Include support for ETW logging.
// Note that following functions are mocked, because they're
// located in advapi32.dll.  Fortunately, advapi32.dll simply
// redirects calls to these functions to the ntdll.dll.
//

#define EventActivityIdControl  EtwEventActivityIdControl
#define EventEnabled            EtwEventEnabled
#define EventProviderEnabled    EtwEventProviderEnabled
#define EventRegister           EtwEventRegister
#define EventSetInformation     EtwEventSetInformation
#define EventUnregister         EtwEventUnregister
#define EventWrite              EtwEventWrite
#define EventWriteEndScenario   EtwEventWriteEndScenario
#define EventWriteEx            EtwEventWriteEx
#define EventWriteStartScenario EtwEventWriteStartScenario
#define EventWriteString        EtwEventWriteString
#define EventWriteTransfer      EtwEventWriteTransfer

#include <evntprov.h>

...easy, wasn't it?

Usage

The following example was performed on Windows 10 x64.

Enable Test-Signing boot configuration option (note that you'll need administrative privileges to use bcdedit) and reboot the machine:

bcdedit /set testsigning on
shutdown /r /t 0

Now open an administrator command line and run the following command:

injldr -i

The -i option installs the driver. After the driver is installed, it waits for newly created processes. When a new process is created, it is hooked. Prepare some x86 application, for example PuTTY, and run it. With Process Explorer we can verify that our x64 DLL is indeed injected into this x86 application.

Also, immediately after injldr is started, it starts an ETW tracing session and prints out information about called hooked functions:

You can exit injldr by pressing Ctrl+C. Now you can run injldr without any parameters to just start the tracing session. If you wish to uninstall the driver, run injldr -u.

By default, this driver uses the following injection methods:

  • InjMethodThunk on Windows x86
  • InjMethodThunkless on Windows x64
  • InjMethodWow64LogReparse on Windows ARM64

Therefore, it always tries to inject a native DLL into all processes, including Wow64 processes. If you wish to change this behavior and e.g. inject an x86 DLL into an x86 Wow64 process, set the injection method to InjMethodThunk. Also, do not forget to compile injdll for the corresponding architectures and place it in the same directory as injldr.exe.
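The default selection listed above can be summarized in a few lines of C. This is only a sketch: the three InjMethod* names come from the list above, while the INJ_METHOD and TARGET_ARCH type names, the Arch* constants and the inj_default_method function are hypothetical and used purely for illustration.

```c
/* Injection methods named in this README; the enum type name is assumed. */
typedef enum {
    InjMethodThunk,
    InjMethodThunkless,
    InjMethodWow64LogReparse
} INJ_METHOD;

/* Hypothetical description of the architecture Windows runs on. */
typedef enum {
    ArchX86,
    ArchX64,
    ArchArm64
} TARGET_ARCH;

/* Returns the default injection method for the given Windows architecture. */
INJ_METHOD inj_default_method(TARGET_ARCH arch)
{
    switch (arch) {
    case ArchX64:
        return InjMethodThunkless;        /* x64 DLL into native and Wow64 */
    case ArchArm64:
        return InjMethodWow64LogReparse;  /* native DLL via wow64log.dll   */
    case ArchX86:
    default:
        return InjMethodThunk;            /* DLL of the process's own arch */
    }
}
```

Overriding the default, for example to inject an x86 DLL into Wow64 processes on Windows x64, then amounts to picking InjMethodThunk instead of the value returned here.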

License

This software is open-source under the MIT license. See the LICENSE.txt file in this repository.

Dependencies are covered by their own licenses.

If you find this project interesting, you can buy me a coffee

  BTC 3GwZMNGvLCZMi7mjL8K6iyj6qGbhkVMNMF
  LTC MQn5YC7bZd4KSsaj8snSg4TetmdKDkeCYk