NationalSecurityAgency / ghidra

Ghidra is a software reverse engineering (SRE) framework
https://www.nsa.gov/ghidra
Apache License 2.0

System32's dbgeng.dll is always used for debugging. #3019

Open jpoiret opened 3 years ago

jpoiret commented 3 years ago

Describe the bug

Although I can connect to a WinDbg session running on this computer from another WinDbg client, Ghidra cannot do the same: the GADP window flashes for a second and disappears. Process Monitor shows that C:\Windows\System32\dbgeng.dll is always the one that gets loaded, even if I put my installed WinDbg DLLs in the root directory of Ghidra or in the support directory.

To Reproduce

Steps to reproduce the behavior:

  1. Use WinDbg from Windows 10 SDK (10.0.19041.0) to launch a debugging session, use .server tcp:port=5000 to start the server.
  2. In Ghidra's Connect window, select MS dbgeng.dll (WinDbg) local agent via GADP/TCP, and add tcp:server=localhost,port=5000 to DebugConnect options.
  3. GADP window flashes and disappears.

Expected behavior

The local agent connects successfully to the WinDbg session.

Environment (please complete the following information):

Additional context

I'm using this method to set the .createdir of the debuggee, as the session inside Ghidra seems to crash whenever I .kill and .create the process again.

d-millar commented 3 years ago

@jpoiret I think the problem you're having is that Java is loading its copy of dbgeng.dll and creating compatibility issues. Try backing up that copy as dbgeng.saved and dropping the more current version of dbgeng.dll there before starting up Ghidra. I'm not sure which version of Java you're using but, for my install, this means dropping dbgeng.dll in C:\Program Files\Amazon Corretto\jdk_11.0.10_9\bin. I usually put copies of the other DLLs from the SDK in there as well (dbgeng, dbgmodel, dbgcore, dbghelp) along with the support directories (winext, winxp, 1394, usb, ttd...).

Let us know if this doesn't fix the problem. There are known issues for connecting using dbgmodel over GADP to a .server instance, but dbgeng "should" work.

jpoiret commented 3 years ago

So I've pinpointed the problem: since the GADP server is launched outside of the main ghidra VM, the VM arguments passed to ghidra aren't inherited by the server, especially the -Djna.library.path variable.

One fix I've found is to find out what command is used to launch the server through the task manager, add -Djna.library.path=[your-windbg-folder] and start it manually. Maybe it would be a good idea to add a parameter in the connector menu to add this easily, but I haven't found out how to do so yet, as the codebase is already pretty big and I am rusty with Java.

d-millar commented 3 years ago

@jpoiret Apologies for not getting back to you today. I think your solution may be more on point than mine. I need to test a few things out, but we are definitely interested in making the relevant fixes. They may solve an ongoing problem we've had, so pretty excited about your suggestion!

D

d-millar commented 3 years ago

I realize this is a VERY delayed response, but, after more testing, I haven't been able to get either the IN-VM or GADP versions of either the dbgeng or dbgmodel agents to pick up values for DLLs specified by either -Djna.library.path or -Djava.library.path. It appears the JDK searches the current path, System32, various other Windows locations, and then environment variables to locate anything loaded via System.loadLibrary. As a result, it's always going to choose System32 for a Windows 10 install or WinSxS for Server 2019, unless you place DLLs in the "current path". For the JDK, this is the bin directory in the JDK home. Assuming I haven't made errors in testing, this implies that changes to the load arguments for the agent won't help us out.
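The first-match behavior described in this comment can be modeled in a few lines. This is only a sketch of the idea, not the JDK's or Windows' actual loader logic: `first_match` is a hypothetical helper, and the directories are temporary stand-ins for the JDK bin directory, System32, and a WinDbg install.

```python
import tempfile
from pathlib import Path
from typing import Iterable, Optional


def first_match(dll_name: str, search_dirs: Iterable[Path]) -> Optional[Path]:
    """Return the first existing dll_name found while walking search_dirs in order."""
    for directory in search_dirs:
        candidate = Path(directory) / dll_name
        if candidate.is_file():
            return candidate
    return None


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as root:
        # Stand-in directories (assumptions for the demo, not real locations).
        jdk_bin = Path(root, "jdk", "bin")
        system32 = Path(root, "System32")
        windbg = Path(root, "Debuggers", "x64")
        for directory in (jdk_bin, system32, windbg):
            directory.mkdir(parents=True)
        (system32 / "dbgeng.dll").touch()
        (windbg / "dbgeng.dll").touch()

        order = [jdk_bin, system32, windbg]
        # The System32 stand-in wins because it comes before the WinDbg one.
        print(first_match("dbgeng.dll", order))

        # Dropping a copy into the JDK bin stand-in changes the winner.
        (jdk_bin / "dbgeng.dll").touch()
        print(first_match("dbgeng.dll", order))
```

Under this model, a directory that is searched later never gets a chance once an earlier directory has a file with the same name, which is why co-locating the DLLs with the JDK works as a workaround.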

jpoiret commented 3 years ago

Sorry to answer you this late: I have been able to use jna.library.path to load the dbgeng.dll located in C:\Program Files (x86)\Windows Kits\10\Debuggers\x64 without moving it anywhere, and I checked that this copy was the one being loaded, using either Process Monitor or -Djna.debug_load=true, which logs the loaded DLLs to the console.

My command line is "C:\Program Files\AdoptOpenJDK\jdk-11.0.11.9-hotspot\bin\java.exe" -cp C:\all\the\classpaths -Djna.debug_load=true -Djna.library.path="C:\Program Files (x86)\Windows Kits\10\Debuggers\x64" agent.dbgeng.gadp.DbgEngGadpServer -H localhost -p 0 -r tcp:port=5000,server=localhost where I got the classpaths by looking at the command line of the debuggers (not in vm) launched by ghidra using process explorer.

Both -Djna.library.path and -Djna.debug_load must be passed directly to the java invocation for the GADP server, not to ghidra since ghidra won't pass them, and so you must launch it manually as I did above.
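A small sketch of the manual step described above: take the command line recovered from Process Explorer and insert the two JVM options before launching it. `add_jvm_options` is a hypothetical helper written for this note; the argument values are the ones quoted in this comment, and the classpath is the same placeholder.

```python
from typing import List, Sequence


def add_jvm_options(command: Sequence[str], options: Sequence[str]) -> List[str]:
    """Insert JVM options directly after the java executable.

    Options such as -Dname=value only take effect when they appear before
    the main class name, so they are placed right after command[0].
    """
    if not command:
        raise ValueError("empty command line")
    return [command[0], *options, *command[1:]]


if __name__ == "__main__":
    windbg_dir = r"C:\Program Files (x86)\Windows Kits\10\Debuggers\x64"
    recovered = [
        r"C:\Program Files\AdoptOpenJDK\jdk-11.0.11.9-hotspot\bin\java.exe",
        "-cp", r"C:\all\the\classpaths",  # placeholder, as in the comment above
        "agent.dbgeng.gadp.DbgEngGadpServer",
        "-H", "localhost", "-p", "0",
        "-r", "tcp:port=5000,server=localhost",
    ]
    patched = add_jvm_options(
        recovered,
        ["-Djna.debug_load=true", f"-Djna.library.path={windbg_dir}"],
    )
    for argument in patched:
        print(argument)
    # On Windows, subprocess.run(patched) would start the agent. Passing a
    # list avoids quoting problems with the spaces in the paths.
```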

d-millar commented 3 years ago

@jpoiret Very interesting - thanks for the follow-up. Unfortunately, it appears your last sentence is key: "You must launch it manually as I did." We would really really like to figure out how to (a) launch the agent from within Ghidra with arguments that would override the path, and (b) launch the IN-VM version with similar arguments. Neither of those appears to be possible. We should definitely include a script to launch the agent to spare future users from the work you had to do to recover the command line, but that seems like a barely sufficient solution to the problem.

andrea-calligaris commented 2 years ago

There's something wrong with the working directory in general, because the .exe program I was trying to debug was unable to open a text file in its own folder, as if its working directory were wrong.

d-millar commented 2 years ago

@andrea-calligaris Generally speaking, the parent program will dictate the working directory, so the working directory for Ghidra will be the directory from which Ghidra was launched (the install dir or install/dir/Ghidra/Framework/Utility if you are running out of Eclipse). Anything running from the debugger will inherit that directory unless the program under test does something to cache its last working directory. Theoretically, we could set the working directory for the program being debugged, but I'm not sure that's universally desirable. In particular, I have a feeling (although I haven't verified this) that that will also modify Ghidra's concept of the current working directory. Also, not exactly clear what happens if you override the working directory for a program that does cache its previous location. Are you seeing something different than what I've described?
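For reference, the general operating-system behavior (a child inherits the parent's working directory unless one is given at launch) can be checked with a short script. This sketch only covers ordinary child processes started with Python's `subprocess`; it says nothing about what dbgeng does when it creates a debuggee inside the Ghidra JVM, which is the unverified part above.

```python
import os
import subprocess
import sys
import tempfile
from typing import Optional

# The child simply reports the working directory it was started in.
CHILD_CODE = "import os; print(os.getcwd())"


def child_cwd(cwd: Optional[str] = None) -> str:
    """Run a child Python process and return the working directory it reports."""
    result = subprocess.run(
        [sys.executable, "-c", CHILD_CODE],
        cwd=cwd,                 # None means "inherit from the parent"
        capture_output=True,
        text=True,
        check=True,
    )
    return result.stdout.strip()


if __name__ == "__main__":
    print("parent:           ", os.getcwd())
    print("child (inherited):", child_cwd())
    with tempfile.TemporaryDirectory() as target_dir:
        print("child (explicit): ", child_cwd(target_dir))
    # Setting cwd for the child does not change the parent's own directory.
    print("parent afterwards:", os.getcwd())
```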

MrROBUST commented 1 year ago

I think the root of the problem is that something is changing the environment during the agent's startup.

First, we need to make sure that the path to the WinDbg bin folder is the very first entry in the system PATH variable. Otherwise, if system32 comes first, dbgeng.dll from system32 takes precedence when the path is resolved. (screenshot: windbg-path1)

When we run ghidraRun.bat everything is fine, we have the correct PATH: (screenshot: windbg-path2)

But as soon as we start the debugger agent (I used IN-VM), system32 appears at the beginning of the PATH variable, which changes the resolution of the library path, so it always loads the debugger library preinstalled in Windows. (screenshot: windbg-path3)

@d-millar can this forced path change be safely removed in future releases of ghidra? I think this will solve all such problems: https://github.com/NationalSecurityAgency/ghidra/issues/3479 https://github.com/NationalSecurityAgency/ghidra/issues/4732

d-millar commented 1 year ago

@MrROBUST I have to say my understanding of what's actually going on here is, I believe, incomplete, and I have yet to find a way to conclusively verify my current theory. That said, I do not believe the PATH variable is used directly in any sense. Elements that are in play AFAICT are the Ghidra bin & bin/test directories, the JDK bin directory, the system directories, the value of java.library.path, and most importantly the value of jna.library.path. You can gain some insight into this by enabling logging using -Djna.debug_load=true and -Djna.debug_load.jna=true.

The exact order of elements considered may also depend on whether you're running Ghidra out of the distribution or from a development environment, such as Eclipse. It is also pretty clear that running the GADP versions of the debugger may result in the debugger agent and the main GUI running in different environments, which is very undesirable. (@nsadeveloper789 may have fixed this with the latest release.)

I am pretty sure the JDK bin directory takes precedence over the system32 directory, which led to my initial recommendations for folks trying to resolve issues. Lately, I have been more or less convinced that setting the jna.library.path may override everything and really may be the "correct" solution. Various documentation suggests the java.library.path is the default fallback, but I am not convinced those values have precedence over the JDK and system32 values.

In any case, this isn't really something we can fix in the release, as, per your comment, you really need the path to the latest WinDbg installation in (my guess) the first position of the value specified by -Djna.library.path, and, short of querying the user (and then setting it and forcing a restart), there isn't a trivial way to identify this.
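If the guess above is right and the WinDbg directory needs to come first in the value of -Djna.library.path, the value could be assembled along these lines. This is a sketch under that assumption: `prepend_library_dir` is a hypothetical helper, `C:\tools\native` is a made-up existing entry, and `;` is used as the list separator because that is the path separator on Windows.

```python
from typing import List


def prepend_library_dir(current: str, new_dir: str, sep: str = ";") -> str:
    """Put new_dir first in a separator-delimited path list, dropping duplicates.

    Entries are compared case-insensitively and without a trailing slash,
    which is how Windows treats directory names.
    """
    def key(path: str) -> str:
        return path.rstrip("\\/").lower()

    entries: List[str] = [new_dir]
    seen = {key(new_dir)}
    for entry in current.split(sep):
        if entry and key(entry) not in seen:
            seen.add(key(entry))
            entries.append(entry)
    return sep.join(entries)


if __name__ == "__main__":
    windbg_dir = r"C:\Program Files (x86)\Windows Kits\10\Debuggers\x64"
    # Hypothetical existing value that already lists the WinDbg directory last.
    existing = r"C:\tools\native;" + windbg_dir.lower() + "\\"
    value = prepend_library_dir(existing, windbg_dir)
    print(f'-Djna.library.path="{value}"')
```

The user still has to supply the WinDbg directory, which is the part that cannot be detected trivially.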

MrROBUST commented 1 year ago

@d-millar Thank you for the answer.

My assumption that the debugger path in PATH would load the libraries was incorrect. This is because, according to the standard search order, it has the lowest search priority (12) and the system folder comes first (8). Also, it turned out that it wasn't ghidra that actually changed the PATH of the process. This happens after the first call to DebugCreate from native dbgeng.dll

In any case, the key point is JNA, which ghidra uses to access native functions from dynamic libs. The -Djna.library.path="..." flag causes JNA to directly look for DLL files in the specified directories, use the full path to the found file, and directly load the required library. Unfortunately, the debugger (IN-VM) fails to start with the message "Non-respawnable executor terminated unexpectedly". From Eclipse I see that the call to 'return Native.load("dbgeng.dll", DbgEngNative.class);' from DbgEngNative.java throws 'java.lang.UnsatisfiedLinkError: The specified procedure could not be found.' exception.

Full output (except huge lists after 'not found in resource path'):

```
Jun 22, 2023 11:16:10 AM com.sun.jna.NativeLibrary loadLibrary
INFO: Looking for library 'kernel32'
Jun 22, 2023 11:16:10 AM com.sun.jna.NativeLibrary loadLibrary
INFO: Adding paths from jna.library.path: C:\Program Files (x86)\Windows Kits\10\Debuggers\x64
Jun 22, 2023 11:16:10 AM com.sun.jna.NativeLibrary loadLibrary
INFO: Trying kernel32.dll
Jun 22, 2023 11:16:10 AM com.sun.jna.NativeLibrary loadLibrary
INFO: Found library 'kernel32' at kernel32.dll
Jun 22, 2023 11:16:23 AM com.sun.jna.NativeLibrary loadLibrary
INFO: Looking for library 'dbgeng.dll'
Jun 22, 2023 11:16:23 AM com.sun.jna.NativeLibrary loadLibrary
INFO: Adding paths from jna.library.path: C:\Program Files (x86)\Windows Kits\10\Debuggers\x64
Jun 22, 2023 11:16:23 AM com.sun.jna.NativeLibrary loadLibrary
INFO: Trying C:\Program Files (x86)\Windows Kits\10\Debuggers\x64\dbgeng.dll
Jun 22, 2023 11:16:27 AM com.sun.jna.NativeLibrary loadLibrary
INFO: Loading failed with message: The specified procedure could not be found.
Jun 22, 2023 11:16:27 AM com.sun.jna.NativeLibrary loadLibrary
INFO: Adding system paths: []
Jun 22, 2023 11:16:27 AM com.sun.jna.NativeLibrary loadLibrary
INFO: Trying C:\Program Files (x86)\Windows Kits\10\Debuggers\x64\dbgeng.dll
Jun 22, 2023 11:16:31 AM com.sun.jna.NativeLibrary loadLibrary
INFO: Loading failed with message: The specified procedure could not be found.
Jun 22, 2023 11:16:31 AM com.sun.jna.NativeLibrary loadLibrary
INFO: Looking for lib- prefix
Jun 22, 2023 11:16:31 AM com.sun.jna.NativeLibrary loadLibrary
INFO: Trying libdbgeng.dll
Jun 22, 2023 11:16:32 AM com.sun.jna.NativeLibrary loadLibrary
INFO: Loading failed with message: The specified module could not be found.
Jun 22, 2023 11:16:32 AM com.sun.jna.Native extractFromResourcePath
INFO: Looking in classpath from ghidra.GhidraClassLoader@5479e3f for dbgeng.dll
Jun 22, 2023 11:16:32 AM com.sun.jna.NativeLibrary loadLibrary
INFO: Loading failed with message: Native library (win32-x86-64/dbgeng.dll) not found in resource path ([file:/C:/eclipse/eclipse/configuration/org.eclipse.osgi/248/0/.cp/lib/javaagent-shaded.jar, ...
ERROR Non-respawnable executor terminated unexpectedly
java.lang.UnsatisfiedLinkError: Unable to load library 'dbgeng.dll':
The specified procedure could not be found.
The specified procedure could not be found.
The specified module could not be found.
Native library (win32-x86-64/dbgeng.dll) not found in resource path ([file:/C:/eclipse/eclipse/configuration/org.eclipse.osgi/248/0/.cp/lib/javaagent-shaded.jar, ...
    at com.sun.jna.NativeLibrary.loadLibrary(NativeLibrary.java:302)
    at com.sun.jna.NativeLibrary.getInstance(NativeLibrary.java:455)
    at com.sun.jna.Library$Handler.<init>(Library.java:192)
    at com.sun.jna.Native.load(Native.java:596)
    at com.sun.jna.Native.load(Native.java:570)
    at agent.dbgeng.jna.dbgeng.DbgEngNative.loadLibs(DbgEngNative.java:44)
    at agent.dbgeng.jna.dbgeng.DbgEngNative.<clinit>(DbgEngNative.java:35)
    at agent.dbgeng.dbgeng.DbgEng.debugCreate(DbgEng.java:147)
    at agent.dbgeng.manager.impl.DbgManagerImpl.lambda$2(DbgManagerImpl.java:473)
    at agent.dbgeng.gadp.impl.DbgEngClientThreadExecutor.init(DbgEngClientThreadExecutor.java:50)
    at agent.dbgeng.gadp.impl.AbstractClientThreadExecutor.run(AbstractClientThreadExecutor.java:115)
    at java.base/java.lang.Thread.run(Thread.java:833)
    Suppressed: java.lang.UnsatisfiedLinkError: The specified procedure could not be found.
        at com.sun.jna.Native.open(Native Method)
        at com.sun.jna.NativeLibrary.loadLibrary(NativeLibrary.java:191)
        ... 11 more
    Suppressed: java.lang.UnsatisfiedLinkError: The specified procedure could not be found.
        at com.sun.jna.Native.open(Native Method)
        at com.sun.jna.NativeLibrary.loadLibrary(NativeLibrary.java:204)
        ... 11 more
    Suppressed: java.lang.UnsatisfiedLinkError: The specified module could not be found.
        at com.sun.jna.Native.open(Native Method)
        at com.sun.jna.NativeLibrary.loadLibrary(NativeLibrary.java:265)
        ... 11 more
    Suppressed: java.io.IOException: Native library (win32-x86-64/dbgeng.dll) not found in resource path ([file:/C:/eclipse/eclipse/configuration/org.eclipse.osgi/248/0/.cp/lib/javaagent-shaded.jar, ...
        at com.sun.jna.Native.extractFromResourcePath(Native.java:1095)
        at com.sun.jna.NativeLibrary.loadLibrary(NativeLibrary.java:276)
        ... 11 more
(AbstractClientThreadExecutor.java:154)
```
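Output like this is easier to read once it is reduced to "which path was tried and what happened". A minimal sketch, assuming the log has been saved with one message per line the way java.util.logging normally prints it (a timestamp line followed by an `INFO:` line). `summarize_jna_log` is a hypothetical helper, and it only understands the message prefixes that appear in the output above.

```python
import re
from typing import Dict, List, Optional, Tuple

# (path tried, failure message), with None as the message when the load succeeded.
Attempt = Tuple[str, Optional[str]]

FAILED = "Loading failed with message: "


def summarize_jna_log(log_text: str) -> Dict[str, List[Attempt]]:
    """Group 'Trying ...' lines by library and pair each with its outcome."""
    attempts: Dict[str, List[Attempt]] = {}
    library: Optional[str] = None
    pending: Optional[str] = None
    for raw_line in log_text.splitlines():
        line = raw_line.strip()
        if not line.startswith("INFO: "):
            continue  # timestamps, stack traces, and other levels are skipped
        message = line[len("INFO: "):]
        started = re.match(r"Looking for library '(.+)'", message)
        if started:
            library = started.group(1)
            attempts.setdefault(library, [])
            pending = None
        elif library is None:
            continue
        elif message.startswith("Trying "):
            pending = message[len("Trying "):]
        elif message.startswith("Found library ") and pending is not None:
            attempts[library].append((pending, None))
            pending = None
        elif message.startswith(FAILED) and pending is not None:
            attempts[library].append((pending, message[len(FAILED):]))
            pending = None
    return attempts


if __name__ == "__main__":
    sample = r"""
Jun 22, 2023 11:16:23 AM com.sun.jna.NativeLibrary loadLibrary
INFO: Looking for library 'dbgeng.dll'
Jun 22, 2023 11:16:23 AM com.sun.jna.NativeLibrary loadLibrary
INFO: Trying C:\Program Files (x86)\Windows Kits\10\Debuggers\x64\dbgeng.dll
Jun 22, 2023 11:16:27 AM com.sun.jna.NativeLibrary loadLibrary
INFO: Loading failed with message: The specified procedure could not be found.
"""
    for name, tried in summarize_jna_log(sample).items():
        for path, error in tried:
            print(name, "|", path, "|", error or "loaded")
```

In the output above, the interesting detail is that the file in the WinDbg directory was found and tried, so the failure comes from loading it, not from locating it.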

The exception doesn't seem to contain any useful information, so I added the line Kernel32.INSTANCE.SetErrorMode(0); before loading the library and got this message box: (screenshot: dbgeng)

It turns out that dbgeng.dll depends on dbghelp.dll from the same directory, but for some reason dbghelp.dll from System32 is already loaded. That copy is used because the standard search order checks the loaded-module list (4). If I manually unload this dbghelp.dll before starting the debugger, everything starts working: (screenshot: dbgeng2)
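The effect can be illustrated with a toy model in which a module that is already loaded under the same name wins before any directory is searched. This is a deliberate simplification for illustration, not the actual Windows loader: `resolve_module` is a hypothetical function, and the set of files "on disk" is passed in rather than read from the file system.

```python
from pathlib import PureWindowsPath
from typing import Iterable, Mapping, Optional, Set


def resolve_module(
    name: str,
    loaded_modules: Mapping[str, str],
    search_dirs: Iterable[str],
    files_on_disk: Set[str],
) -> Optional[str]:
    """Toy resolution: the loaded-module list first, then directories in order.

    loaded_modules maps lower-cased module names to the path they were loaded
    from; files_on_disk holds lower-cased full paths that exist.
    """
    key = name.lower()
    if key in loaded_modules:
        return loaded_modules[key]
    for directory in search_dirs:
        candidate = str(PureWindowsPath(directory) / name)
        if candidate.lower() in files_on_disk:
            return candidate
    return None


if __name__ == "__main__":
    windbg = r"C:\Program Files (x86)\Windows Kits\10\Debuggers\x64"
    system32 = r"C:\Windows\System32"
    disk = {
        (windbg + r"\dbghelp.dll").lower(),
        (system32 + r"\dbghelp.dll").lower(),
    }
    dirs = [windbg, system32]

    # dbghelp.dll was already loaded from System32: that copy is reused.
    preloaded = {"dbghelp.dll": system32 + r"\dbghelp.dll"}
    print(resolve_module("dbghelp.dll", preloaded, dirs, disk))

    # After unloading it, the copy next to dbgeng.dll is found instead.
    print(resolve_module("dbghelp.dll", {}, dirs, disk))
```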

In the case of GADP, it seems that its child process is started without the jna.library.path from the main ghidra process. Therefore, libraries are loaded from System32.

Another way, recommended for the same issue, is to copy the files from the debugger distribution to the bin directory of the JVM. It looks like a completely failsafe way, but I don't like the idea of such an invasive modification :sweat_smile:

TheWhit3F0x commented 9 months ago

Is this issue still alive? The idea of modifying the JVM bin directory seems a little invasive, as @MrROBUST said. Anyway, now that WinDbg is no longer part of the SDK (you can still download the debugging tools, but the one included is an old version), you'll need to download it from Microsoft's website. As it is a Microsoft Store app now, you'll need to look for the DLLs and folders inside the installation folder, usually C:\Program Files\WindowsApps\{SomeLongNameWithWindbgInIt}\{arch}, and put them inside your Java installation's bin directory.
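Locating that folder can be scripted. The sketch below walks a root directory and reports folders that contain dbgeng.dll and have "windbg" somewhere in their path below the root. `find_debugger_dirs` is a hypothetical helper, the package name in the demo is made up, and on a real machine reading C:\Program Files\WindowsApps may require elevated permissions (directories that cannot be listed are silently skipped by `os.walk`).

```python
import os
import tempfile
from pathlib import Path
from typing import List


def find_debugger_dirs(root: Path, dll_name: str = "dbgeng.dll") -> List[Path]:
    """Return directories under root that hold dll_name and mention 'windbg'."""
    matches: List[Path] = []
    wanted = dll_name.lower()
    for dirpath, _dirnames, filenames in os.walk(root):
        relative = os.path.relpath(dirpath, root).lower()
        if "windbg" not in relative:
            continue
        if any(filename.lower() == wanted for filename in filenames):
            matches.append(Path(dirpath))
    return sorted(matches)


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        root = Path(tmp)
        # Made-up package folder names that only mimic the layout described above.
        store_app = root / "Vendor.WinDbg_1.0_demo" / "amd64"
        unrelated = root / "SomeOtherApp" / "amd64"
        for directory in (store_app, unrelated):
            directory.mkdir(parents=True)
            (directory / "dbgeng.dll").touch()
        for found in find_debugger_dirs(root):
            print(found)  # only the folder under the WinDbg-like package
```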

Now that WinDbg is installed as a WindowsApp, maybe it could be used as a default path for Ghidra to search for the DLLs? Or at least just query the user for the directory of WinDBG's installation? It would be a better solution than dropping files in your java's installation, IMO.

d-millar commented 9 months ago

@TheWhit3F0x Sadly, still an issue…. As noted above, my understanding of this is still evolving, but will do my best to describe what I believe the issues involved are.

First, agreed regarding the undesirability of the current solution. We are working on a better Python-based solution, but it is not quite ready for consumption and its initial implementation will not include the dbgmodel-based debugger.

As I understand it, you are suggesting including code in the dbgmodel initialization module that would search for the requisite DLLs based on the default installation paths. I’m not a fan of that solution for a couple of reasons. I certainly would prefer not to hard-code paths in the Java source, as this would require a user to recompile from source if their environment didn’t match the default. You might argue this would be uncommon, but, for example, no default path would exist for any version of Windows Server, as the Microsoft Store is unavailable there.

The alternative would be to locate information about the installation external to the source, but then the user has to be aware of those files and the correct format for new entries. Possible, but not ideal either.

My main objection, however, is that I don’t think it solves the problem. Specifically, I think the real problem is that Java is pre-loading dbgeng/dbghelp before the debugger is loaded. This is certainly true if you are running out of Eclipse. Am traveling right now sans computer, but I should try running the default distribution and checking to see at what point dbgeng is loaded.

Assuming this is correct, we would have to unload the existing module, which is unlikely to be the one we want, search for the correct one, and reload it. I’m not sure that’s even possible in Java - probably is, but I don’t recall seeing an API call for that.

Obviously, the reason the current hack works is it guarantees the initial load by Java uses the version we want loaded.

TheWhit3F0x commented 9 months ago

@d-millar Thank you for that quick answer. It is nice to know that this issue is still being worked on. It definitely seems like Java is doing some unusual stuff there.

d-millar commented 9 months ago

OK, ran some tests to confirm my previous assumptions, which were not quite correct but close. Any Java process running on Windows pre-loads dbghelp.dll and dbgcore.dll. At a guess, this enables debugging by providing symbols and the ability to walk the stack. So, while we can probably control the loading of dbgeng/dbgmodel via the jna.library.path variable passed at startup to the JVM, we will then be running with inconsistent sets of API-MS-WIN-CORE* libraries. Am pretty sure this is not a good idea. Not obvious to me yet how Java is loading those libraries, so not obvious yet if there is an override (other than the less-than-great co-location trick).
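One way to spot the mixed state described here is to take the list of module paths loaded in the Java process (for example, copied out of Process Explorer or Process Monitor) and group the debugger DLLs by the directory they came from. A minimal sketch; `group_by_directory` and `is_mixed` are hypothetical helpers, and the DLL names are the four mentioned earlier in this thread.

```python
from pathlib import PureWindowsPath
from typing import Dict, Iterable, List

DEBUGGER_DLLS = ("dbgeng.dll", "dbgmodel.dll", "dbghelp.dll", "dbgcore.dll")


def group_by_directory(loaded_paths: Iterable[str]) -> Dict[str, List[str]]:
    """Map each directory (lower-cased) to the debugger DLLs loaded from it."""
    groups: Dict[str, List[str]] = {}
    for raw in loaded_paths:
        path = PureWindowsPath(raw)
        name = path.name.lower()
        if name in DEBUGGER_DLLS:
            groups.setdefault(str(path.parent).lower(), []).append(name)
    return groups


def is_mixed(loaded_paths: Iterable[str]) -> bool:
    """True when debugger DLLs were loaded from more than one directory."""
    return len(group_by_directory(loaded_paths)) > 1


if __name__ == "__main__":
    windbg = r"C:\Program Files (x86)\Windows Kits\10\Debuggers\x64"
    modules = [
        r"C:\Windows\System32\kernel32.dll",
        r"C:\Windows\System32\dbghelp.dll",   # pre-loaded by the JVM
        r"C:\Windows\System32\dbgcore.dll",   # pre-loaded by the JVM
        windbg + r"\dbgeng.dll",              # loaded via jna.library.path
    ]
    for directory, names in group_by_directory(modules).items():
        print(directory, "->", ", ".join(names))
    print("mixed:", is_mixed(modules))
```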

MrROBUST commented 9 months ago

Are there any plans to implement DAP in Ghidra? Maybe it makes more sense to use some native external DAP clients rather than trying to do this in Java?

d-millar commented 9 months ago

@MrROBUST We looked at DAP a while back, particularly in connection with JDWP/Java debugging support. At the time, the technology wasn't very mature, and it seemed like a pretty heavy lift and a fairly severe performance penalty vice direct JDWP access. That said, that was then, this is now, might be worth revisiting. @nsadeveloper789 has already raised the issue of a javascript implementation, but I think we'd like to complete the Python solution we're currently working on first, which addresses many of the same issues.

nsadeveloper789 commented 9 months ago

My recollection when we examined DAP is that it wasn't well suited for systems level debugging, which we're still aspiring for, eventually. I was familiar with its support for GDB, but we already had our own connector for that, so it didn't make sense to implement DAP. That said, it's still in the realm of possibility, but we don't have plans for it currently.

The solution we're working toward now is something we've been calling "Trace RMI." Essentially, it's just a remote handle to our Trace database over protobufs/TCP. It also includes a command channel in the reverse direction. So essentially, any Python program can connect to Ghidra, populate the Debugger windows, and drive the navigation aspects (i.e., current thread, frame, target....). They can also define their command set, and mark certain standard ones (e.g., step, resume, interrupt). We've implemented Python plugins for GDB and LLDB and a wrapper around dbgeng.dll. We're still polishing things, but we're getting close to a beta for all three supported platforms.

As far as configuring the library search path, the new system should :crossed_fingers: give you the ability to customize that. To orchestrate and launch a target, Ghidra just opens a one-off Trace RMI port and launches a shell script (or batch file on Windows). That script then does whatever it needs to launch the designated target in a debugger and connect back to Ghidra. Thus, as a user, you can easily copy and/or modify the script to adjust the library paths as needed. If you want to see what the launch script for dbgeng.dll-based targets currently looks like, see https://github.com/NationalSecurityAgency/ghidra/blob/master/Ghidra/Debug/Debugger-agent-dbgeng/data/debugger-launchers/local-dbgeng.bat.
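The launch pattern described here (open a one-off port, start a script, let the script connect back) can be sketched generically. This is not Trace RMI and does not use any Ghidra API: the environment variable name `DEMO_CONNECT_BACK_ADDR` is made up for this sketch, and a one-line Python child stands in for the shell script or batch file.

```python
import os
import socket
import subprocess
import sys

# Hypothetical variable name, chosen for this sketch only.
ADDR_VAR = "DEMO_CONNECT_BACK_ADDR"

# Stand-in for the launch script: read the address, connect back, say hello.
CHILD_CODE = (
    "import os, socket\n"
    f"host, port = os.environ[{ADDR_VAR!r}].rsplit(':', 1)\n"
    "with socket.create_connection((host, int(port)), timeout=10) as s:\n"
    "    s.sendall(b'hello from launcher')\n"
)


def launch_and_wait(timeout: float = 10.0) -> bytes:
    """Open a one-off port, start the child, and return everything it sends."""
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as server:
        server.bind(("127.0.0.1", 0))   # port 0: let the OS pick a free port
        server.listen(1)
        server.settimeout(timeout)
        host, port = server.getsockname()

        # The child inherits the environment plus the address to call back.
        env = dict(os.environ)
        env[ADDR_VAR] = f"{host}:{port}"
        child = subprocess.Popen([sys.executable, "-c", CHILD_CODE], env=env)
        try:
            connection, _peer = server.accept()
            with connection:
                connection.settimeout(timeout)
                chunks = []
                while True:
                    chunk = connection.recv(1024)
                    if not chunk:       # the child closed its end
                        break
                    chunks.append(chunk)
        finally:
            child.wait(timeout=timeout)
    return b"".join(chunks)


if __name__ == "__main__":
    print(launch_and_wait())
```

Because the child is an ordinary script started with an ordinary environment, anything it needs (such as a library search path) can be adjusted by editing the script, which is the point made above.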

As far as DAP is concerned, if you, as a developer, are able to find or write a DAP client in Python, then in theory, you could write a new connector/launcher that adapts Trace RMI to DAP. Yes, it would be an adapter of an adapter, but it'd be "pure" Python and a shell script. No Java required. If you'd like to hack on it, I can tell you what plugins to enable. Most of the polishing comes down to performance improvements and configuration management, but still, everything is subject to change.