NationalSecurityAgency / ghidra

Ghidra is a software reverse engineering (SRE) framework
https://www.nsa.gov/ghidra
Apache License 2.0

Scripts from external OSGI bundle don't appear in Script Manager #6087

Open smx-smx opened 8 months ago

smx-smx commented 8 months ago

Describe the bug

I created the GhidraScriptProject maven-archetype template (https://github.com/smx-smx/GhidraScriptProject) in order to write Ghidra scripts in other JVM languages (Kotlin, Groovy, Scala, etc.) and to make it possible to use external libraries from Maven.

The output is an OSGI bundle that can be loaded in Ghidra from the Script Directories page.

The problem is that the scripts don't show in the Script Manager once the JAR has been loaded. My assumption is that Ghidra expects them to be in the top-level/default package (like with autogenerated OSGI bundles created from script directories), and will therefore not find any loadable script.

Trying to use the top-level package will result in the following bundler error, since the "." package is not allowed:

[INFO] --- bundle:5.1.4:manifest (bundle-manifest) @ ProjectName ---
[ERROR] Manifest com.smx:ProjectName:jar:1.0-SNAPSHOT : The default package '.' is not permitted by the Import-Package syntax.
 This can be caused by compile errors in Eclipse because Eclipse creates
valid class files regardless of compile errors.
The following package(s) import from the default package null
[ERROR] Error(s) found in manifest configuration

As a workaround, I developed the following custom Ghidra script, which recursively probes all loaded bundles for any GhidraScript class: https://github.com/smx-smx/GhidraScriptProject/blob/d32f36297b6f9fc928318ccb01d79ad16cb1c3cc/src/main/java/InvokeBundleScript.java

It would be ideal if this feature could be provided by Ghidra itself, without needing an extra "loader" script.
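The probing idea can be sketched without any Ghidra dependency. The following is a minimal sketch, not the code of InvokeBundleScript.java: `JarSubtypeScanner` and its methods are hypothetical names, and the base class is passed in as a parameter (in Ghidra it would be GhidraScript). It walks the entries of a JAR, maps each `.class` entry to a class name, loads it without initializing it, and keeps the concrete subtypes of the base class.

```java
import java.io.IOException;
import java.lang.reflect.Modifier;
import java.net.URL;
import java.net.URLClassLoader;
import java.nio.file.Path;
import java.util.ArrayList;
import java.util.Enumeration;
import java.util.List;
import java.util.Optional;
import java.util.jar.JarEntry;
import java.util.jar.JarFile;

// Hypothetical helper, for illustration only; not part of Ghidra.
final class JarSubtypeScanner {
    private JarSubtypeScanner() {}

    /** Maps "com/smx/Foo.class" to "com.smx.Foo"; skips non-class entries and descriptors. */
    static Optional<String> toClassName(String entryName) {
        if (!entryName.endsWith(".class")) {
            return Optional.empty();
        }
        String stem = entryName.substring(0, entryName.length() - ".class".length());
        if (stem.endsWith("module-info") || stem.endsWith("package-info")) {
            return Optional.empty();
        }
        return Optional.of(stem.replace('/', '.'));
    }

    /** True if candidate can be instantiated as a base: assignable, not an interface, not abstract. */
    static boolean isConcreteSubtype(Class<?> candidate, Class<?> base) {
        return base.isAssignableFrom(candidate)
                && !candidate.isInterface()
                && !Modifier.isAbstract(candidate.getModifiers());
    }

    /** Returns the names of all concrete subtypes of base found in the given JAR. */
    static List<String> scan(Path jar, Class<?> base) throws IOException {
        List<String> found = new ArrayList<>();
        URL[] urls = { jar.toUri().toURL() };
        try (JarFile jarFile = new JarFile(jar.toFile());
                URLClassLoader loader = new URLClassLoader(urls, base.getClassLoader())) {
            Enumeration<JarEntry> entries = jarFile.entries();
            while (entries.hasMoreElements()) {
                Optional<String> name = toClassName(entries.nextElement().getName());
                if (!name.isPresent()) {
                    continue;
                }
                try {
                    // initialize=false: inspect the class without running static initializers
                    Class<?> cls = Class.forName(name.get(), false, loader);
                    if (isConcreteSubtype(cls, base)) {
                        found.add(name.get());
                    }
                } catch (ClassNotFoundException | LinkageError e) {
                    // The class cannot be loaded (e.g. a missing dependency); skip it.
                }
            }
        }
        return found;
    }
}
```

Loading every class this way is expensive for large shaded JARs, which is one reason a built-in solution may want a cheaper discovery mechanism.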

To Reproduce

Steps to reproduce the behavior:

Install the template
  1. git clone https://github.com/smx-smx/GhidraScriptProject.git
  2. mvn install
Generate the project
  3. gen-sample.bat
  4. cd out\sample
  5. mvn package
Load the bundle
  6. Open Ghidra -> Script Manager -> Script Directories -> add out\sample\target\sample-1.0-SNAPSHOT-jar-with-dependencies.jar
  7. Observe the script is not present in Script Manager
Workaround
  8. Run the InvokeBundleScript.java script, which shows the script

picture

Expected behavior

The scripts provided by the bundle should be shown in the Script Manager. Am I perhaps doing something wrong in the bundle manifest?

Thanks

smx-smx commented 8 months ago

I just tried maven-shade-plugin, providing the manifest entries myself to bypass the check done by maven-bundle-plugin (see the maven-shade branch of GhidraScriptProject). Afterwards, I moved the script classes to the top-level package.

Despite this, the outcome is the same (no scripts appear in Script Manager).

I then attached a debugger to Ghidra, and I noticed the following: https://github.com/NationalSecurityAgency/ghidra/blob/61f84adc3a54ce19ac946c96c1027f0066078acd/Ghidra/Features/Base/src/main/java/ghidra/app/plugin/core/script/GhidraScriptComponentProvider.java#L1203-L1208

Refresh is called/implemented only for GhidraSourceBundle objects, so nothing happens since we have a GhidraJarBundle in this case.

ryanmkurtz commented 8 months ago

I'm going through your steps to reproduce but I get the following errors on mvn install (on Linux):

[INFO] Scanning for projects...
Downloading from central: https://repo.maven.apache.org/maven2/org/apache/maven/archetype/archetype-packaging/3.1.1/archetype-packaging-3.1.1.pom
Downloaded from central: https://repo.maven.apache.org/maven2/org/apache/maven/archetype/archetype-packaging/3.1.1/archetype-packaging-3.1.1.pom (1.4 kB at 3.3 kB/s)
Downloading from central: https://repo.maven.apache.org/maven2/org/apache/maven/archetype/maven-archetype/3.1.1/maven-archetype-3.1.1.pom
Downloaded from central: https://repo.maven.apache.org/maven2/org/apache/maven/archetype/maven-archetype/3.1.1/maven-archetype-3.1.1.pom (12 kB at 356 kB/s)
Downloading from central: https://repo.maven.apache.org/maven2/org/apache/maven/maven-parent/33/maven-parent-33.pom
Downloaded from central: https://repo.maven.apache.org/maven2/org/apache/maven/maven-parent/33/maven-parent-33.pom (44 kB at 920 kB/s)
Downloading from central: https://repo.maven.apache.org/maven2/org/apache/apache/21/apache-21.pom
Downloaded from central: https://repo.maven.apache.org/maven2/org/apache/apache/21/apache-21.pom (17 kB at 1.4 MB/s)
Downloading from central: https://repo.maven.apache.org/maven2/org/apache/maven/archetype/archetype-packaging/3.1.1/archetype-packaging-3.1.1.jar
Downloaded from central: https://repo.maven.apache.org/maven2/org/apache/maven/archetype/archetype-packaging/3.1.1/archetype-packaging-3.1.1.jar (7.5 kB at 938 kB/s)
[ERROR] [ERROR] Some problems were encountered while processing the POMs:
[WARNING] 'build.plugins.plugin.version' for org.apache.maven.plugins:maven-compiler-plugin is missing. @ line 53, column 21
[ERROR] 'dependencies.dependency.systemPath' for ghidra:base:jar must specify an absolute path but is G:/ghidra_11.0_PUBLIC/Ghidra/Features/Base/lib/Base.jar @ line 86, column 25
[ERROR] 'dependencies.dependency.systemPath' for ghidra:software-modelling:jar must specify an absolute path but is G:/ghidra_11.0_PUBLIC/Ghidra/Framework/SoftwareModeling/lib/SoftwareModeling.jar @ line 93, column 25
[ERROR] 'dependencies.dependency.systemPath' for ghidra:generic:jar must specify an absolute path but is G:/ghidra_11.0_PUBLIC/Ghidra/Framework/Generic/lib/Generic.jar @ line 100, column 25
[ERROR] 'dependencies.dependency.systemPath' for ghidra:project:jar must specify an absolute path but is G:/ghidra_11.0_PUBLIC/Ghidra/Framework/Project/lib/Project.jar @ line 107, column 25
[ERROR] 'dependencies.dependency.systemPath' for ghidra:utility:jar must specify an absolute path but is G:/ghidra_11.0_PUBLIC/Ghidra/Framework/Utility/lib/Utility.jar @ line 114, column 25
[ERROR] 'dependencies.dependency.systemPath' for ghidra:docking:jar must specify an absolute path but is G:/ghidra_11.0_PUBLIC/Ghidra/Framework/Docking/lib/Docking.jar @ line 121, column 25
[ERROR] 'dependencies.dependency.systemPath' for ghidra:decompiler:jar must specify an absolute path but is G:/ghidra_11.0_PUBLIC/Ghidra/Features/Decompiler/lib/Decompiler.jar @ line 128, column 25
 @ 
[ERROR] The build could not read 1 project -> [Help 1]
[ERROR]   
[ERROR]   The project com.smx:ghidra-script:1.0-SNAPSHOT (/home/ryan/git/GhidraScriptProject/pom.xml) has 7 errors
[ERROR]     'dependencies.dependency.systemPath' for ghidra:base:jar must specify an absolute path but is G:/ghidra_11.0_PUBLIC/Ghidra/Features/Base/lib/Base.jar @ line 86, column 25
[ERROR]     'dependencies.dependency.systemPath' for ghidra:software-modelling:jar must specify an absolute path but is G:/ghidra_11.0_PUBLIC/Ghidra/Framework/SoftwareModeling/lib/SoftwareModeling.jar @ line 93, column 25
[ERROR]     'dependencies.dependency.systemPath' for ghidra:generic:jar must specify an absolute path but is G:/ghidra_11.0_PUBLIC/Ghidra/Framework/Generic/lib/Generic.jar @ line 100, column 25
[ERROR]     'dependencies.dependency.systemPath' for ghidra:project:jar must specify an absolute path but is G:/ghidra_11.0_PUBLIC/Ghidra/Framework/Project/lib/Project.jar @ line 107, column 25
[ERROR]     'dependencies.dependency.systemPath' for ghidra:utility:jar must specify an absolute path but is G:/ghidra_11.0_PUBLIC/Ghidra/Framework/Utility/lib/Utility.jar @ line 114, column 25
[ERROR]     'dependencies.dependency.systemPath' for ghidra:docking:jar must specify an absolute path but is G:/ghidra_11.0_PUBLIC/Ghidra/Framework/Docking/lib/Docking.jar @ line 121, column 25
[ERROR]     'dependencies.dependency.systemPath' for ghidra:decompiler:jar must specify an absolute path but is G:/ghidra_11.0_PUBLIC/Ghidra/Features/Decompiler/lib/Decompiler.jar @ line 128, column 25
[ERROR] 
[ERROR] To see the full stack trace of the errors, re-run Maven with the -e switch.
[ERROR] Re-run Maven using the -X switch to enable full debug logging.
[ERROR] 
[ERROR] For more information about the errors and possible solutions, please read the following articles:
[ERROR] [Help 1] http://cwiki.apache.org/confluence/display/MAVEN/ProjectBuildingException

smx-smx commented 8 months ago

Sorry, I forgot a hardcoded Ghidra path in pom.xml. I've now added a GitHub Actions workflow, rewritten the README, and added both sample.bat and sample.sh scripts.

Edit the variables in sample.sh, especially GHIDRA_PATH, and try again.

ryanmkurtz commented 8 months ago

Thanks, I got it up and running.

> Refresh is called/implemented only for GhidraSourceBundle objects, so nothing happens since we have a GhidraJarBundle in this case

This refresh() method is the same one that is called when you click the refresh button in the Script Manager. I assume clicking that button doesn't make the scripts appear for you, right?

Just a heads up, if you open an empty CodeBrowser with no program open, your popup window throws lots of exceptions:

Null program passed to ProgramLocation
java.lang.NullPointerException: Null program passed to ProgramLocation
    at ghidra.program.util.ProgramLocation.<init>(ProgramLocation.java:72)
    at ghidra.program.util.ProgramLocation.<init>(ProgramLocation.java:128)
    at ghidra.util.table.AddressBasedTableModel.getProgramLocation(AddressBasedTableModel.java:82)
    at ghidra.util.table.GhidraTable.navigate(GhidraTable.java:256)
    at ghidra.util.table.GhidraTable.navigateOnCurrentSelection(GhidraTable.java:289)
    at ghidra.util.table.GhidraTable$SelectionListener.valueChanged(GhidraTable.java:361)
        ....

ryanmkurtz commented 8 months ago

Just to clarify some aspects of this ticket: you aren't reporting a bug, correct? You are just requesting that we support GhidraScripts being found in non-default packages?

smx-smx commented 8 months ago

> This refresh() method is the same one that is called when you click the refresh button in the Script Manager. I assume clicking that button doesn't make the scripts appear for you, right?

Correct, it doesn't work.

> Just a heads up, if you open an empty CodeBrowser with no program open, your popup window throws lots of exceptions:

Thanks for reporting it. It looks like createTableChooserDialog doesn't work without a valid Program object (is this expected, or a bug?). I worked around it, rather crudely, by providing a dummy Program object like this (the updated version is in the repo):

        var program = new DummyProgram();
        var dlg = new TableChooserDialog(state.getTool(), executor, program, "Choose a script", (Navigatable)null, false);

Full workaround in https://github.com/smx-smx/GhidraScriptProject/commit/fb1aa60197fc63fd4fa937c20cf8021a504bd5eb#diff-8e170c9ad7383270ea4da18f75fdf188bf5062d2ad564f942334a006677566fc

> Just to clarify some aspects of this ticket: you aren't reporting a bug, correct? You are just requesting that we support GhidraScripts being found in non-default packages?

It depends on what Ghidra's expected behavior is when loading a JAR bundle. From a "user story" perspective, I was expecting Ghidra to discover the scripts in the loaded JAR, but it doesn't (regardless of which package is used; even the top-level package doesn't work).

It appears to be a missing implementation in the refresh method, which is only implemented for script directories and is a no-op for other types of loadable files.

dragonmacher commented 8 months ago

For some context, the original author of the Osgi script bundle code is no longer on the project. I quickly scanned the code and here is what I noticed:

This last point seems like the mistake (as does the check in the refresh() method). This is the relevant code from ScriptList:

    List<ResourceFile> getScriptDirectories() {
        return bundleHost.getGhidraBundles()
                .stream()
                .filter(GhidraSourceBundle.class::isInstance)
                .filter(GhidraBundle::isEnabled)
                .map(GhidraBundle::getFile)
                .collect(Collectors.toList());
    }

It seems like we should be able to remove these checks for GhidraSourceBundle. It is not clear why it was coded this way. It could be for performance reasons, but I suspect it is there to work around some other issue.
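To make the effect of that filter concrete, here is a minimal, self-contained sketch. `Bundle`, `SourceBundle` and `JarBundle` are hypothetical stand-ins for GhidraBundle, GhidraSourceBundle and GhidraJarBundle, not the real Ghidra classes. It compares the current filtering with a variant that drops the type check.

```java
import java.util.List;
import java.util.stream.Collectors;

// Stand-in types for illustration only; they merely mimic the shape of Ghidra's bundle classes.
final class BundleFilterSketch {
    private BundleFilterSketch() {}

    static class Bundle {
        final String file;
        final boolean enabled;

        Bundle(String file, boolean enabled) {
            this.file = file;
            this.enabled = enabled;
        }
    }

    static final class SourceBundle extends Bundle {
        SourceBundle(String file, boolean enabled) {
            super(file, enabled);
        }
    }

    static final class JarBundle extends Bundle {
        JarBundle(String file, boolean enabled) {
            super(file, enabled);
        }
    }

    /** Mirrors the quoted getScriptDirectories(): only enabled source bundles survive. */
    static List<String> sourceBundlesOnly(List<Bundle> bundles) {
        return bundles.stream()
                .filter(SourceBundle.class::isInstance)
                .filter(b -> b.enabled)
                .map(b -> b.file)
                .collect(Collectors.toList());
    }

    /** The proposed change: keep every enabled bundle, whatever its type. */
    static List<String> allEnabledBundles(List<Bundle> bundles) {
        return bundles.stream()
                .filter(b -> b.enabled)
                .map(b -> b.file)
                .collect(Collectors.toList());
    }
}
```

With one enabled source directory and one enabled JAR as input, the first method returns only the directory, which matches the observed behavior of the Script Manager never listing scripts from a JAR bundle.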

jpleasu commented 8 months ago

> For some context, the original author of the Osgi script bundle code is no longer on the project.

That'd be me. 👋🏻

I agree that it's a discussion of user stories. We'd discussed related ideas when this work was done, but I'm not sure we covered the particular gap @smx-smx has fallen into. I can at least try to shed some light on the history.

There are two parts as I see it:

  1. precompiled dependency management for scripts (or lack thereof) and
  2. "script versus plugin"

The OSGi integration allows adding precompiled external dependencies to scripts (via // @import_package), but dependencies need to be OSGi bundles. Although many common jars (e.g. Maven artifacts) are already bundles (they have the right metadata in their manifest), not all of them are. It's pretty klunky, but there is a kung-fu for repackaging recursively resolved dependencies with maven or gradle into a single OSGi bundle, or directly with bndtools.

There was nuance to it that I can't relay, but I recall a strong argument that a "precompiled script" is just a plugin that implements a single action. The ability to "hack on" the source, unload, and reload is not shared by plugins; that's the defining feature of a script. We talked about bridging that gap (plugins as OSGi bundles, scripts as dynamic extensions, ...), but there was no priority for the amount of work to implement any of the ideas we came up with.

> It seems like we should be able to remove these checks for GhidraSourceBundle. It is not clear why it was coded this way. It could be for performance reasons, but I suspect it is there to work around some other issue.

I'm guessing it has to do with enumerating scripts (the *.java files in bundle directories) and change detection in order to trigger compilation.
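As a rough illustration of what such change detection involves, here is a minimal sketch. It assumes a snapshot is a plain map from source path to last-modified time; `SourceChangeDetector` is a hypothetical name, and this is not how Ghidra implements it.

```java
import java.util.Map;
import java.util.Set;
import java.util.TreeSet;

// Hypothetical helper, for illustration only; not Ghidra's implementation.
final class SourceChangeDetector {
    private SourceChangeDetector() {}

    /**
     * Returns the paths that are new in the current snapshot or whose
     * timestamp differs from the previous snapshot. Deleted files are
     * not reported.
     */
    static Set<String> changedSince(Map<String, Long> previous, Map<String, Long> current) {
        Set<String> changed = new TreeSet<>();
        for (Map.Entry<String, Long> entry : current.entrySet()) {
            Long before = previous.get(entry.getKey());
            if (before == null || !before.equals(entry.getValue())) {
                changed.add(entry.getKey());
            }
        }
        return changed;
    }
}
```

A precompiled JAR has no per-script source files to snapshot, so a check like this has nothing to enumerate, which would be consistent with the source-bundle-only filter.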

It would be cool to open up the scripting internals (e.g. change detection, compilation to an OSGi bundle, hotkey binding) so that other JVM languages could benefit from the same "hackability" as Java ghidra scripts.

smx-smx commented 8 months ago

I got it working, sort of. I uploaded the preliminary Ghidra patch/PoC here: https://github.com/smx-smx/ghidra/commit/cb24ae757bf5fb656fa22233222503382aa38ea1

I basically created a new Script Provider to handle ".jar" files and removed the checks mentioned by @dragonmacher.

With those changes, jars now appear in the Script Manager and can be invoked.

img01

However, it's not ideal: all classes are loaded and scanned, and currently only the first matching script class is considered. Plus, there are some things that don't quite make sense in the scope of the Script Manager, like:

Another idea: perhaps a custom Manifest entry could be used to specify the script classes, so that it's not necessary to scan every single class looking for classes that derive from GhidraScript.

Could this be a workable idea? It's more of a workaround, but doing it properly would require a much larger effort, as mentioned by @jpleasu.
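A minimal sketch of how the manifest idea could look, using only java.util.jar from the JDK. The header name `Ghidra-Script-Classes` and the class `ScriptManifestReader` are assumptions made up for this example; Ghidra defines no such header today.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;
import java.util.jar.Manifest;

// Hypothetical helper, for illustration only.
final class ScriptManifestReader {
    /** Hypothetical header name; not something Ghidra currently reads. */
    static final String SCRIPT_CLASSES = "Ghidra-Script-Classes";

    private ScriptManifestReader() {}

    /** Returns the comma-separated class names from the header, or an empty list if absent. */
    static List<String> scriptClasses(Manifest manifest) {
        String value = manifest.getMainAttributes().getValue(SCRIPT_CLASSES);
        List<String> names = new ArrayList<>();
        if (value == null) {
            return names;
        }
        for (String part : value.split(",")) {
            String name = part.trim();
            if (!name.isEmpty()) {
                names.add(name);
            }
        }
        return names;
    }

    /** Convenience for trying the reader on manifest text instead of a real JAR. */
    static Manifest parse(String text) {
        try {
            return new Manifest(new ByteArrayInputStream(text.getBytes(StandardCharsets.UTF_8)));
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }
}
```

For a real bundle the manifest would come from `JarFile.getManifest()`, and only the listed classes would need to be loaded, so the full class scan could be skipped.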

> precompiled dependency management for scripts (or lack thereof) and

Another thing about dependencies: two precompiled scripts could use different versions of the same dependency (e.g. the Kotlin standard library), so I'm currently using maven-shade-plugin to embed the dependencies instead of making an OSGI bundle out of each one (the expected way when working with OSGI). I also tried Embed-Dependency from maven-bundle-plugin, but it makes the build much slower, so I opted for shading as a workaround. On the flip side, this is convenient: you get a single self-contained bundle and you don't need to deal with multiple versions of the same dependency (at the expense of bundle size).

dragonmacher commented 8 months ago

@smx-smx That's pretty amazing that you got something working so quickly.

After reading through this, I had some thoughts on what better feature support would look like.

Fully Modular Jar Bundles

To have full jar bundle support, it seems that clients should be able to build a bundle without including dependencies, while specifying those dependencies as part of their bundle (I'm assuming this is part of the OSGI framework). This can lead to potential classpath conflicts for same-named class and package combinations, such as in different versions of a library. Again, I'm assuming this is something that the OSGI framework handles correctly.

One downside of this approach is the added complexity of clients having to understand the OSGI model and how to correctly build using it. Also, clients need to understand how to update their Ghidra installation with these external dependencies.

Simplified Jar Bundles

The approach you followed seems preferable, as it seems simpler to me to think of each jar bundle as a discrete self-contained blob that works without any external dependency management.

The obvious downside of this approach, as you mentioned, is that this model fails under the weight of size if clients install many jar bundles all of which share the same dependencies.


I'm sure that if we can correctly use the OSGI model, then both approaches listed above will work as expected, at least in terms of bundling non-editable scripts (more on that below). This gives clients full flexibility in deciding how to use the jar bundles. Since I am not familiar with how Ghidra integrates the OSGI framework, I have no sense of how much effort is required to use these features of OSGI.


Extensions, Bundles, Plugins, Scripts and ExtensionPoints

As mentioned by @jpleasu, this discussion is quite philosophical and points out some parts of the Ghidra plugin environment that are unrefined. In addition to script bundles, we have similar considerations when dealing with Ghidra Extensions, which currently are zip files that contain fully-built Ghidra modules (which may contain plugins, scripts and ExtensionPoints). At the risk of muddying the waters, after working in the Extensions framework recently, I feel that ultimately Extensions should be updated to use the OSGI model, which keeps them consistent with viewing all Ghidra pluggable items as units that all use one common framework. I think there should be one general method for adding functionality to Ghidra which can contain scripts, plugins or ExtensionPoints.

The philosophical part, as mentioned by @jpleasu, relates to the fact that pre-compiled code inside of an artifact is not editable by the user. We made a decision that Plugins are complicated enough that there is no expectation of being able to change them while Ghidra is running. If you would like to change a plugin, then you need access to the source code and a development environment. Scripts, on the other hand, we decided should be small, easily changeable snippets of code that can be modified while Ghidra is running. Naturally, those are two ends of one spectrum, which overlap in the case of overly complicated scripts and very simple plugins. This distinction has helped us frame how we support these concepts.

Regarding the other issues you mentioned:

Possible Compromise (for scripts)

As far as scripts go, I can envision supporting jar bundle authors adding the script source code to the bundle. This would address the above issues of documentation and editing. Supporting this adds complexities for both bundle authors and Ghidra maintainers. While technically this is a simple concept, the support burden has a long-tail. In deciding if supporting this concept is worth the effort, it is really helpful to have a sense of how many clients would make use of this feature. It may be good enough to support non-editable bundles, with the expectation that users will write their own scripts when editing is needed. This leads to my next point...

@jpleasu also pointed out how much guesswork has gone into deciding how to support these concepts, due primarily to a lack of user stories and general feedback. This really is the point of this long-winded post, to explain how/why things work as they do now and to solicit feedback about future needs.

smx-smx commented 8 months ago

I made some further modifications to probe for multiple script classes within the same Bundle: https://github.com/smx-smx/ghidra/commit/5b6aa009352b1928a716cf3fe301662ff5182adb

img01

I also found a neat trick to support both OSGI and bundled/embedded dependencies: https://github.com/smx-smx/GhidraScriptProject/commit/862e16bd0de4a28d420999963e249bef59fab80f

Basically, by using the following syntax in maven-bundle-plugin:

<Import-Package>*;resolution:=optional</Import-Package>

the dependencies will be looked up in OSGI modules (as normally done for Import-Package, which are autogenerated by the bundler in this case based on used Java packages).

Normally, if any imported package is missing, a fatal error would be generated and loading of the bundle would stop. However, by specifying ;resolution:=optional, loading will complete regardless. Any missing class will then be looked up within the module context, so any shaded/embedded dependency will be picked up.
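The difference between mandatory and optional imports can be modeled in a few lines. This is a deliberately simplified sketch of the rule described above, not the actual resolver of an OSGi framework. `ImportResolutionSketch` is a hypothetical name, and version ranges, uses constraints and fragments are ignored.

```java
import java.util.Map;
import java.util.Set;
import java.util.TreeSet;

// Simplified model for illustration only; real OSGi resolution is far more involved.
final class ImportResolutionSketch {
    private ImportResolutionSketch() {}

    /**
     * Returns the mandatory imports that no other bundle exports.
     * In this model the bundle resolves only if the result is empty;
     * optional imports never block resolution.
     *
     * @param importIsOptional imported package name mapped to true if declared with resolution:=optional
     * @param exported package names exported by the other installed bundles
     */
    static Set<String> unresolvedMandatory(Map<String, Boolean> importIsOptional, Set<String> exported) {
        Set<String> missing = new TreeSet<>();
        for (Map.Entry<String, Boolean> entry : importIsOptional.entrySet()) {
            boolean optional = entry.getValue();
            if (!optional && !exported.contains(entry.getKey())) {
                missing.add(entry.getKey());
            }
        }
        return missing;
    }
}
```

In this model, a bundle that imports `kotlin` as a mandatory package fails to resolve when nothing exports it, while the same import marked optional lets the bundle load and fall back to the classes shaded into its own JAR.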