clearbluejar / ghidrecomp

Python Command-Line Ghidra Decompiler
GNU General Public License v3.0

ParallelDecompiler Faster? #1

Open clearbluejar opened 1 year ago

clearbluejar commented 1 year ago

Instead of using python threading, would it be faster to use ParallelDecompiler?

    # Caller
    from decompiler import MyDecompileConfigurer
    from ghidra.app.decompiler.parallel import ParallelDecompiler
    from ghidra.app.decompiler.parallel import DecompilerCallback
    from ghidra.app.decompiler.parallel import DecompileConfigurer

    configurer = MyDecompileConfigurer()
    callback = DecompilerCallback(program, DecompileConfigurer)

    ParallelDecompiler.decompileFunctions(callback, program, all_funcs, None, monitor)

    # decompiler.py (the module imported above)
    from jpype import JImplements, JOverride
    from ghidra.app.decompiler import DecompileOptions
    from ghidra.app.decompiler.parallel import DecompileConfigurer

    @JImplements(DecompileConfigurer, deferred=False)
    class MyDecompileConfigurer:

        @JOverride
        def configure(self, decompiler: 'ghidra.app.decompiler.DecompInterface'):
            decompiler.toggleCCode(False)
            decompiler.toggleSyntaxTree(True)
            decompiler.setSimplificationStyle("decompile")
            opts = DecompileOptions()
            opts.grabFromProgram(p)  # p: the current Program, defined elsewhere
            decompiler.setOptions(opts)

clearbluejar commented 1 year ago

TBD. I get this error when running the code above:

Exception has occurred: java.lang.InstantiationException
java.lang.InstantiationException: ghidra.app.decompiler.parallel.DecompilerCallback

astrelsky commented 1 year ago

> Instead of using python threading, would it be faster to use ParallelDecompiler?

Yes. "CPython implementation detail: In CPython, due to the Global Interpreter Lock, only one thread can execute Python code at once (even though certain performance-oriented libraries might overcome this limitation)." Source: the Python threading module documentation.

You should use the ParallelDecompiler. If you need to, I've used Java threads from Python via pyhidra before with some success. Synchronization will still occur via the GIL every time you come from Java back into Python code, though.

Edit: It's pretty cool how it automatically pasted the link for the Global Interpreter Lock. I didn't even notice it until afterwards.

clearbluejar commented 1 year ago

Thanks for this. In my case, I spin up multiple decompilers and have "multithreaded" python submit each function to several running decompilers, so perhaps the GIL issue is mitigated?
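The arrangement described above (a fixed set of decompilers fed by Python worker threads) can be sketched roughly as follows. This is only an illustration of the pooling pattern, not ghidrecomp's actual code: `FakeDecompiler` and `decompile_all` are hypothetical names, and the stand-in class does not reproduce Ghidra's `DecompInterface` API.

```python
import queue
from concurrent.futures import ThreadPoolExecutor


class FakeDecompiler:
    """Hypothetical stand-in for a running decompiler (not the Ghidra API)."""

    def __init__(self, ident: int):
        self.ident = ident

    def decompile(self, func_name: str) -> str:
        return f"// decompiler {self.ident}\nvoid {func_name}(void) {{}}"


def decompile_all(func_names, n_decompilers: int = 4):
    """Decompile every function using a fixed pool of decompiler instances."""
    pool = queue.Queue()
    for i in range(n_decompilers):
        pool.put(FakeDecompiler(i))

    def work(name: str):
        decompiler = pool.get()      # blocks until an instance is free
        try:
            return name, decompiler.decompile(name)
        finally:
            pool.put(decompiler)     # hand the instance back to the pool

    # One worker thread per decompiler, so no thread waits long for an instance.
    with ThreadPoolExecutor(max_workers=n_decompilers) as executor:
        return dict(executor.map(work, func_names))
```

Each worker borrows a decompiler, uses it for one function, and returns it, so no instance is ever shared by two threads at once.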

I had trouble using pyhidra to instantiate a callback function.

callback = DecompilerCallback(program, DecompileConfigurer)

This raised java.lang.InstantiationException: ghidra.app.decompiler.parallel.DecompilerCallback

astrelsky commented 1 year ago

> DecompilerCallback

DecompilerCallback is an abstract class which you can't instantiate directly. You must use a subclass of it. Unfortunately there don't appear to be any subclasses with public visibility so you'd need to subclass it yourself and you'd have to do it in Java.
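As a loose analogy only, the same rule can be shown in pure Python with the `abc` module. The classes below are hypothetical and unrelated to Ghidra; the sketch shows why an abstract class fails to instantiate while a concrete subclass works. It does not mean the Java class can be subclassed from Python, since the reply above says that has to happen in Java.

```python
from abc import ABC, abstractmethod


class Callback(ABC):
    """Pure-Python analogy for an abstract callback class (not Ghidra's class)."""

    @abstractmethod
    def process(self, item):
        ...


class LengthCallback(Callback):
    """Concrete subclass: supplies the missing method, so it can be created."""

    def process(self, item):
        return len(item)


def try_instantiate(cls):
    """Return an instance, or the error message if the class is abstract."""
    try:
        return cls()
    except TypeError as exc:
        return str(exc)
```

Calling `try_instantiate(Callback)` returns an error message, while `try_instantiate(LengthCallback)` returns a usable object.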

astrelsky commented 1 year ago

> Thanks for this. In my case, I spin up multiple decompilers and have "multithreaded" python submit each function to several running decompilers, so perhaps the GIL issue is mitigated?

The GIL "issue" is intentional. What I think is happening is that when a Java function is called, JPype releases the GIL, which then allows another Python thread to acquire it and run Python code. You will have multiple Python threads, I think; you just can't run Python code in multiple threads at the same time in the same Python process/interpreter/whatever.
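The effect described above can be illustrated without a JVM. In this minimal sketch, `time.sleep` stands in for a blocking call into Java: like the JPype behavior suggested above, it releases the GIL while it waits. The names `fake_java_call` and `run_overlapped` are hypothetical, and the timings are only indicative.

```python
import threading
import time
from concurrent.futures import ThreadPoolExecutor


def fake_java_call(seconds: float) -> str:
    """Stand-in for a call into Java; time.sleep releases the GIL while blocking."""
    time.sleep(seconds)
    return threading.current_thread().name


def run_overlapped(n_calls: int = 4, seconds: float = 0.2):
    """Run n_calls blocking calls on n_calls threads and time the whole batch."""
    start = time.perf_counter()
    with ThreadPoolExecutor(max_workers=n_calls) as pool:
        names = list(pool.map(fake_java_call, [seconds] * n_calls))
    elapsed = time.perf_counter() - start
    return names, elapsed


if __name__ == "__main__":
    names, elapsed = run_overlapped()
    print(f"{len(names)} calls finished in {elapsed:.2f}s")
```

Because the blocking calls overlap, the batch should take roughly as long as one call, not the sum of all of them. Any pure-Python work between calls would still be serialized by the GIL.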

clearbluejar commented 1 year ago

> DecompilerCallback
>
> DecompilerCallback is an abstract class which you can't instantiate directly. You must use a subclass of it. Unfortunately there don't appear to be any subclasses with public visibility so you'd need to subclass it yourself and you'd have to do it in Java.

I have been trying to keep it strictly Python, but in this case would I need to implement the subclass as a plugin? Pyhidra does have some support for including a plugin, but I have yet to wrap my head around it.

edit: Understanding that it would need to be subclassed in Java, what is the best way to implement it? A plugin, or some one-off jpype command like jpype.addClassPath('/my/path/myJar.jar')?

astrelsky commented 1 year ago

> > DecompilerCallback
> >
> > DecompilerCallback is an abstract class which you can't instantiate directly. You must use a subclass of it. Unfortunately there don't appear to be any subclasses with public visibility so you'd need to subclass it yourself and you'd have to do it in Java.
>
> I have been trying to keep it strictly Python, but in this case would I need to implement the subclass as a plugin? Pyhidra does have some support for including a plugin, but I have yet to wrap my head around it.
>
> edit: Understanding that it would need to be subclassed in Java, what is the best way to implement it? A plugin, or some one-off jpype command like jpype.addClassPath('/my/path/myJar.jar')?

I don't know which would be best here. I'd say just do whatever you feel is easiest. If there isn't a severe problem with how it currently works, then I honestly wouldn't worry about it. If it ain't broke, don't fix it.