Closed fmagin closed 5 years ago
Cheers for hitting the first issues, mate :)
Running your python in my current Jython env, I get 0x794f2L
. Running in the bridge, I get 0x794f2
(minus the logging cruft from #1 ). The trailing L in Jython is an artifact of it being Python 2 and using long integers - it doesn't happen in Py3 because longs are no longer a thing.
Is that the difference you're seeing, or are you seeing something much more functionally different? If so, do you mind shooting me some more examples, including output?
Oh, I should have specified that, sorry:
the bridge gives me 0x100000
, which is the base address of the loaded binary, while the Jython shell gives me whatever the current cursor position is.
Huh. That is weird, definitely not reproducing.
To rule out the simplest possibility, are you running multiple tools in Ghidra (e.g., multiple code browser windows) or multiple programs loaded into Ghidra? What does getState().getCurrentProgram().getName() give you on both sides?
To clarify: I am on Ghidra 9.0.1 currently, I will investigate if this accounts for this. The docs stated 9.0.1 last night so I didn't check that.
getState().getCurrentProgram().getName()
gives me 'reverse' (i.e. the expected name of the binary) for both.
One wild guess would be that the windows looses focus and thus ghidra looses the cursor position somehow? I have no clue about ghidra internals, are there other functions related to the current Position to check this?
Narrowed it down somewhat: It is not fixed the the base address, but to whatever the first result is. Maybe when the bridge is initialized
Okay, I'll try to replicate that more specifically. I know that the currentAddress variable from the flat API is stuck to the address when the script was invoked, but I thought getState() synced each time it was called.
On Tue., 9 Apr. 2019, 18:29 Florian Magin, notifications@github.com wrote:
Narrowed it down somewhat: It is not fixed the the base address, but to whatever the first result is. Maybe when the bridge is initialized
— You are receiving this because you were assigned. Reply to this email directly, view it on GitHub https://github.com/justfoxing/ghidra_bridge/issues/2#issuecomment-481155110, or mute the thread https://github.com/notifications/unsubscribe-auth/ARp7IPzVmuYI4w3NYIc4ZgEmB6-CbbvYks5vfE91gaJpZM4cjAoq .
suggestion to get currentAddress (and others) to work: I had a similar issue when writing my ipython kernel for binary ninja , I wrote an IPython Extension to register a hook that was run before each cell. So if we find a way to get getState() to work properly, I could port this to an IPython extension that updates those variables before each cell from ghidra
How do I check if getState is correctly syncing? I can look into this a bit in a few hours.
Mostly, the result of getState().getCurrentAddress() should be up to date. The other get* functions on the State object should also return updated results, so maybe try adding another program to the project, opening it in the code browser, and seeing if getProgram() respects swapping back and forth.
On Tue., 9 Apr. 2019, 19:02 Florian Magin, notifications@github.com wrote:
How do I check if getState is correctly syncing? I can look into this a bit in a few hours.
— You are receiving this because you were assigned. Reply to this email directly, view it on GitHub https://github.com/justfoxing/ghidra_bridge/issues/2#issuecomment-481166588, or mute the thread https://github.com/notifications/unsubscribe-auth/ARp7IE6Bgwhf8EBK18t4L5RpGjwnxmwZks5vfFcHgaJpZM4cjAoq .
Okay, finally had time to take a proper look. Good news is I can replicate it now. Bad news is, I'm pretty sure this is a Ghidra issue, so outside the scope of this project.
What appears to be happening is that when called inside a Python script (but not the python interpreter), getState() is not syncing/updating but is instead returning the same state each time (I assume the initial state variable). See this by running str(getState())
multiple times. In the interpreter, I get a different state object each time:
>>> str(getState())
'ghidra.app.script.GhidraState@4a448520'
>>> str(getState())
'ghidra.app.script.GhidraState@1f9f6539'
>>> str(getState())
'ghidra.app.script.GhidraState@58e50e17'
but in both a script run directly from the script manager, and running across the ghidra_bridge, I get the same state object:
>>> str(getState())
'ghidra.app.script.GhidraState@75c0f6cf'
>>> str(getState())
'ghidra.app.script.GhidraState@75c0f6cf'
>>> str(getState())
'ghidra.app.script.GhidraState@75c0f6cf'
Further, modifications to the state work and are reflected in the GUI - but the moment you call getState() again, it goes back to the original values (and the GUI updates to jump back to them).
>>> s = getState()
>>> str(s.getCurrentAddress())
'00007154'
>>> s.setCurrentAddress(s.getCurrentProgram().getAddressFactory().getAddress("0x31337"))
>>> str(s.getCurrentAddress())
'00031337'
>>> # confirmed in GUI - addr is 31337
>>> str(getState().getCurrentAddress())
'00007154'
>>> # confirmed in GUI - addr back to 7154
Given that this is occurring in a script run directly from the script manager, without the bridge, I've got to go with this being a bug in Ghidra. I'll leave this issue open a little longer in case you've got another thought - otherwise, feel free to report the issue to them if you want (I feel like you found the original bug here, so you should get the cred - feel free to use any of my notes). If not, I'll look into reporting it when I get ghidra_bridge stable.
So, having dug into the relevant Ghidra code a bit more, I'm actually feeling like this isn't a Ghidra bug so much as not-well-documented but expected behaviour. The following is mostly just me documenting this for posterity/so I can close a bunch of tabs in my browser.
What's going on is that for a standard script, it receives the state when it's told to execute, initializes the currentAddress/current*/etc, and then never really changes the state or those variables afterwards (https://github.com/NationalSecurityAgency/ghidra/blob/master/Ghidra/Features/Base/src/main/java/ghidra/app/script/GhidraScript.java#L196).
When getState() is called, it calls updateStateFromVariables() (https://github.com/NationalSecurityAgency/ghidra/blob/master/Ghidra/Features/Base/src/main/java/ghidra/app/script/GhidraScript.java#L498), which in turn resets all the values in the state object to match currentAddress/current*/etc (https://github.com/NationalSecurityAgency/ghidra/blob/master/Ghidra/Features/Base/src/main/java/ghidra/app/script/GhidraScript.java#L472) - this is the "syncing" behaviour described in the doco.
So far, so good - this explains the behaviour of calling getState() changing the address back to the original address - if currentAddress is fixed at what it was when the script was called, the state object will be set back to that, which in turn triggers an event in GhidraState (https://github.com/NationalSecurityAgency/ghidra/blob/master/Ghidra/Features/Base/src/main/java/ghidra/app/script/GhidraState.java#L142) which is presumably causing the GUI to update. This is also why when scripts/ghidra_bridge repeatedly call getState() they get the same value - getState() just returns the same state object each time.
But if the state/current* variables don't change, what's going on with in the Jython shell? The relevant guts start in PythonPluginInputThread, which is where each line of python gets read in and executed. This spins up a new PythonPluginExecutionThread for the line (https://github.com/NationalSecurityAgency/ghidra/blob/master/Ghidra/Features/Python/src/main/java/ghidra/python/PythonPluginInputThread.java#L80) which in turn generates a new GhidraState with updated currentAddress/etc (https://github.com/NationalSecurityAgency/ghidra/blob/master/Ghidra/Features/Python/src/main/java/ghidra/python/PythonPluginExecutionThread.java#L72). Essentially, each line of Python in the shell interpreter is being treated as its own GhidraScript invocation, allowing the state/address variables to update (and so we see getState() return different state objects each time it's called, because they're created new for each line).
So what to do about it?
The PythonPluginExecutionThread code demonstrates how to get the current variables from the PythonPlugin object (which is a ProgramPlugin object, which is apparently where all the current variables come from (https://github.com/NationalSecurityAgency/ghidra/blob/master/Ghidra/Features/Base/src/main/java/ghidra/app/plugin/ProgramPlugin.java#L56)). We can access this by getting the PluginTool from state.getTool() (this is the main environment you've got open, such as CodeBrowser, so it's not going to change), and in turn calling PluginTool.getManagedPlugins(), which returns a list of plugins, which we can loop through until we find the one named "PythonPlugin" (there may be other suitable ProgramPlugin objects, but we're running Python, so we know PythonPlugin should be there).
Because ghidra_bridge is responsible for grabbing the getState() function and injecting it into the client namespace, we can just replace it with our own function (I'll probably specify a flag for "interactive mode", since I don't want to change this behaviour for everything - there may be scripts out there that rely on currentAddress/etc not changing despite the script messing around). This getState_fix() would simply update the values using the PythonPluginExecutionThread technique, set them into the state object, and call it a day (without needing to call the original getState()). This would allow users to get most of the interactive experience - getState() would give them up-to-date values, and wouldn't reset the location in the GUI.
To be more like the shell interpreter, it'd be nice for currentAddress/etc to be in sync with the current state. The core problem here is that these values are imported when __main__ is imported, and then are just values - they're not part of a larger object, so can't be easily replaced with property() methods to call a function when they're queried.
Possible solution: We've seen GhidraState triggers events when it's updated - looks like these are received by listeners registered against the tool. If I can work out how to create a suitable Jython object that implements the ToolListener interface from Python, that could be registered to update the current* variables when they change.
Possible other solution: Special case these variables. Because they're all objects (not primitive values), they're returned as ghidra_bridge.bridge.BridgedObjects. I could add a method to this class to register a function to run every time a get/set is run against a BridgedObject, allowing the remote object to be updated and replaced before the get/set is carried out. Then, when the GhidraBridge client is about to inject these special current* values into the namespace, it can first register the update functions (iff it's running in interactive mode).
I got another behavior where the Jython shell returns different values than the Bridge.
I was implementing an Iterator for functions: (for debugging purpose i added the additional variable)
def func_iterator(funcs):
got_next = funcs.hasNext()
while got_next:
yield funcs.next()
got_next = funcs.hasNext()
print(got_next)
b = ghidra_bridge.GhidraBridge(namespace=globals())
listing = currentProgram.getListing()
for func in func_iterator(listing.getFunctions(0)):
print(func)
This does work perfectly in the Jython shell with the following output:
FUN_00081388
True
FUN_00081360
True
FUN_00081168
True
FUN_0008103c
False
As FUN_0008103c
is the last function in the list..
However running it over the bridge:
FUN_00081360
True
FUN_00081168
True
FUN_0008103c
True
Then Java will of course throw an NullPointerException: java.lang.NullPointerException
when checking for the next Object.
Further :-) it would be super improvement if your bride is returning a valid Iterator when the Object contains an Iterator.
@Traxes - yup, sounds legit. I've just pulled it into its own issue for easier tracking (I'm just about done on the fix for this one, and need to finish it up first)
Still working on this one - I thought I was close, but the fix has required me to rearchitect the entire comms model, so I'm still chasing down some bugs and performance issues in that.
Fixed in 0.0.3, once I get that uploaded. There's an auto-detecting interactive mode, which should turn on in an ipython or python interpreter environment (or you can create the GhidraBridge(interactive_mode=True)).
getState and currentAddress (and the other current* variables) should now behave the same as in the Ghidra Python interpreter, reflecting any changes in the GUI.
Uploaded. Grab the latest version and hopefully that'll do the trick.
Running
hex(getState().getCurrentAddress().getOffset())
in the Jython shell gives a different result than runningin the bridge.
I have no idea why, if you cant reproduce this on windows I can investigate this further.