nvaccess / nvda

NVDA, the free and open source Screen Reader for Microsoft Windows
Other
2.04k stars 625 forks source link

Crash using code browser UI of Ghidra #12045

Closed Palacee-hun closed 3 years ago

Palacee-hun commented 3 years ago

Ghidra is a quite powerful free and open source, mostly Java-based reverse engineering / decompiler toolbox. It's ui is Swing-based, with a custom docking widget set on top of it, but despite this, most controls are accessible. The two most important ones(the listing and decompiler areas) are not however. There's a ticket over there at Ghidra github page about that with little to no progress. I would volunteer to work on that and facilitate speedier resolution, but the crashes I report here prevent serious work in this respect. The issue is proven to be screen-reader specific, see below why. That's why I see no point in ticketing that at Ghidra issues, I bring it here instead. Ghidra may prove valuable in job settings, maybe even more in the future, for IT-security people. This augments the importance of this issue.

Steps to reproduce:

  1. Download Ghidra. It's very easy to find it.
  2. You will get a quite heavy zip package. Just unpack it anywhere. I unpacked it into c:\ghidra
  3. Before proceeding, choose a binary executable file for the following tests. It does not matter what it is. I suggest something under 200 kilobytes
  4. Launch Ghidra like this: unpackdir\ghidrarun.bat. Ghidra will come up talking.
  5. From file menu choose new project. Select non-shared project option, then name the project to something.
  6. From file menu choose import. Select your binary in the standard Java file chooser dialog. Then just tab through the import settings dialog. If your binary is a Windows or Linux executable, defaults are totally ok. Hit ok button and Ghidra will import it and show you an import results dialog. Just press ok button on it too.
  7. Some tabbing will take you to a tree view. That is the project tree. Just some arrow downs and you will hear the name of your imported file. Route the mouse cursor to it and simulate a right click on it to bring up context menu. Standard keyboard methods won't do as the item is unfocused, the tree view itself owns the focus instead.
  8. From the context menu select open with and then code browser. Shortly a dialog comes up asking if you want to analyse the binary. Choose yes.
  9. Find the analyze button in the option dialog that appears. You will need to get out from a Java table with Ctrl-tab. That table contains analysis options, but the defaults are avesome for this test (and otherwise too).
  10. Wait about at most a minute and NVDA will read the tittlebar of the code browser window. That means analysis is finished. Then press alt-f to bring up the menu to see what options you have. And then bang.

    Actual behavior:

    NVDA stops responding to anything and total silence takes place until you restart it. After hitting the restart keyboard shortcut NVDA quickly comes up talking again. Examine Ghidra windows at that point. There will be two of them: Ghidra main window and the code browser. You will find no objects in them with object nav except the close button. But it won't close if you activate that. You need to manually kill Ghidra. After killing it, it can be restarted with no problems.

    Expected behavior:

    No NVDA crash of course, this is way too troublesome.

    System configuration

    NVDA installed/portable/running from source:

    installed

    NVDA version:

    2020.3

    Windows version:

    Windows 10 64-bit 19041.746

    Name and version of other software in use when reproducing the issue:

    OpenJDK 11.0.9, Ghidra 9.2.1

    Other information about your system:

    4 gig RAM, 4 cores, very good memory statistics, very snappy and responsive system

    Other questions

    Does the issue still occur after restarting your computer?

    yes

    Have you tried any other versions of NVDA? If so, please report their behaviors.

    no

    If addons are disabled, is your problem still occuring?

    haven't tried, but I don't think this is applicable here as my installed addons don't touch Java at all

    Did you try to run the COM registry fixing tool in NVDA menu / tools?

    not applicable

    Further remarks

    This is not the only way to trigger the exact same crash in the code browser window. Note that object nav works okay and does not bring NVDA down, only certain interactions with the controls. Other crash sources are arrowing down in the symbol tree and in the listing panel (currently inaccessible). Multiple arrow down keys are needed to bring NVDA to its knees in them, but bringing up the menu instantly "slaughters" NVDA.

    How do I know that this is screen reader specific?

    There is a very hackish way to stop and restart NVDA Java support at will while running. In NVDA Python console do import JABHandler JABHandler.terminate() to stop Java support and JABHandler.initialize() to reenable it. While off you can tab and arrow around in code browser window at will, it won't bring anything down. With java-ferret tool I tested that all controls treat and fire events appropriately. After reenabling crash may be few keystrokes away again.

    Bug hunt efforts and my hints at diagnosis

    This is very tough. No exceptions, no logs whatsoever. With frida instrumentation tool I dumped the interaction between NVDA and windowsaccessbridge-32.dll multiple times in normal and crashy situations. No apparently visible hints or problem spots or erroneous patterns were identified this way. I saw that the last call to the dll was either getTopLevelObject or releaseObject, the latter being more frequent. I dug deeply into NVDA Java support code and also the ui code of Ghidra, but I could not identify anything definite. My hint at diagnosis is that something makes the Java-side of access bridge to stop responding to events sent by windowsaccessbridge-32.dll and that causes the dll to block in waiting for response and that cripples NVDA. As NVDA restart does not cure the situation on Java side (see behaviour), that points to this scenario.

tyrylu commented 3 years ago

In addition, it appears that the bug is Java version dependent, i was not able to reproduce this with 11.0.7 and NVDA Alpha 21668.

Palacee-hun commented 3 years ago

@tyrylu exactly which version of 11.0.7 have you tested with? Because in Adopt Openjdk archive (https://adoptopenjdk.net/archive.html) there are three of them: 11.0.7+10.2 (23 April 2020), 11.0.7+10.1 (17 April 2020) and 11.0.7+10 (15 April 2020). I want to also test with 11.0.7, but with the exact same version you did. For the record the exact Java version I ran Ghidra with is AdoptOpenJdk build of OpenJdk 11.0.9+11 (23 October 2020)

tyrylu commented 3 years ago

This is the output of java -version:

java version "11.0.7" 2020-04-14 LTS
Java(TM) SE Runtime Environment 18.9 (build 11.0.7+8-LTS)
Java HotSpot(TM) 64-Bit Server VM 18.9 (build 11.0.7+8-LTS, mixed mode)
Palacee-hun commented 3 years ago

Thanks. That's an Oracle build then. Unfortunately I couldn't find that exact version in Oracle archives any more. So I grabbed OpenJdk 11.0.7+10 (15 April 2020) instead. As from Java 11 onwards Oracle and OpenJdk builds should be very similar, that is the closest I could get. Sad news is that I can easily crash NVDA under Ghidra code browser ui with this version also. To exclude further factors in bug hunt: The crash situation I gave the original steps to reproduce is dependent on the last widget active in code browser ui in the last Ghidra session, because Ghidra remembers that and focuses that on reopening. E.g. if it was the program tree then it won't crash with the original steps (that's why I can do so many experiments at all, otherwise it would be even more hopeless). So if you fiddled with Ghidra before (as @tyrylu obviously did) you may not get it immediately.

Palacee-hun commented 3 years ago

But there are two important scenarios which unfortunately always, 100 % reliably produce the described crash for me with all three OpenJdk builds I tested with (11.0.7, 11.0.9, 11.0.9.1) independently of where I left off with Ghidra. @tyrylu please check these. I want to make sure that it's not my computer doing something weird (highly unlikely) and also want to know if an Oracle build behaves differently (possible). Thanks much. Crash 1:

  1. Analyze something in Ghidra. After it has finished, tab to the symbol tree in code browser. You will only recognise it if you arrow down on a tree view and hear "imports", "exports" etc. These are the symbol categories recognised.
  2. Arrow down till you hear the tree item "functions". That is the list of functions found by Ghidra. Try to expand it by pressing right arrow key.
  3. Immediate NVDA crash. Crash 2:
  4. Also after analyzing something in Ghidra, in code browser tab to the listing panel. As we know it will just say "panel", as it is currently inaccessible.
  5. Press some pgdn keys in the listing to scroll down some pages.
  6. Same NVDA crash. If I temporarily suspend NVDA Java support with that nasty Python console hack I described, these won't crash NVDA of course, and Ghidra keeps on operating normally (as seen with Java-ferret tool)
tyrylu commented 3 years ago

Hello, i was at least able to reproduce a NVDA freeze using the code browser panel method, there's not much interesting debug log output, after entering the panel, the log only contains:

DEBUG - NVDAObjects.JAB.JAB._get_states (12:05:43.216) - MainThread (16436):
states: enabled,focusable,visible,showing,opaque
DEBUG - NVDAObjects.JAB.JAB._get_states (12:05:43.218) - MainThread (16436):
states: enabled,focusable,visible,showing
DEBUG - NVDAObjects.JAB.JAB._get_states (12:05:43.219) - MainThread (16436):
states: enabled,focusable,visible,showing,opaque
DEBUG - NVDAObjects.JAB.JAB._get_states (12:05:43.223) - MainThread (16436):
states: enabled,focusable,visible,showing,opaque
DEBUG - NVDAObjects.JAB.JAB._get_states (12:05:43.224) - MainThread (16436):
states: enabled,focusable,visible,showing,opaque
DEBUG - NVDAObjects.JAB.JAB._get_states (12:05:43.226) - MainThread (16436):
states: enabled,focusable,visible,showing,opaque
DEBUG - NVDAObjects.JAB.JAB._get_states (12:05:43.226) - MainThread (16436):
states: enabled,focusable,visible,showing,opaque
DEBUG - NVDAObjects.NVDAObject._get_placeholder (12:05:43.232) - MainThread (16436):
Potential unimplemented child class: <NVDAObjects.JAB.JAB object at 0x007ABD10>

And to add to the weirdness, a NVDA restart does not fix everything, the UI seems to be frozen as well, e. g. it is necessary to kill the java process.

Palacee-hun commented 3 years ago

Thanks. That's it, exact same behaviour. I mention for the record that that listing panel is a subclass ofthe Sving class jPanel. It doesn't do anything extraordinary, just draws the text of the listing onto the panel with Java awt calls (mostly drawString) in the custom visual layout and formatting needed for Ghidra. That's why they chose that control type. I mention also that as I began to add in neccessary accessibility properties in my Ghidra copy, they indeed work as documented, and NVDA announces more and more things, but the crashes remain the same. I've never ever experienced anything like this in the 10 years I've been working with this excellent screen reader. This is a real toughie. Now I am almost confident that this is a Java-side problem, and lies somewhere in the deep layers of Java awt implementation. I have some more ideas to test, but it will be no child play to ever uncover its real origin.

Adriani90 commented 3 years ago

cc: @feerrenrut

lukaszgo1 commented 3 years ago

Have you tested this particular scenario wit JAWS just to make sure that it is a NVDA problem rather than an Java Access Bridge one?

Palacee-hun commented 3 years ago

No, because I haven't used Jaws for more than 10 years now and I'd rather keep that away from my computers. Nevertheless this is a good idea, and a logical step in the bug hunt. so those who use Jaws regularly and are willing to try the crash scenarios, please do so and please comment your findings.

Palacee-hun commented 3 years ago

With the help of the very powerful Frida instrumentation tool I managed to peek at the debug messages that the Java-side of the access bridge produces. And with those I managed to diagnose this group of various crashes. The root cause lies with a pathological scroll bar. It is a Java swing AccessibleJScrollbar instance with most parameters okay, however its index in accessible parent is reported as -1 (a totally out-of-order value as it should be 0-based), and it has some serious problems with its accessible parenting. The course that leads to the crashes is as follows:

Version Information: Java virtual machine version: 11.0.7 Access Bridge Java class version: 11.0.7 Access Bridge Java DLL version: 11.0.7 Access Bridge Windows DLL version: 11.0.7

AccessibleContext information: Name:
Virtual Name:
Description:
Role: scroll bar Role in en_US locale: scroll bar States: enabled,focusable,visible,showing,opaque,vertical States in en_US locale: enabled,focusable,visible,showing,opaque,vertical Index in parent: -1 Children count: 2 Bounding rectangle: [530, 159, 547, 729] ERROR: GetAccessibleContextInfo failed for top-level window


There are some fields after this, but they are pointless as the freeze intervened in normal querying.
From this I know that any screen reader will be affected by this, unless it implements extra sanity checks for such objects with "bad lineage". My proposition for a quick fix in NVDA for all this is to avoid getting the  parent accessible context of Java objects which have a negative index in accessible parent property either directly or indirectly (e.g. by calling getTopLevelObject on them) and to treat them as top level objects with no parents.
Palacee-hun commented 3 years ago

The following two simple fixes in source\JABHandler.py completely fix all these crashes. I extensively tested them, experienced no side-effects with other Java apps and regressions because of these fixes are very very improbable:

def internal_getWindowHandleFromAccContext(vmID,accContext):
    try:
# 12045: some broken accessible objects contradict themselves by reporting no parent
# (indexInParent = -1) and also reporting themselves as parents
# do not try to call getTopLevelObject on such objects, as this would create a Java-side endless
# loop, but get native window handle from them directly
        info=AccessibleContextInfo()
        bridgeDll.getAccessibleContextInfo(vmID,accContext,byref(info))
        if info.indexInParent < 0:
            return bridgeDll.getHWNDFromAccessibleContext(vmID,accContext)
        topAC=bridgeDll.getTopLevelObject(vmID,accContext)
...
    def getAccessibleParentFromContext(self):
        accContext=bridgeDll.getAccessibleParentFromContext(self.vmID,self.accContext)
# 12045: some broken objects report themselves as their own parent
# treat such objects as having no parent to prevent recovering from pointless
# recursions, broken object navigation etc.
        if accContext and not bridgeDll.isSameObject(self.vmID,self.accContext,accContext):
            return self.__class__(self.hwnd,self.vmID,accContext)
        else:
            return None
Palacee-hun commented 3 years ago

With most elaborate work I managed to track down the very core of these freezes. The NVDA workaround code I had shared above was very much needed for that as a frozen screen reader is not particularly useful for serious debugging purposes. This was a Ghidra bug. Technically one tiny method overrode AccessibleContext retrieval for a vertical scrollbar subclass with a specific visual appearance, but it did it with a mixed-up delegation pattern and that caused that scroll bar to report its parent erroneously as itself. Removing that single method fixed this all. The fix has been committed to Ghidra master. Nevertheless putting my workaround code into NVDA would do no harm and would only bring benefits: protection against such edge-cases.

lukaszgo1 commented 3 years ago

Nevertheless putting my workaround code into NVDA would do no harm and would only bring benefits: protection against such edge-cases.

Have you considered to submit a pr yourself?

Palacee-hun commented 3 years ago

No and I won't do that myself. Reasons are simple: I have absolutely no experience with that. I've never ever made a pull request. I've read through many tutorials, but somehow I couldn't understand the Github "bureaucracy" with pull requests and why so many steps are needed just to put a very tiny fix in place. I am very strong at tracking down and fixing very hard to find bugs (like this) even in extreme conditions (like when the screen reader crashes) and that is a huge contribution. There are people here who make pull requests daily. I don't want to mess with NVDA with my first ever pull request.

feerrenrut commented 3 years ago

It's important that someone who truly understands the problem and required change creates the Pull Request, since it is the mechanism for communicating the intent, reasoning and approach to the rest of the developers on the project.

I'd be happy to provide guidance with creating the PR if you would like. The process is very safe, and the community is friendly.

MarcoZehe commented 3 years ago

CC'ing @Zersiax because didn't you recently also experiment/fight with some JAVA-related NVDA crashes and freezes? Are they related maybe?

Palacee-hun commented 3 years ago

No, that was the only Java-related crash/freeze I've ever encountered with NVDA during the more than 10 years since I use NVDA exclusively.

zersiax commented 3 years ago

@MarcoZehe you may be onto something there. My issue was with a control that wasn't supposed to be clickable receiving an NVDA+enter , which in Java apps I believe sends a left click. I never dug into it as deeply as the OP has, but it is very possible that that control had a negative index in some sense, or was the child of a parent that did. As for fixing this, while I agree with the fact that a PR should be submitted by the original poster, I also recognize this patch has been sitting here in the comments for the last 2 months. Has a PR been created for this issue? If not, how can we move forward in getting the supplied patch tested/integrated?

Palacee-hun commented 3 years ago

In the interest of the NVDA community I kindly, but definitely ask that someone who has a daily routine of creating pull requests submit one with my clearly commented patch above if it seems useful, because:

  1. I find it quite strange that I and only I should create a pull request for the issue I have reported and tracked down in its entirety. I haven't ever met this rigorous practice at any other repo I've ever submitted an issue for. At other repos when I had clearly pointed out what to fix the pr was created within hours, at most days and the issue usually got closed as well without my having to ask anything. This was the case at Ghidra folks with the Java part of this issue too (closed months ago).
  2. The community should be much more lenient in these times we are through now. I am far from my "regular" mental shape due to hardships caused by lockdown regulations in my country during the second and third waves of the Covid pandemic (during the last nine months). Much simpler tasks cause significant trouble now than learning and experimenting through my first ever Github pull request - an overcomplicated process. Our extraordinary times do not require rigid rules in some cases, but solidarity instead and flexibility.
XLTechie commented 3 years ago

It is not about rigid rules, as you say. It is a practical matter that very few other people can test your fixes, or know what to look for, as well as you do; and I suspect you are one of a very small number currently using this software. Keep in mind that generating a pull request does not do anything to the NVDA source. It will be reviewed by one or more core developers before anything happens with it. You can't do any harm here. An executable will be generated in which your fix can be tested (by you and others).

The process of making a pull request, with all your experience, should be simple. It only looks difficult and complex until you try it. Here are some steps you can follow.

  1. After you log into your GitHub account, go to: https://github.com/nvaccess/nvda
  2. Select fork, and fork the repository into your account.
  3. On your local machine where you have installed git, run: git clone https://github.com/Palacee-hun/nvda. You can also use the following, if you have uploaded an ssh key to GitHub: git clone ssh://git@github.com/Palacee-hun/nvda.
  4. cd nvda
  5. git submodule update --init
  6. git remote add nvaccess https://github.com/nvaccess/nvda
  7. git fetch nvaccess
  8. git checkout master
  9. git branch -u nvaccess/master
  10. git pull
  11. git checkout -b javaFix
  12. You are now in a branch called javaFix. You could call that anything you want, this is just an example. Do your file edits here, fixing whatever you need to with your code above.
  13. When you have finished your edits, run git commit -a
  14. An editor will open in which you can describe what you have done and why, in a git commit. Use a one line summary at the top, and add additional details below if needed. After you close the editor, continue below.
  15. Run git push -u origin/javaFix. If you didn't use "javaFix" as your branch name, change it here.
  16. The output of that last git command should provide a URL which you can use to make a pull request (it will be on one of the last few "remote" lines). Go to that URL.
  17. That URL should have already configured the pull request for you. Edit the title as needed.
  18. Edit the body comment of the pull request, filling out the template provided.
  19. Below the "submit pull request" button there is a menu. Open that menu and choose "Draft pull request".
  20. Above the menu the button should have changed to "Create draft pull request" or something similar. Push that button.

That should have created a pull request. We can guide you further from there.

Note: the first several steps of the above (before 11) are taken more or less directly from the NVDA Wiki page on Contributing.

XLTechie commented 3 years ago

Also, in your "patch" code (not an actual appliable patch), you need to indent your comment lines to the same level as your code. In other words, the comments in the first function need to be indented one tab, and the comments in the second function need to be indented two tabs.

Palacee-hun commented 3 years ago

Great guide for making a pull request, thanks much, I may find good use of that info somewhen ... Definitely not appetizing though lol Nevertheless totally independent of this I've decided to close this issue. As it turned out this was not a NVDA bug, but a Java bug in Ghidra. Ghidra folks fixed that after my diagnosis, thorough testing of the fix candidate and finalizing my fix proposal. The fix is in Ghidra 10.0 beta according to changelog and that version is already downloadable. So this issue is properly dealt with and gone. All other Java issues with NVDA should be discussed in separate issues.