nvaccess / nvda

NVDA, the free and open source Screen Reader for Microsoft Windows
https://www.nvaccess.org/
Other
2.13k stars 639 forks source link

NVDA freeze while reloading the configuration #14872

Open CyrilleB79 opened 1 year ago

CyrilleB79 commented 1 year ago

here is a freeze that occurred when reloading the configuration. This issue does not occur each time.

Steps to reproduce:

Press NVDA+control+R

Actual behavior:

NVDA freezes. Need to be relaunched with Kill NVDA add-on.

Expected behavior:

No freeze

NVDA logs, crash dumps and other attachments:

Piece of log while it is frozen:

Python stack for thread 3940 (hwIo.ioThread.IoThread):
  File "threading.pyc", line 890, in _bootstrap
  File "threading.pyc", line 926, in _bootstrap_inner
  File "hwIo\ioThread.pyc", line 91, in run
  File "hwIo\ioThread.pyc", line 69, in apc
  File "braille.pyc", line 2571, in _bgThreadExecutor
  File "brailleDisplayDrivers\eurobraille.pyc", line 409, in display
  File "brailleDisplayDrivers\eurobraille.pyc", line 386, in _sendPacket
  File "brailleDisplayDrivers\eurobraille.pyc", line 402, in _sendHidPacket
  File "hwIo\base.pyc", line 120, in write

Python stack for thread 14852 (MainThread):
  File "nvda.pyw", line 427, in <module>
  File "core.pyc", line 756, in main
  File "wx\core.pyc", line 2237, in MainLoop
  File "gui\__init__.pyc", line 768, in Notify
  File "core.pyc", line 711, in run
  File "queueHandler.pyc", line 88, in pumpAll
  File "queueHandler.pyc", line 55, in flushQueue
  File "core.pyc", line 220, in resetConfiguration
  File "braille.pyc", line 2637, in terminate
  File "braille.pyc", line 1990, in terminate
  File "brailleDisplayDrivers\eurobraille.pyc", line 225, in terminate
  File "braille.pyc", line 2728, in terminate
  File "brailleDisplayDrivers\eurobraille.pyc", line 409, in display
  File "brailleDisplayDrivers\eurobraille.pyc", line 386, in _sendPacket
  File "brailleDisplayDrivers\eurobraille.pyc", line 402, in _sendHidPacket
  File "hwIo\base.pyc", line 120, in write

I do not know the braille code at all. But having two threads trying to send data at the same time does not seem a good idea.

System configuration

NVDA installed/portable/running from source:

Installed

NVDA version:

2023.1

Windows version:

Windows 10 21H2 (AMD64) build 19044.2846

Name and version of other software in use when reproducing the issue:

N/A

Other information about your system:

N/A

Other questions

Does the issue still occur after restarting your computer?

I do not know since this issue cannot be reproduced each time.

Have you tried any other versions of NVDA? If so, please report their behaviors.

No

If NVDA add-ons are disabled, is your problem still occurring?

Not tested

Does the issue still occur after you run the COM Registration Fixing Tool in NVDA's tools menu?

Not tested; probably not related

CyrilleB79 commented 1 year ago

Cc @leonardder

Brian1Gaff commented 1 year ago

I noticed this sometimes as well, it can happen after an add on update or an nvda update, but in the main a complete computer restart seems to be able to fix it for a time. As you say, its random, and in most cases seems to afflict portable versions more than installed ones. I was musing it might be Windows holding onto data rather than updating it. Brian

-- @. Sent via blueyonder.(Virgin media) Please address personal E-mail to:- @., putting 'Brian Gaff' in the display name field. ----- Original Message ----- From: "Cyrille Bougot" @.> To: "nvaccess/nvda" @.> Cc: "Subscribed" @.***> Sent: Tuesday, April 25, 2023 1:13 PM Subject: [nvaccess/nvda] NVDA freeze while reloading the configuration (Issue #14872)

here is a freeze that occurred when reloading the configuration. This issue does not occur each time.

Steps to reproduce:

Press NVDA+control+R

Actual behavior:

NVDA freezes. Need to be relaunched with Kill NVDA add-on.

Expected behavior:

No freeze

NVDA logs, crash dumps and other attachments:

Piece of log while it is frozen:

Python stack for thread 3940 (hwIo.ioThread.IoThread):
 File "threading.pyc", line 890, in _bootstrap
 File "threading.pyc", line 926, in _bootstrap_inner
 File "hwIo\ioThread.pyc", line 91, in run
 File "hwIo\ioThread.pyc", line 69, in apc
 File "braille.pyc", line 2571, in _bgThreadExecutor
 File "brailleDisplayDrivers\eurobraille.pyc", line 409, in display
 File "brailleDisplayDrivers\eurobraille.pyc", line 386, in _sendPacket
 File "brailleDisplayDrivers\eurobraille.pyc", line 402, in _sendHidPacket
 File "hwIo\base.pyc", line 120, in write

Python stack for thread 14852 (MainThread):
 File "nvda.pyw", line 427, in <module>
 File "core.pyc", line 756, in main
 File "wx\core.pyc", line 2237, in MainLoop
 File "gui\__init__.pyc", line 768, in Notify
 File "core.pyc", line 711, in run
 File "queueHandler.pyc", line 88, in pumpAll
 File "queueHandler.pyc", line 55, in flushQueue
 File "core.pyc", line 220, in resetConfiguration
 File "braille.pyc", line 2637, in terminate
 File "braille.pyc", line 1990, in terminate
 File "brailleDisplayDrivers\eurobraille.pyc", line 225, in terminate
 File "braille.pyc", line 2728, in terminate
 File "brailleDisplayDrivers\eurobraille.pyc", line 409, in display
 File "brailleDisplayDrivers\eurobraille.pyc", line 386, in _sendPacket
 File "brailleDisplayDrivers\eurobraille.pyc", line 402, in _sendHidPacket
 File "hwIo\base.pyc", line 120, in write

I do not know the braille code at all. But having two threads trying to send data at the same time does not seem a good idea.

System configuration

NVDA installed/portable/running from source:

Installed

NVDA version:

2023.1

Windows version:

Windows 10 21H2 (AMD64) build 19044.2846

Name and version of other software in use when reproducing the issue:

N/A

Other information about your system:

N/A

Other questions

Does the issue still occur after restarting your computer?

I do not know since this issue cannot be reproduced each time.

Have you tried any other versions of NVDA? If so, please report their

behaviors. No

If NVDA add-ons are disabled, is your problem still occurring?

Not tested

Does the issue still occur after you run the COM Registration Fixing

Tool in NVDA's tools menu? Not tested; probably not related

-- Reply to this email directly or view it on GitHub: https://github.com/nvaccess/nvda/issues/14872 You are receiving this because you are subscribed to this thread.

Message ID: @.***>

seanbudd commented 1 year ago

There's been various fixes to hwIO since this was first reported. Is this still an issue on the latest alpha?

Possible fix: #14924

Brian1Gaff commented 1 year ago

It sometimes did this on Tuesdays download, but as my main computer has now died, I cannot test till a new motherboard is installed. Brian

@. Sent via blueyonder. Please address personal email to:- @., putting 'Brian Gaff' in the display name field. This message sent from a Windows XP machine! ----- Original Message ----- From: Sean Budd To: nvaccess/nvda Cc: Brian Gaff ; Comment Sent: Friday, June 23, 2023 6:42 AM Subject: Re: [nvaccess/nvda] NVDA freeze while reloading the configuration (Issue #14872)

There's been various fixes to hwIO since this was first reported. Is this still an issue on the latest alpha?

- Reply to this email directly, view it on GitHub, or unsubscribe. You are receiving this because you commented.Message ID: @.***>

CyrilleB79 commented 1 year ago

I have tried a quick test but braille does not seem to work on last alpha (portable) on my computer. It's an Esys80 (Eurobraille). @leonardder any idea? Have you by chance tested this type of material?

seanbudd commented 1 year ago

Hi @CyrilleB79 - can you test with this build and see if you can still reproduce the issue?

I'm also concerned about the fact that braille is not working for your device and alpha, do you think you can find the git blame for that issue?

https://ci.appveyor.com/api/buildjobs/gofnn1vdjb44d9h1/artifacts/output%2Fnvda_snapshot_try-fix-14872-28527%2C83b07642.exe

CyrilleB79 commented 1 year ago

@seanbudd, I have opened #15079 and identified the commits that have caused my braille display not to work anymore.

Once #15079 is fixed, I may try again to reproduce the current issue.

seanbudd commented 1 year ago

I've built another try build - Cyrille would you be able to let us know if this build or the latest alpha solves your issue

https://ci.appveyor.com/api/buildjobs/p9mh3h1s67rdphr7/artifacts/output%2Fnvda_snapshot_try-fix-14872-28585%2C0c25e80e.exe

CyrilleB79 commented 1 year ago

I have retested with nvda_snapshot_alpha-28584,5c4e6e1f.exe (temp version from launcher, add-ons disabled).

I have succeeded in reproducing the issue. It's not easy to reproduce it however; many attempts needed. Here is the log: nvda-old.log

seanbudd commented 1 year ago

Ah, that's unsurprising. Is it an issue with the try-build though?

CyrilleB79 commented 1 year ago

Tested with nvda_snapshot_try-fix-14872-28585,0c25e80e.exe..

I have tried many times to reproduce the issue (since it only occurs rarely). Two times I have had a freeze or crash when reloading the config but no stack states logged in the log.

Finally, I have reproduced the issue as follows:

NVDA crashes with the following log: nvda.log

No dumpfile is generated.

seanbudd commented 1 year ago

Hi @CyrilleB79 thanks for testing the previous build. It seems that that approach failed, causing a deadlock between the 2 threads trying to acquire a lock to write directly to the display.

I've tried another approach, instead of trying to directly write during a reset, instead we clear the braille buffer and update. https://ci.appveyor.com/api/buildjobs/gk99tetglcqib866/artifacts/output%2Fnvda_snapshot_try-fix-14872-28669%2C1d96df95.exe

Could you test this approach instead?

CyrilleB79 commented 1 year ago

OK. I will be able to test only next week at the earliest.

CyrilleB79 commented 1 year ago

With nvda_snapshot_try-fix-14872-28669,1d96df95.exe, I have tried to reset the config a few times pressing NVDA+control+R.

The penultimate time I have pressed NVDA+control+R, I got this error in the log:

ERROR - braille.BrailleDisplayDriver.terminate (10:37:48.903) - ThreadPoolExecutor-2_0 (13396):
Display driver BrailleDisplayDriver('noBraille', numCells=0) failed to display while terminating.
Traceback (most recent call last):
  File "braille.pyc", line 2819, in terminate
AttributeError: 'NoneType' object has no attribute 'buffer'

The last time I have pressed NVDA+control+R, NVDA crashed with the Windows dialog "NVDA application has stopped working".

Here is the full log: nvda_crash.log

CyrilleB79 commented 1 year ago

@seanbudd, why has this been postponed to 2023.3?

NVDA freezing is quite concerning, especially for a newly introduced feature (braille rework). Or do you think it is not linked to the recent rework?

@leonardder any opinion on this?

Since we can have two threads able to freeze NVDA due to concurrent write requests that may occur when reloading the

LeonarddeR commented 1 year ago

I think for this to be fixed properly, we really need to know what commit introduced the issue. As for the postponement to 2023.3, this issue isn't a regression in the current release, so I guess that's the rationale. Note that writing to a display from multiple threads shouldn't be a real concern as I/O is supposed to be threadsafe.

It looks like there is no longer a terminate timeout when terminating the bg thread. I think I added that timeout back in the far past because braille display auto detection was also using the thread. Since BD Detect no longer uses the thread, I removed the timeout again just as before. We could reintroduce the timeout, but forcefully killing a thread that isn't responding doesn't feel like the right thing to do.

seanbudd commented 1 year ago

Yes - this isn't a regression. It would be a nice to have fix, but it is not going to block the 2023.2 release.

LeonarddeR commented 1 year ago

@CyrilleB79 Could you reproduce this after first executing import NVDAState; NVDAState._allowDeprecatedAPI = lambda: False at the Python console?

CyrilleB79 commented 1 year ago

@CyrilleB79 Could you reproduce this after first executing import NVDAState; NVDAState._allowDeprecatedAPI = lambda: False at the Python console?

According to NVDAState._allowDeprecatedAPI's docstring:

Note that deprecated code may be imported at runtime, and as such, this value cannot be changed at runtime to test compliance.

So I should probably modify the return of this function and run from source, rather modifying it at runtime in the console, shouldn't I? Indeed, many code instructions are evaluated when modules are first imported during NVDA startup, so before I have a chance to modify the function in the console...

LeonarddeR commented 1 year ago

Good point actually. I think from hwIo import ioThread; ioThread._apcsWillBeStronglyReferenced = False should work at runtime though. It should weakly reference every new APC. That said, I think #15232 is more likely to fix this.

CyrilleB79 commented 1 year ago

I have tried with last beta branch (commit e15fe8ab11d6583c3e8d82809305170b2eb3790b), run from source and forcing NVDAState._allowDeprecatedAPI to return False. No more success, i.e. reloading the config sometimes works, sometimes crashes Python.

Same unsuccessful result with the following modification in hwIo.ioThread.py:

-_apcsWillBeStronglyReferenced = version_year < 2024 and NVDAState._allowDeprecatedAPI()
+_apcsWillBeStronglyReferenced = False

At last, tried this modification on top of #15232 (commit 5b39cd8d69b05665205bfa698709175445cff286), run from source; no change, i.e. resetting the config sometimes works, sometimes crashes Python.

With this last config, I have also got once the following garbage handler error at the end of a successful config reset:

INFO - core.resetConfiguration (10:17:17.791) - MainThread (1280):
Reverted to saved configuration
IO - speech.speech.speak (10:17:17.806) - MainThread (1280):
Speaking ['Configuration appliquée']

WARNING - garbageHandler.notifyObjectDeletion (10:17:17.846) - ThreadPoolExecutor-3_0 (6252):
Garbage collector has found one or more unreachable objects. See further warnings for specific objects.
Stack trace:
  File "C:\Users\cb232690\AppData\Local\Programs\Python\Python37-32\lib\threading.py", line 890, in _bootstrap
    self._bootstrap_inner()
  File "C:\Users\cb232690\AppData\Local\Programs\Python\Python37-32\lib\threading.py", line 926, in _bootstrap_inner
    self.run()
  File "C:\Users\cb232690\AppData\Local\Programs\Python\Python37-32\lib\threading.py", line 870, in run
    self._target(*self._args, **self._kwargs)
  File "C:\Users\cb232690\AppData\Local\Programs\Python\Python37-32\lib\concurrent\futures\thread.py", line 80, in _worker
    work_item.run()
  File "C:\Users\cb232690\AppData\Local\Programs\Python\Python37-32\lib\concurrent\futures\thread.py", line 57, in run
    result = self.fn(*self.args, **self.kwargs)
  File "bdDetect.py", line 399, in _bgScan
    for driver, match in iterator:
  File "extensionPoints\__init__.py", line 274, in iter
    for value in iterable:
  File "bdDetect.py", line 347, in _bgScanUsb
    for driver, match in getDriversForConnectedUsbDevices():
  File "bdDetect.py", line 162, in getDriversForConnectedUsbDevices
    for port in deviceInfoFetcher.hidDevices
  File "baseObject.py", line 62, in __get__
    return instance._getPropertyViaCache(self.fget)
  File "baseObject.py", line 168, in _getPropertyViaCache
    val=getterMethod(self)
  File "bdDetect.py", line 282, in _get_hidDevices
    return list(hwPortUtils.listHidDevices(onlyAvailable=True))
  File "hwPortUtils.py", line 537, in listHidDevices
    class SP_DEVICE_INTERFACE_DETAIL_DATA_W(ctypes.Structure):
  File "hwPortUtils.py", line 540, in SP_DEVICE_INTERFACE_DETAIL_DATA_W
    ('DevicePath', WCHAR*(dwNeeded.value - ctypes.sizeof(DWORD))),
  File "garbageHandler.py", line 28, in __del__
    notifyObjectDeletion(self)
  File "garbageHandler.py", line 69, in notifyObjectDeletion
    stack_info=True
WARNING - garbageHandler.notifyObjectDeletion (10:17:17.846) - ThreadPoolExecutor-3_0 (6252):
Deleting unreachable object <braille.BrailleHandler object at 0x0E02FD50>
WARNING - garbageHandler.notifyObjectDeletion (10:17:17.846) - ThreadPoolExecutor-3_0 (6252):
Deleting unreachable object <braille.BrailleBuffer object at 0x0E02FD10>
WARNING - garbageHandler.notifyObjectDeletion (10:17:17.846) - ThreadPoolExecutor-3_0 (6252):
Deleting unreachable object <braille.BrailleBuffer object at 0x0E02FE90>
WARNING - garbageHandler.notifyObjectDeletion (10:17:17.846) - ThreadPoolExecutor-3_0 (6252):
Deleting unreachable object <winConsoleHandler.WinConsoleTextInfo object at 0x0E058B70>
ERROR - garbageHandler._collectionCallback (10:17:17.846) - ThreadPoolExecutor-3_0 (6252):
Found at least 4 unreachable objects in run

I do not know if it is of any use.

seanbudd commented 11 months ago

@CyrilleB79 - could you test this try build?

It avoids calling HidD_FreePreparsedData for eurobraille devices. This might introduce new issues, I am just hoping to confirm that this is the cause of the freeze.

CyrilleB79 commented 11 months ago

@seanbudd, here is the log: nvda_2023-12-19_09-57-30.log

Reinitializing the configuration works many times. And the last time, it crashes NVDA with two Windows Error (app stopped working) windows:

Unless of course if one of these two windows was one of my previous attempts that I had forgotten to close in the background.

seanbudd commented 10 months ago

I think this means the HIDFreeParsedData call is not the cause here. I think we need to do further investigation and potentially an approach like #15849

I'm going to bump this to the next milestone as getting an approach for this release seems unlikely.

seanbudd commented 7 months ago

The priority here has been lowered as only one person has been able to reproduce this

Adriani90 commented 7 months ago

This might be reproducible only with a braille display connected. cc: @bramd, @dkager, @aaclause, @burmancomp, @irrah68, @Jykke67 can you please test if you can reproduce this freezein NVDA 2024.1 or last alpha?

burmancomp commented 7 months ago

I tried several times to reload configuration, and NVDA (source based on current main branch) did not freeze.

CyrilleB79 commented 7 months ago

@burmancomp wrote:

I tried several times to reload configuration, and NVDA (source based on current main branch) did not freeze.

Did you try with Eurobraille display or another one?

burmancomp commented 7 months ago

Did you try with Eurobraille display or another one?

With caiku albatross.

irrah68 commented 7 months ago

I pressed NVDA+Ctrl+R several times without any problems. I'm using Braillex EL 80c Braille display.

Jykke67 commented 7 months ago

I did also press NVDA+Ctrl+R several times without any problems.

CyrilleB79 commented 7 months ago

@Jykke67, w wrote:

I did also press NVDA+Ctrl+R several times without any problems.

With which braille display?

Since the issue seems to be located in the Eurobraille driver, maybe it's the only model that is impacted.

If anybody having an Eurobraille display could tests, it would be helpful.

Jykke67 commented 7 months ago

Sorry, I did forget. My braille display is Braille Star 80.