nvaccess / nvda

NVDA, the free and open source Screen Reader for Microsoft Windows
https://www.nvaccess.org/

NVDA's late response speech in letter navigation #10971

Closed SeanTolstoyevski closed 1 year ago

SeanTolstoyevski commented 4 years ago

Hello, I have been following NVDA's transition to python3 for a long time. Meanwhile, I encountered a problem. I do not know what kind of analysis I will present on this topic. But with the same settings, NVDA 2019.2 reacts faster. When I move around the screen by pressing any letter, NVDA waits too long to speak. I think there is an estimated 200 ms delay.

Steps to reproduce:

For example, let's imagine that we are navigating on the desktop

  1. Windows+M = Folder View, or another folder

Our sample items are:

Actual behavior:

  1. Let's navigate between the items here by pressing the first letters of the items. Example: press g: google chrome, after press: github desktop

Result: Compared to the python 2 version of NVDA, 2019.3 seems to have a late response in letter navigation.

Expected behavior:

System configuration

NVDA installed/portable/running from source:

NVDA version:

Problem: 2019.3

Windows version:

windows 1903, 64bit, 18358.1

Name and version of other software in use when reproducing the issue:

IBMTTS: IBMTTS driver; Status: Enabled; Version: 19.8B4; Author: David CM dhf360@gmail.com and others. I tested it with other synthesizers. No problem is occurring.

Other information about your system:

There is an SSD. No problem from CPU, RAM and SSD.

Other questions

Does the issue still occur after restarting your computer?

yes

Have you tried any other versions of NVDA? If so, please report their behaviors.

2019.2: no problem, fast response.

If addons are disabled, is your problem still occurring?

yes

Did you try to run the COM registry fixing tool in NVDA menu / tools?

yes

Notes!

There may be places I am confused about English. My main language is not English. Please pay attention to plain writing when answering me. Sometimes I can't understand you. Because technical problem.

I can write other information if you want. Maybe there may be something I forgot.

And this problem is very frustrating. I hope we can find a way to fix it.

lukaszgo1 commented 4 years ago

To be absolutely clear, this delay occurs only when you are using IBMTTS as your speech synthesizer, right? If so, this should be closed as invalid and reported against https://github.com/davidacm/NVDA-IBMTTS-Driver/ instead. Also, you've stated that this occurs when add-ons are disabled, yet the IBMTTS driver is an add-on, so these statements are contradictory.

SeanTolstoyevski commented 4 years ago

Hi @lukaszgo1, no, the reason for the delay is not IBMTTS. IBMTTS is working properly. This problem exists only in Espeak.

The plugin example there was to report that there was no such problem with other synthesizers.

Do not close the issue.

Adriani90 commented 4 years ago

I cannot reproduce this with NVDA 2019.3.1 on Windows 10 1909. Could you please uninstall NVDA, delete the configuration folder in %appdata% and install NVDA again?

SeanTolstoyevski commented 4 years ago

hi @Adriani90 ,

I thought about this situation. I also tested it in alpha versions where no add-ons were installed. The problem persists.

I just launched NVDA 2019.2 from the portable version. It definitely responds faster.

I recommend you to run the 2019.2 version to see the difference.

I do not have enough technical knowledge about the problem. If you wish, we can try something like Remote connection.

SeanTolstoyevski commented 4 years ago

In order to be able to compare, I will make an audio recording of navigation in the 2019.3 version, where I have the problem, and in the 2019.2 version.

I will upload it to Google Drive.

We can analyze better by listening.

SeanTolstoyevski commented 4 years ago

hi @Adriani90 and @lukaszgo1 ,

Record link: https://drive.google.com/open?id=1fsXHHsU4voqo9NJ1H7KR_NxqmhXFw6hO

The file is ready.

Speaking at the beginning, NVDA version 2019.3.1. Version 2019.3.1: Started: 00:04 End: 00:27 I'm navigating the desktop.

Then I run the portable 2019.2 version. Version 2019.2: Start: 00:29 End: 00:51 I'm navigating the desktop.

I started the 2019.3.1 version installed on the system again so that it can be better understood.

I hope you could understand the difference.

josephsl commented 4 years ago

Hi, can you try using 2019.3.1 as a portable copy for sake of completeness? Thanks.

SeanTolstoyevski commented 4 years ago

hi @josephsl

Portable, source or installed. They all have the same delay.

There is no delay in 2019.2.

josephsl commented 4 years ago

Hi, I see. Is 2020.1 beta also affected when it comes to speech and braille delay? Thanks.

SeanTolstoyevski commented 4 years ago

hi @josephsl

INFO - main (19:14:15.853) - MainThread (5728): Starting NVDA version 2020.1beta1

Yes, I tested the beta. No braille screen. No test.

Delay in letter navigation is evident.

Did you compare 2019.2 with 2019.3 in the record I sent? It's 100-200 milliseconds between the moment I press the letter and NVDA saying the new item.

I'm not sure about the numbers. But I'm sure it's more than 2019.2.

josephsl commented 4 years ago

Hi, I’m reviewing it. CC @LeonardDer in case he has any insights.

josephsl commented 4 years ago

Hi,

Hmmm, you say you are using build 18358? Unless that's a typo, that's NOT Version 1903 - the actual build is 18362.

Thanks.

josephsl commented 4 years ago

Hi,

I'm afraid I cannot provide a definitive answer as to which NVDA version responds better: settings are different. In order to provide a more fair comparison, ALL NVDA versions you are testing must have same settings. Since they do not (screen curtain on/off, different voice rate), coupled with inconsistent statements about add-ons, I'm afraid I cannot help you much.

I know that my statement can sound really bad at this time. One of the things developers look for when looking at possible regressions and bugs is a reliable way to reproduce them, and one way to do this is same settings across different NVDA versions (all of them using portable copy, all of them using same speech synthesizer, all of them with no screen curtain on, all of them using same voice rate, all of them with no add-ons or add-ons disabled). Once that's done, then we can compare responsiveness of NVDA to see if this is something to do with NVDA, Python version, synthesizer code, or something else.

In more technical terms: essentially, what I'm trying to say is eliminating possible variables that can interfere with our investigation. In the past, it was noted that an issue had to do with an add-on, or caused by different settings. Only after all NVDA versions under testing are identical in terms of user environment (settings, add-ons, dictionary entries, etc.) then we can start asking ourselves if this is truly something to do with Python upgrade.

Another thing to note: NVDA 2019.3 introduced a whole bunch of changes regarding speech processing. If this issue is strictly due to Python upgrade, other parts of NVDA MUST stay the same (just saying "similar" doesn't help). Since NVDA 2019.3 introduced Python 3 and speech refactor, one or both, or something included in 2019.3 and later, or something entirely different are to blame (I believe). This is the reason for my explanation above regarding eliminating things that can interfere with this bug hunt.

Hope this helps. Thanks.

DrSooom commented 4 years ago

@SeanTolstoyevski: Please open the Windows-own Notepad and type some text. Navigate here letter by letter with the arrow keys. Any new delay here?

SeanTolstoyevski commented 4 years ago

hi @DrSooom

No, there is no delay when I scroll through the notepad with the arrow keys. This problem is only seen in letter navigation.

hi again @josephsl

I will apply what you say and add the .ini file. This will take a little longer. I will use the options you mentioned. I understand you. You don't want to reach a wrong conclusion. Please wait for me.

SeanTolstoyevski commented 4 years ago

Hi @josephsl & everyone

Made with default NVDA settings. The only version without delay: 2019.2 There is no plugin in any test.

The problem is not related to my version of Windows. On virtual machines, I also tried it with the latest version, 1909. I also get the same result on another laptop.

The only way to understand this is to quickly switch between the two versions. Otherwise we cannot notice the difference, because our ears get used to it.

josephsl commented 4 years ago

Hi,

Our next step will involve a bit of work to set up, but in the end will result in a debug log that tells us how long it takes for NVDA to process input. For each version of NVDA, do:

  1. Open NVDA menu (NVDA+N), go to Preferences, then Settings.
  2. Go to Advanced category.
  3. Check "I understand that changing these settings may cause NVDA to function incorrectly" checkbox.
  4. From debug logging section, from "Enabled logging categories" list, check "timeSinceInput" checkbox.
  5. Click OK.

Once all NVDA versions you have are set up like this, for each version, do:

  1. Start NVDA.
  2. Restart with debug logging (NVDA+Q for exit dialog, then select "restart with debug logging enabled").
  3. Perform keyboard input scenarios.
  4. Press NVDA+F1 to open log viewer and copy and paste everything as an attachment.
  5. Exit NVDA and move onto the next version.

You must do the same thing with each NVDA version, including keyboard input scenarios. That way we can easily compare things.

Thanks.

SeanTolstoyevski commented 4 years ago

Hi @josephsl & everyone,

These are the test results according to the steps Joseph said. I have no technical knowledge to understand these results. But I see that the NVDAs created with python 3 have higher delay numbers. Hopefully it benefits your business.

josephsl commented 4 years ago

Hi, I’ll take a look at the log in detail shortly, but first, I would like to strongly urge you to upgrade to build 18363 (again 18358 is NOT the build to be used, as that was a release candidate, not the official 1903/1909 build). As such, the build may not be as secure as newer ones as no patches were released for it. You can upgrade by using Windows 10 Upgrade Assistant for Version 1909. Thanks.

SeanTolstoyevski commented 4 years ago

I will not update Windows unless I have a big problem.

I don't think this issue has anything to do with the version of windows I use.

The same problem is seen on other machines and other versions of Windows 10.

Thank you for your suggestion, Joseph

josephsl commented 4 years ago

Hi, actually, with the build you’ve got, it IS a big problem: no security patch whatsoever, but build 18362 and later do have patches. Also, based on logs I have, I can see a small delay from 2019.3 onwards but not significant enough, as the delay can be attributed to internal things NVDA does when it receives input and generates speech utterances (asking Espeak to generate wave forms). If the difference is say, more than 100 milliseconds, then that’s a possible cause for concern. As I noted earlier, what makes it hard to pinpoint the issue is that 2019.3 includes Python 3.7 and speech refactor, making it harder to locate what’s up. Finally, you can’t use Windows OneCore with your computer because it can’t deal with an exception (according to the log, there is a Windows error causing OneCore to fail to load). Thanks.

SeanTolstoyevski commented 4 years ago

I removed oneCore.

The difference is 20 milliseconds, but it does make the difference.

You won't do anything about it, right?

if you are not going to do anything, you can close issue.

josephsl commented 4 years ago

Hi, I’m leaning towards closing this next week unless others say otherwise. Thanks.

SeanTolstoyevski commented 4 years ago

Hi @josephsl

There is one thing I forgot to say:

but not significant enough, as the delay can be attributed to internal things NVDA does when it receives input and generates speech utterances (asking Espeak to generate wave forms).

There have been no major changes in Espeak since the 2019.2 release. I don't think the reason for the delay is due to Espeak. I can install Espeak in 2019.2 version on NVDA with python 3 version. I don't think the delay will change.

I want to add something else. This delay is only seen in navigation. There is no delay caused by Espeak's process of producing sound waves. Because on web pages, folders, etc. no delay.

josephsl commented 4 years ago

Hi, this indicates something other than the synthesizer might be at fault. This may require deeper analysis such as customized builds and debug messages, something I don’t think we will be able to do without significant changes to how NVDA is built, namely moving back to Python 2.7 (by the way, Python 2.7 is now end of life). Thanks.

Adriani90 commented 4 years ago

@SeanTolstoyevski thanks for the information. To be sure that we understand your issue correctly: is the delay you notice between the key press and the reporting of the key pressed (i.e. g)? Or is it the delay between the reporting of the key pressed (for example g) and the name of the desktop element to which you navigate (for example Google Chrome)? If you say it works correctly with other synthesizers, then I think this is an eSpeak issue, but the difference in delay between 2019.3 and 2019.2 is really low. Could you test this on another machine with a newer version of Windows 10 (i.e. 1909)?

I think in 2019.3.1 there was some new stuff regarding pauses when rate boost is on. Could you please turn off rate boost and see if the issue is still occurring?

SeanTolstoyevski commented 4 years ago

Hi,

I am talking about the difference between the moment the letter is pressed and the saying of the new item. No delays for letters or large text.

I disagree with your opinion on Espeak. For example, texts consisting of 5000 characters are spoken without delay. Also, there is no major update on Espeak. There were no major updates between Espeak in 2019.2 and Espeak used in python 3. I follow the Espeak project.

I think the problem is the moment NVDA sends data to Espeak.

I've written before. I tried this situation on virtual machines, various computers and different versions of Windows. It has nothing to do with my Windows configuration and version of Windows. My sound card is also OK. Delay is felt in all of them.

It may seem that there is no difference between 20 and 40 milliseconds mathematically. However, this is very difficult for the ears who are used to fast response. A few of my friends did not move to 2019.3 for this reason.

2019.2 = Average 29 ms 2019.3 = average 70 ms

SeanTolstoyevski commented 4 years ago

Currently I have replaced Espeak files in 2019.3 with those in 2019.2. The delay continues.

josephsl commented 4 years ago

Hi,

For now, I'm closing this issue, as there is really nothing we can do apart from going back to Python 2.7 and rebuilding NVDA from source to find out what happened since then. Making matters more complicated, we have also moved on to Visual Studio 2019 and other dependencies have changed, so our options are limited at this point.

Thanks, and sorry for the inconvenience.

tspivey commented 4 years ago

This issue is very clearly reproducible, so shouldn't be closed. It's most likely caused by speech refactor. How do I know this? Because 2019.3 is noticeably slower than 2019.2. The description wasn't 100% clear, but listen to the recording and reproduce the example.

  1. Start NVDA 2019.2.1 with eSpeak, turn on character echo, find something on your desktop where you can quickly jump between the items with the same first letter, and press it.
  2. Do the same with 2019.3. It's noticeably slower between the time it reads the letter you typed and the item moved to.

My guess is that NVDA now has to wait until the speech is 100% finished and the index received before sending the next string on to be spoken. As a result, some very common operations are noticeably slower, including:

  1. Typing with character echo and moving.
  2. Switching applications and hearing the various pieces spoken, for example Untitled - Notepad, text editor edit multi line, blank.

josephsl commented 4 years ago

Hi,

Ah, that explains it... If it affects all synthesizers, then yes, it might be time to ask @MichaelDCurran for advice.

Thanks.

SeanTolstoyevski commented 4 years ago

hi @tspivey ,

It's most likely caused by speech refactor. How do I know this? Because 2019.3 is noticeably slower than 2019.2.

I absolutely agree with you.

I do not know. I do not think it is right to close this issue. I think screen readers should react as quickly as possible. I will continue to use 2019.2.

Good luck nvda team.

feerrenrut commented 4 years ago

First of all, thank you to everyone for putting so much time into trying to get to the bottom of this. To be clear, this process isn't doubting that the problem happens on your machine, but trying to narrow down the cause so we can fix it, or at least understand it.

I have noticed some differences in behavior with audio devices since 2019.3. Can we rule that out as the cause @SeanTolstoyevski? Please could you test with different audio devices? Perhaps with USB headphones if you have any, or with bluetooth headphones, compare that to the sound card in the machine. Please try this with and without some other music playing in the background. Background music will ensure that windows is not putting the device to sleep for any reason.

Make sure that you don't have Windows "power saving" or "battery saving" mode enabled, set your computer to "best performance".

I have tried the steps outlined by @tspivey but don't notice any difference in timing on my machine, I assumed that character echo meant "speak typed characters", please correct me if you meant something else.

I listened to the audio samples recorded, but there are some problems:

@SeanTolstoyevski If you have time for it, please record your demo again, using exactly the same configuration for both versions of NVDA and with a clear sound at the moment of the key press?

As already discussed, the "time since input" is not showing an obvious problem. It is worth noting that this is just the time from when NVDA gets the key down event until it speaks. The key down event from Windows comes with an event time (technical details to follow), so it might be worth seeing if the time elapsed can be calculated, to rule out any delay in NVDA getting this key down message. It may be that changes in Python's thread scheduling have impacted the priority of receiving these messages from the OS. If interested in investigating this line of thinking further, see kbd=KBDLLHOOKSTRUCT.from_address(lParam) in winInputHook.keyboardHook. There is a time member on KBDLLHOOKSTRUCT (see its docs), which is in milliseconds, and more information in the docs for GetMessageTime. I would suggest hacking builds of 2019.2.1 and 2020.1 to save this timestamp in a variable in winInputHook, and then in inputCore.InputManager.executeGesture, where we set self._lastInputTime, call GetMessageTime and subtract the first from the second to see how long it has been since the message was created. There is a note in the GetMessageTime docs that indicates there may be problems doing this; it's not clear that this warning still applies if there are significant delays (e.g. many milliseconds).
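The measurement suggested above could be sketched roughly as follows. This is only an illustration, not NVDA code: the structure layout mirrors the Win32 KBDLLHOOKSTRUCT, while the helper name elapsed_ms is hypothetical. It also assumes that the hook timestamp and the later tick value share the same millisecond time base, which would need to be checked against the Windows docs before relying on the numbers.

```python
import ctypes


class KBDLLHOOKSTRUCT(ctypes.Structure):
    """Layout of the Win32 low-level keyboard hook structure."""
    _fields_ = [
        ("vkCode", ctypes.c_uint32),
        ("scanCode", ctypes.c_uint32),
        ("flags", ctypes.c_uint32),
        ("time", ctypes.c_uint32),         # event timestamp in milliseconds
        ("dwExtraInfo", ctypes.c_size_t),  # pointer-sized extra data
    ]


def elapsed_ms(event_time, now):
    """Hypothetical helper: milliseconds between two 32-bit tick values.

    Masking to 32 bits keeps the result correct when the tick counter
    wraps around between the two readings.
    """
    return (now - event_time) & 0xFFFFFFFF


# Stand-in for what the hook would read via KBDLLHOOKSTRUCT.from_address(lParam).
kbd = KBDLLHOOKSTRUCT(vkCode=0x47, time=1000)

# 'now' is a made-up value here; in a hacked build it would be a tick value
# read at the point where executeGesture records the input time.
delay = elapsed_ms(kbd.time, 1069)
print(f"{delay} ms between key event creation and gesture execution")
```

Logging this value next to the existing "sec since input" entries in both versions would show whether extra time is being lost before NVDA even sees the key press.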

SeanTolstoyevski commented 4 years ago

Hello, on two laptops, I tested in Windows 1909 and 1903, with headphones and without headphones, with untouched performance settings and with best performance settings, even with settings where Defender wouldn't affect NVDA (i.e. turned off), with music playing in the background and without. Sometimes on virtual machines. I don't want to tire any of you with empty information. For this reason, in order to understand this situation, I conducted my tests by switching between the two versions many times.

I want to remind you. In the issues you mentioned, if there were problems, I could understand this situation in 2019.2. But no problem.

My English is not that good.

Let me rewrite what I want to tell: Sample folder is desktop. I press a letter to focus on a new item on the desktop. The difference between the moment I press the letter and the say of the item is significantly more than 2019.2.

So the delay is not the speaking of the letter. It is the difference between the letter and the say of the item. It is 20-30 milliseconds on average. And more = 60-70 ms.

I will record a new voice.

But you apply them too.

Switch between 2019.2 and 2019.3.

tspivey commented 4 years ago

@feerrenrut Please try with OneCore. The difference is much more obvious there. Specifically, from the desktop, press a letter and listen for the delay between when the letter finishes speaking and the icon name. With OneCore at rate 40 with rate boost on, it's easy to notice.

DrSooom commented 4 years ago

@SeanTolstoyevski and @tspivey: Please disable the options

in the Object Presentation NVDA Settings and report the result. Thanks.

SeanTolstoyevski commented 4 years ago

I have ini files link above. these options are disabled.

I'm not stupid. I close this issue. you don't read what I wrote.

LeonarddeR commented 4 years ago

With respect, please stay kind to each other. @SeanTolstoyevski as your issue is reproducible by others, I'm reopening it. I will also try to minimize unneeded comments a bit so it will be easier to follow the discussion.

feerrenrut commented 4 years ago

@feerrenrut Please try with OneCore. The difference is much more obvious there. Specifically, from the desktop, press a letter and listen for the delay between when the letter finishes speaking and the icon name. With OneCore at rate 40 with rate boost on, it's easy to notice.

@tspivey I tried this with the settings you suggest:

It's known that OneCore has quite a large delay between sentences / utterances compared to espeak. However, after testing with the same settings between 2019.3 and 2019.2, I do note that OneCore pauses have gotten larger. While I don't notice the same issue with espeak, I will have to make recordings and measure the time to be sure. I suspect this is just as likely to be caused by Python 3 as it is by speech refactor; there are likely changes to how Python schedules and interfaces with external DLLs.

@tspivey does your issue #10721 relate to this issue (time between key announcement and icon name announcement)? If so #10721 should be updated with these instructions.

However, this was not my interpretation of the issue being described here, it would be great to confirm if this is what @SeanTolstoyevski is referring to. @SeanTolstoyevski what is your native language? Perhaps someone can help to translate?

SeanTolstoyevski commented 4 years ago

The thing I'm talking about is the same as @tspivey.

so this:

This issue is very clearly reproducible, so shouldn't be closed. It's most likely caused by speech refactor. How do I know this? Because 2019.3 is noticeably slower than 2019.2. The description wasn't 100% clear, but listen to the recording and reproduce the example.

  1. Start NVDA 2019.2.1 with eSpeak, turn on character echo, find something on your desktop where you can quickly jump between the items with the same first letter, and press it.
  2. Do the same with 2019.3. It's noticeably slower between the time it reads the letter you typed and the item moved to.

My guess is that NVDA now has to wait until the speech is 100% finished and the index received before sending the next string on to be spoken. As a result, some very common operations are noticeably slower, including:

  1. Typing with character echo and moving.
  2. Switching applications and hearing the various pieces spoken, for example Untitled - Notepad, text editor edit multi line, blank.

feerrenrut commented 4 years ago

Thanks @SeanTolstoyevski

SeanTolstoyevski commented 4 years ago

no,

Although similar, this situation is different from that issue.

Only the delay in navigation is mentioned here. In that issue, the user does not navigate by letters, but with various key combinations, and OneCore is used.

I'm talking about espeak here. but the situation was also observed in onecore. I am not using onecore. I do not know.

and I want to ask a question. are you going to do something about this issue?

Adriani90 commented 4 years ago

@SeanTolstoyevski we are trying our best. Issues like this are not trivial, and investigating them usually means that we have many questions. We also need regular updates from the people who created the issues, otherwise we are groping in the dark. Let's see if it gets better after #11024. #11023 has already been merged, and these two pull requests together might already have a positive impact on this.

Adriani90 commented 4 years ago

One proposal from my side, could you please run the COM registration fixing tool in NVDA menu under tools?

Using eSpeak with NVDA Version: alpha-20005,14c2e2ec on Windows 10 1909 on a Dell laptop, I cannot reproduce this issue. The time between the key press and the reporting of the element in my case is less than 0.1 second as follows:

IO - inputCore.InputManager.executeGesture (20:15:56.998) - winInputHook (9612):
Input: kb(laptop):f
IO - inputCore.logTimeSinceInput (20:15:57.067) - MainThread (13100):
0.069 sec since input
IO - speech.speak (20:15:57.069) - MainThread (13100):
Speaking ['Firefox', '2 von 5']
IO - inputCore.InputManager.executeGesture (20:15:57.217) - winInputHook (9612):
Input: kb(laptop):f
IO - inputCore.logTimeSinceInput (20:15:57.287) - MainThread (13100):
0.070 sec since input
IO - speech.speak (20:15:57.287) - MainThread (13100):
Speaking ['Firefox', '5 von 5']
IO - inputCore.InputManager.executeGesture (20:15:57.422) - winInputHook (9612):
Input: kb(laptop):f
IO - inputCore.logTimeSinceInput (20:15:57.487) - MainThread (13100):
0.065 sec since input
IO - speech.speak (20:15:57.487) - MainThread (13100):
Speaking ['Firefox', '2 von 5']

SeanTolstoyevski commented 4 years ago

hey @Adriani90 OK. Isn't the difference much?

This is the log of 2019.2. There is more than 20 ms difference. I'm talking about this.

IO - inputCore.InputManager.executeGesture (11:43:48.082):
Input: kb(desktop):g
IO - inputCore.logTimeSinceInput (11:43:48.108):
0.025 sec since input
IO - speech._speakSpellingGen (11:43:48.108):
Speaking character u'g'
DEBUG - queueHandler.registerGeneratorObject (11:43:48.108):
Adding generator 2
IO - inputCore.logTimeSinceInput (11:43:48.108):
0.025 sec since input
IO - speech.speak (11:43:48.108):
Speaking [LangChangeCommand ('en_GB'), u'Google Chrome  10 of 31']
DEBUG - queueHandler.pumpAll (11:43:48.124):
generator 2 finished
IO - inputCore.InputManager.executeGesture (11:43:49.479):
Input: kb(desktop):f
IO - inputCore.logTimeSinceInput (11:43:49.509):
0.031 sec since input
IO - speech._speakSpellingGen (11:43:49.509):
Speaking character u'f'
DEBUG - queueHandler.registerGeneratorObject (11:43:49.509):
Adding generator 3
IO - inputCore.logTimeSinceInput (11:43:49.523):
0.045 sec since input
IO - speech.speak (11:43:49.523):
Speaking [LangChangeCommand ('en_GB'), u'Firefox  5 of 31']
DEBUG - queueHandler.pumpAll (11:43:49.530):
generator 3 finished
IO - inputCore.InputManager.executeGesture (11:43:51.859):
Input: kb(desktop):t
IO - inputCore.logTimeSinceInput (11:43:51.875):
0.016 sec since input
IO - speech._speakSpellingGen (11:43:51.875):
Speaking character u't'
DEBUG - queueHandler.registerGeneratorObject (11:43:51.875):
Adding generator 4
IO - inputCore.logTimeSinceInput (11:43:51.891):
0.031 sec since input
IO - speech.speak (11:43:51.891):
Speaking [LangChangeCommand ('en_GB'), u'TWBlue  12 of 31']
DEBUG - queueHandler.pumpAll (11:43:51.891):
generator 4 finished
IO - inputCore.InputManager.executeGesture (11:43:52.861):
Input: kb(desktop):t
IO - inputCore.logTimeSinceInput (11:43:52.878):
0.020 sec since input
IO - speech._speakSpellingGen (11:43:52.878):
Speaking character u't'
DEBUG - queueHandler.registerGeneratorObject (11:43:52.878):
Adding generator 5
DEBUG - queueHandler.pumpAll (11:43:52.878):
generator 5 finished

The average in the log you sent is about 70 ms. In 2019.2 it is about 30 ms.
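Averages like these can be computed from the debug logs rather than estimated by eye. The following is a minimal sketch, assuming the "sec since input" line format shown in the logs above; the function name mean_time_since_input is hypothetical and not part of NVDA.

```python
import re
from statistics import mean

# Matches entries such as "0.069 sec since input" written by logTimeSinceInput.
_TIME_SINCE_INPUT = re.compile(r"(\d+\.\d+) sec since input")


def mean_time_since_input(log_text):
    """Return the mean 'sec since input' value in a log, or None if absent."""
    values = [float(v) for v in _TIME_SINCE_INPUT.findall(log_text)]
    if not values:
        return None
    return mean(values)


# The three entries from the alpha log excerpt quoted earlier in this thread.
sample = """\
0.069 sec since input
0.070 sec since input
0.065 sec since input
"""
average = mean_time_since_input(sample)
print(f"average: {average * 1000:.0f} ms over the sample")
```

Running the same function over the full log of each NVDA version, with identical key presses, gives numbers that can be compared directly.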

SeanTolstoyevski commented 4 years ago

Yes, I run COM registration fixing tool. :) .

jcsteh commented 4 years ago

My guess is that NVDA now has to wait until the speech is 100% finished and the index received before sending the next string on to be spoken.

For reference, I originally thought this might be the cause of problems too. However, I subsequently realised that both OneCore and eSpeak don't allow subsequent utterances to be sent to the synth until after the current utterance has finished speaking. That is, we have to queue the utterance within the driver. So, whether we send multiple utterances to the driver or not shouldn't matter; the driver can only send one at a time to the synth regardless.

SeanTolstoyevski commented 4 years ago

Hi all,

I have very little idea about NVDA's connection with Espeak. But I think the following.

  1. When I think the delay is due to Espeak, this delay should be everywhere. For example, when pressing letters in notepad. But it's fast.
  2. If I think there is a problem with Python, threading, or other problems, other systems that call these processes a lot should work with delay. But they work very fast. For example, addons. The only thing that comes to my mind is that NVDA is calling DLLs with ctypes. It can delay calls. I think there is a problem with one of the modules that are only about Espeak. An overworking loop or long if else line. This is how I can explain that other synth drivers are not affected by this situation.

jcsteh commented 4 years ago

So, whether we send multiple utterances to the driver or not shouldn't matter; the driver can only send one at a time to the synth regardless.

On further reflection, there's an additional source of delay: when the synth notifies NVDA that it has reached an index (which happens on a background thread), that notification is dispatched to the main thread using queueHandler.queueFunction. That uses core.requestPump to request a core pump. If a core pump hasn't already been requested, that will deliberately be delayed for at least 10 ms. In practice, it's more like 20 ms. We should probably dispatch index notifications using wx.CallAfter or give queueHandler.queueFunction/core.requestPump an argument to force an immediate pump without the delay. Here's a hacky patch for testing with CallAfter, but I think we probably want to do the requestPump tweak instead, since otherwise, we don't behave properly with respect to watchdog, etc.

diff --git a/source/speech/manager.py b/source/speech/manager.py
index 59dd75b8f..9c5a07a43 100644
--- a/source/speech/manager.py
+++ b/source/speech/manager.py
@@ -328,7 +328,8 @@ class SpeechManager(object):
                if synth != getSynth():
                        return
                # This needs to be handled in the main thread.
-               queueHandler.queueFunction(queueHandler.eventQueue, self._handleIndex, index)
+               import wx
+               wx.CallAfter(self._handleIndex, index)

        def _removeCompletedFromQueue(self, index):
                """Removes completed speech sequences from the queue.

Even with this change, there's some additional overhead because there are still a few extra cross-thread round trips (index dispatched from background to main thread, then utterance gets queued from main thread to background thread).

The only way to avoid that cross-thread overhead would be to queue one utterance ahead at all times. However, that might be tricky to manage in terms of SpeechManager's state, and SpeechManager is complex enough as it is. It's also possibly a micro-optimisation that won't help in an observable way, so I'd think the requestPump stuff would be a better place to start. 20 ms is definitely something a user can feel.
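To illustrate the requestPump idea, here is a small model of a pump scheduler that normally defers the next pump by a fixed delay but accepts an argument to skip it. This is a simplified sketch and not NVDA's actual core.requestPump: the class PumpScheduler, the method request_pump and the immediate parameter are all hypothetical names, and the 10 ms figure is taken from the description above.

```python
class PumpScheduler:
    """Toy model of a coalescing pump with an optional immediate request."""

    def __init__(self, delay_ms=10):
        self.delay_ms = delay_ms
        self.pending_at = None  # time (ms) at which the pending pump will run

    def request_pump(self, now_ms, immediate=False):
        """Schedule a pump and return the time at which it will run.

        Normal requests are deferred by delay_ms and coalesce with a pump
        that is already pending. An immediate request pulls the pump forward
        to 'now', which is what index notifications would want.
        """
        due = now_ms if immediate else now_ms + self.delay_ms
        if self.pending_at is None or due < self.pending_at:
            self.pending_at = due
        return self.pending_at

    def run_pump(self):
        """Mark the pending pump as done so a new one can be scheduled."""
        self.pending_at = None


scheduler = PumpScheduler(delay_ms=10)
normal = scheduler.request_pump(now_ms=100)                  # deferred
coalesced = scheduler.request_pump(now_ms=105)               # joins pending pump
urgent = scheduler.request_pump(now_ms=106, immediate=True)  # no added delay
print(normal, coalesced, urgent)
```

In this model an index notification that arrives while no pump is pending would no longer pay the scheduling delay, while ordinary requests keep their coalescing behaviour.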

Adriani90 commented 4 years ago

@jcsteh I agree with your last comment, especially for users using slower machines the delay can be even quite painful.