Closed manish10 closed 11 months ago
I am struggling to reproduce this. I hear "Best action grouping 1 of 6 bullet split button create a bulletted list ".
though I am running an Office insiders build (beta channel) so it is possible it may have been somehow fixed by MS word?
Additional information: This only happens with the eloquence synthesizer addon I have. The problem goes away if I use espeak or one core voices. I wrote the eloquence addon for the India licensed copy of eloquence. It is probably broken because of this release changes to wasapi etc. Can you give me some pointer on where I can start to debug it?
also apologies for incorrectly answering the addons question in the original bug template...
Closing as this is an add-on bug
Note this snippet might be helpful, are you sure this is related to WASAPI? You can check this by disabling WASAPI in advanced settings
https://github.com/nvaccess/nvda/pull/14900#issuecomment-1659975589
Can you set your NVDA log level to debug, and provide a log while you demonstrate this behavior?
Also, if you turn WASAPI off in Advanced settings of NVDA, does the problem go away?
Although the issue is closed, I will provide additional info here in the hope of getting inputs on how to resolve this:
With some more testing, I found that the missing content does get spoken if I keep waiting for about 12 seconds in all cases. The specific issue is that multiple speak commands get queued up and a callback flushes the queue. for some reason, the flush does not happen any more until 12 seconds later. Nothing has changed in my synth handler code and the exact same code works with the flush happening correctly with the older version of NVDA but not with 2023.3. The issue has already been marked closed but something has changed in the NVDA code to cause this behavior. I could not find anything related to this in the change log for this release.
Will really appreciate any help.
Just to make sure have you restarted NVDA after disabling WASAPI in advanced settings? If this is not caused by WASAPI would you be able to bisect the commit which introduced the regression?
After I restarted NVDA after disabling wasapi in advanced settings, the issue went away. Thanks for this suggestion. All the times I tried this earlier, I must not have restarted after selecting that setting. How can we fix this? Should I reopen this issue to help track this?
Would you be able to test your add-on with the latest Alpha, there were various improvements to WASAPI since 2023.3 was released, so it may got fixed by accident. Note that you would need to override compatibility of your add-on to make this test. Is the code of your add-on available somewhere?
I can make the code available to you. It is not a public repo because the eloquence synth is licensed. I will try and install the alpha build: does this mean just the latest nightly build or is there a specific build you would like me to use?
Actually, I am running NVDA from source and have the latest code from the repo. So, this is not fixed in any alpha build.
Since this is clearly caused by WASAPI cc @jcsteh (let's hope this is the last time someone triaging issues has to CC you due to WASAPI related problems, though this is very likely wishful thinking).
The specific issue is that multiple speak commands get queued up and a callback flushes the queue.
Can you elaborate on this? With WASAPI enabled, the callbacks should always be fired, but you should not assume that the callback for the previous call to feed will necessarily be fired during the very next call to feed. This was always an incorrect assumption - it was never a promise made by the API - but you could get away with it with the old WinMM code due to an implementation detail.
There is a good reason for this change. Previously, if you called feed with two very tiny chunks of audio and callbacks, we had to wait for the first chunk to finish before we could push the second. This caused buffer underruns and thus tiny annoying glitches in the audio output; see #10185 and #11061. With the WASAPI code, we keep pushing audio until the buffer is sufficiently full to avoid underruns/dropouts. Callbacks are fired based on the audio clock rather than delaying the pushing of subsequent chunks.
You can call WavePlayer.sync to wait for all audio to complete, which will fire any pending callbacks. However, that will also mean that if the last chunk of audio was not a complete phrase, you'll get a glitch in your audio output, though you would have with the old code anyway.
Thanks @jcsteh for the above. You are correct in that the callback does get fired but on some clock and not immediately after the previous chunk was spoken. You mention this is by design. In this case, what is the correct fix for this that I should implement in the tts driver?
While I implement the correct fix, for the interim, where should I call WavePlayer.sync?
(was travelling over the holidays and couldn't respond sooner)
You are correct in that the callback does get fired but on some clock and not immediately after the previous chunk was spoken.
That's not quite correct. It does get fired immediately after the associated chunk is spoken. The important point is that it might not necessarily get fired during the very next call to feed() because two calls to feed() might be less than one audio buffer. That is, there might be several feed() calls before callbacks get fired. If there is a callback on the second last feed() call and the final feed() call isn't large enough to satisfy the buffer, the callback won't get called in the final feed() call.
To fix this, ensure you call idle() at the end of each utterance. This will ensure that the buffer is flushed, among other things.
If I call idle() after the feed() call, the original problem is fixed. However, it causes the speech to become very choppy as if the tail end of each syllable is being overwritten by the next speech. Could you guide me a little more here about what the correct sequence should be for calling the feed() method? do all synth drivers call idle right after calling feed() every time? or did I misunderstand your ask above to call idle after every utterance? is that different from calling idle right after feed()?
You should only call idle() at the end of an utterance; i.e. the last chunk generated due to a speak() call.
This fixed the issue but I still get a weird echo at the end of the speech. It is not very noticeable for words or long text. However, it is very noticeable for single letter navigation and for key-echo during typing. Is there a way for me to avoid this echo?
Can you reproduce this with any other synthesiser? With WASAPI, an echo would suggest that the same chunk is being sent more than once, but I don't see how that could be happening unless you're calling feed() twice with the same chunk or unless you're calling idle() before the final chunk of the utterance is sent. Without WASAPI, an echo might be due to a buffer underrun, which happened quite a lot with the non-WASAPI code.
Steps to reproduce:
I am giving below a simple set of steps but the problem is not limited to this one case. I am facing it across all lists in multiple office products, including word and powerpoint.
Actual behavior:
Expected behavior:
NVDA logs, crash dumps and other attachments:
There is no error or crash in the NVDA logs.
System configuration
NVDA installed/portable/running from source:
NVDA 2023.3 installed on windows 11.
NVDA version:
2023.3
Windows version:
windows 11 pro build 22h2 22621.2428
Name and version of other software in use when reproducing the issue:
Microsoft word win 32 latest m365 desktop version.
Other information about your system:
Other questions
Does the issue still occur after restarting your computer?
yes.
Have you tried any other versions of NVDA? If so, please report their behaviors.
this works correctly in 2023.2. I did notice this behavior in one of the 2023.3 beta builds also.
If NVDA add-ons are disabled, is your problem still occurring?
I don't have any add-ons.
Does the issue still occur after you run the COM Registration Fixing Tool in NVDA's tools menu?
I have not tried this.