Continuous reading problem on Sapi 5 synthesizers still unsolved

retrochiptuner commented 5 months ago

Since the release of NVDA 2019.3, many synthesizers have not worked correctly with continuous reading: speech stops at the end of a sentence. I have seen this problem eventually corrected for some SAPI 5 voices, such as some versions of Ivona, and for other synthesizers whose drivers are developed as plugins, so I understand that something must have been done. I therefore suggest that the SAPI 5 driver be reworked or rewritten so that it can read continuously without problems, as was the case up to version 2019.2.

Test cases.

Tested SAPI 5 synthesizers with which this problem occurs:

Loquendo 6 (Loquendo 7 works poorly) - #10665
AT&T Natural Voices 1.4 (last updated 2010)
Verbio TTS V8.10.
The new Sapi 5 adapter for Microsoft Azure online voices
Infovox 4 demo. The issue occurs when, for some reason, the bell sound plays and all speech stops instead of continuing as it did up to NVDA version 2019.2.1.

Tested synths that work well:

Scansoft (Isabel's voice only)
Ivona2 V1.6.63, although continuous reading keeps going, it still skips some phrases.
Cepstral 5.
Microsoft (the voices added to Windows 10/11)

Environment:

Operating system: Windows 10 64 Bit and NVDA 2024.1.

Log:

I haven't found anything relevant when reading the log after doing a test where continuous reading with one of these synthesizers stops after a sentence, but I'm going to copy one:

INFO - main (16:31:09.987) - MainThread (1476): Starting NVDA version 2024.1
INFO - core.main (16:31:10.404) - MainThread (1476): Config dir: C:\Users\fermi\AppData\Roaming\nvda
INFO - config.ConfigManager._loadConfig (16:31:10.404) - MainThread (1476): Loading config: C:\Users\fermi\AppData\Roaming\nvda\nvda.ini
INFO - core.main (16:31:13.716) - MainThread (1476): Windows version: Windows 10 22H2 (10.0.19045) workstation AMD64
INFO - core.main (16:31:13.716) - MainThread (1476): Using Python version 3.11.6 (tags/v3.11.6:8b6ee5b, Oct 2 2023, 14:40:55) [MSC v.1935 32 bit (Intel)]
INFO - core.main (16:31:13.716) - MainThread (1476): Using comtypes version 1.2.0
INFO - core.main (16:31:13.716) - MainThread (1476): Using configobj version 5.1.0 with validate version 1.0.1
INFO - synthDriverHandler.setSynth (16:31:16.553) - MainThread (1476): Loaded synthDriver eloquence
WARNING - mathPres.initialize (16:31:16.584) - MainThread (1476): MathPlayer 4 not available
INFO - core.main (16:31:17.266) - MainThread (1476): Using wx version 4.2.2a1 msw (phoenix) wxWidgets 3.2.4 with six version 1.16.0
INFO - brailleInput.initialize (16:31:17.276) - MainThread (1476): Braille input initialized
INFO - braille.initialize (16:31:17.276) - MainThread (1476): Using liblouis version 3.28.0
INFO - braille.initialize (16:31:17.276) - MainThread (1476): Using pySerial version 3.5
INFO - braille.BrailleHandler._setDisplay (16:31:17.286) - MainThread (1476): Loaded braille display driver 'noBraille', current display has 0 cells.
INFO - core.main (16:31:18.164) - MainThread (1476): Java Access Bridge support initialized
INFO - UIAHandler.UIAHandler.MTAThreadFunc (16:31:18.314) - UIAHandler.UIAHandler.MTAThread (15408): UIAutomation: IUIAutomation6
INFO - external:sonata_neural_voices.aio._thread_target (16:31:24.444) - piper4nvda_asyncio (12292): Starting asyncio event loop
INFO - core.main (16:31:26.014) - MainThread (1476): NVDA initialized
INFO - external:globalPlugins.TranslateAdvanced.inicio (16:31:26.109) - Thread-6 (inicio) (11516): Traductor Avanzado iniciado correctamente.
INFO - watchdog.waitForFreezeRecovery (16:31:28.090) - watchdog (22548): Starting freeze recovery after 0.5002905998844653 seconds.
INFO - watchdog.waitForFreezeRecovery (16:31:28.545) - watchdog (22548): Recovered from freeze after 0.955484899925068 seconds.
INFO - external:synthDrivers.sonata_neural_voices.aio._thread_target (16:31:36.095) - piper4nvda_asyncio (7276): Starting asyncio event loop
INFO - synthDriverHandler.setSynth (16:31:38.936) - MainThread (1476): Loaded synthDriver sapi5
INFO - synthDriverHandler.setSynth (16:32:08.617) - MainThread (1476): Loaded synthDriver eloquence
ERROR - unhandled exception (16:32:08.783) - MainThread (1476):
Traceback (most recent call last):
  File "wx\core.pyc", line 3427, in
  File "gui\settingsDialogs.pyc", line 1475, in refreshGui
RuntimeError: wrapped C/C++ object of type BoxSizer has been deleted

seanbudd commented 4 months ago

Hi, it is likely that these are design issues with the synthesizers not correctly following the SAPI 5 spec.

Loquendo 6 (Loquendo 7 works poorly)

See issue https://github.com/nvaccess/nvda/issues/10665; this is a design flaw.

AT&t natural Voices 1.4.

This was last updated in 2010, so it is unlikely to be fixed.

Verbio TTS V8.10.

There seems to be a newer version of Verbio. Can you confirm whether it works?

The new Sapi 5 adapter for Microsoft Azure online voices

It is likely that this is not designed in line with the spec. It would be nice to find out more from the author about what is going wrong here.

Infovox 4 demo.

This has also been discontinued.

gexgd0419 commented 2 months ago

NVDA relies on the bookmark notifications sent by the SAPI5 voice engines to know the current speaking progress. If the SAPI 5 voice doesn't implement this and sends no bookmark notification, NVDA will just wait forever.

Bookmarks are XML tags <bookmark mark="xxx"/> inserted into the text to be spoken. When the speaking progress reaches a bookmark, a notification is sent to the client (NVDA). However, there is another kind of notification, called Word Boundary, that is sent to the client as each word is spoken, so some SAPI5 voices chose to support only Word Boundary events and not bookmark events.
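To illustrate what such marked-up text looks like, here is a minimal sketch that joins text chunks and places a bookmark tag after each one. The function name `build_sapi_xml` and the choice of the chunk index as the bookmark name are assumptions made for this example; this is not NVDA's actual driver code.

```python
from xml.sax.saxutils import escape, quoteattr


def build_sapi_xml(chunks):
    """Join text chunks, placing a <bookmark> tag after each chunk.

    Hypothetical helper: the bookmark name is simply the chunk index.
    """
    parts = []
    for index, text in enumerate(chunks):
        # Escape &, < and > so the chunk cannot break the XML markup.
        parts.append(escape(text))
        # quoteattr adds the surrounding quotes and escapes the value.
        parts.append(f"<bookmark mark={quoteattr(str(index))}/>")
    return "".join(parts)


xml = build_sapi_xml(["First sentence. ", "Tom & Jerry. "])
print(xml)
```

A voice that implements bookmarks would be expected to notify the client once after each of the two chunks here; a voice that ignores the tags sends nothing, which is the situation described above.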

One of the solutions, of course, is to just wait for an update of those SAPI5 voices that implements the bookmark feature.

But theoretically, NVDA can also choose to get the current speaking progress from Word Boundary events. Most SAPI5 voices should support Word Boundary events, so using Word Boundary events instead of bookmarks might solve this problem for some voices. (You can check whether a SAPI5 voice supports Word Boundary events by using the Speech Properties dialog in the Control Panel and seeing whether the voice highlights the words during preview.)

Even if the voice supports only the bare minimum of features, without even the Word Boundary events, NVDA can still get the End Stream notification, so it will know that the whole text has been completed and can stop waiting for the non-existent bookmark event.
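The fallback order described above (bookmark events first, then Word Boundary events, then End Stream) can be sketched in plain Python. Everything here is hypothetical: the class `SpeechProgressTracker`, its method names, and the idea of recording each bookmark's character offset are assumptions for illustration and do not mirror NVDA's sapi5 driver or the SAPI 5 COM interfaces.

```python
class SpeechProgressTracker:
    """Reports each bookmark once, using whichever event arrives first.

    bookmark_offsets maps a bookmark name to the character offset at which
    the bookmark sits in the plain text (hypothetical bookkeeping).
    """

    def __init__(self, bookmark_offsets, on_reached):
        # Pending bookmarks as (name, offset) pairs, in speaking order.
        self._pending = sorted(bookmark_offsets.items(), key=lambda item: item[1])
        self._on_reached = on_reached

    def _flush_up_to(self, offset):
        # Report every pending bookmark located at or before the offset.
        while self._pending and self._pending[0][1] <= offset:
            name, _ = self._pending.pop(0)
            self._on_reached(name)

    def on_bookmark(self, name):
        # Preferred path: the voice reports the bookmark itself.
        for name_pending, offset in self._pending:
            if name_pending == name:
                self._flush_up_to(offset)
                return

    def on_word_boundary(self, char_offset):
        # Fallback 1: a word starting at char_offset means every bookmark
        # placed at or before that offset has already been passed.
        self._flush_up_to(char_offset)

    def on_end_stream(self):
        # Fallback 2: the whole text is done, so nothing is left to wait for.
        self._flush_up_to(float("inf"))


reached = []
tracker = SpeechProgressTracker({"0": 16, "1": 27}, reached.append)
tracker.on_word_boundary(0)   # first word, no bookmark passed yet
tracker.on_word_boundary(16)  # a word starts right after bookmark "0"
tracker.on_end_stream()       # stream finished, bookmark "1" is released
print(reached)
```

Because a bookmark is removed from the pending list as soon as any event accounts for it, a voice that sends both bookmark and Word Boundary events would not cause duplicate reports in this sketch.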

The new Sapi 5 adapter for Microsoft Azure online voices

This has been fixed in the v0.2 release.

It only happened for online Edge voices, because the Edge voice server supports word boundaries but not bookmarks. So in v0.2, the adapter uses the received word boundary information to create the missing bookmark event notifications for Edge voices.
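As a rough sketch of that idea (not the adapter's actual code), an engine-side layer could strip the bookmark tags before sending the text to a server that does not understand them, remember where each one was, and then interleave synthetic bookmark events with the word boundaries it receives. The names `strip_bookmarks` and `with_synthetic_bookmarks` and the event tuples are assumptions for this example, and XML entity unescaping is deliberately left out.

```python
import re

# Matches tags of the form <bookmark mark="name"/>.
BOOKMARK_RE = re.compile(r'<bookmark\s+mark="([^"]*)"\s*/>')


def strip_bookmarks(marked_up):
    """Return (plain_text, [(offset, name), ...]) for bookmark-tagged text."""
    plain_parts = []
    bookmarks = []
    plain_len = 0
    last_end = 0
    for match in BOOKMARK_RE.finditer(marked_up):
        segment = marked_up[last_end:match.start()]
        plain_parts.append(segment)
        plain_len += len(segment)
        # The bookmark sits at the current length of the plain text.
        bookmarks.append((plain_len, match.group(1)))
        last_end = match.end()
    plain_parts.append(marked_up[last_end:])
    return "".join(plain_parts), bookmarks


def with_synthetic_bookmarks(word_offsets, bookmarks):
    """Yield word events with bookmark events inserted in speaking order."""
    pending = list(bookmarks)
    for offset in word_offsets:
        # Emit bookmarks located at or before the word about to be spoken.
        while pending and pending[0][0] <= offset:
            yield ("bookmark", pending.pop(0)[1])
        yield ("word", offset)
    # Bookmarks after the last word are emitted when the stream ends.
    for _, name in pending:
        yield ("bookmark", name)


text, marks = strip_bookmarks(
    'Hello world. <bookmark mark="0"/>Bye.<bookmark mark="1"/>'
)
events = list(with_synthetic_bookmarks([0, 6, 13], marks))
print(text)
print(events)
```

The word offsets `[0, 6, 13]` stand in for whatever word boundary data the server returns; the client then sees bookmark notifications even though the server never produced any.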

nvaccess / nvda

Continuous reading problem on Sapi 5 synthesizers still unsolved #16691
