nvaccess / nvda

NVDA, the free and open source Screen Reader for Microsoft Windows
https://www.nvaccess.org/

Continuous reading problem on Sapi 5 synthesizers still unsolved #16691

Open retrochiptuner opened 5 months ago

retrochiptuner commented 5 months ago

Since the release of NVDA 2019.3, many synthesizers have not worked correctly with continuous reading: speech stops at the end of a sentence. I have observed that this problem has eventually been corrected in some cases, such as for some SAPI 5 voices (for example, some versions of Ivona) and for other synthesizers provided through add-ons, so I understand that something must have been done. I therefore suggest that the SAPI 5 driver be reimplemented or rewritten so that continuous reading works without problems, as it did up to version 2019.2.

Test cases.

Tested SAPI 5 synthesizers with which this problem occurs:

seanbudd commented 4 months ago

Hi, it is likely that these are design issues with the synthesizers not correctly following the SAPI 5 spec.

Loquendo 6 (Loquendo 7 works poorly)

See issue https://github.com/nvaccess/nvda/issues/10665; this is a design flaw.

AT&T Natural Voices 1.4.

This was last updated in 2010; it is unlikely to be fixed.

Verbio TTS V8.10.

There seems to be a newer version of Verbio. Can you confirm whether it works?

The new Sapi 5 adapter for Microsoft Azure online voices

It is likely that this is not designed in line with the spec. It would be good to find out more from the author about what is going wrong here.

Infovox 4 demo.

This has also been discontinued.

gexgd0419 commented 2 months ago

NVDA relies on the bookmark notifications sent by the SAPI5 voice engines to know the current speaking progress. If the SAPI 5 voice doesn't implement this and sends no bookmark notification, NVDA will just wait forever.

Bookmarks are XML tags <bookmark mark="xxx"/> inserted into the text to be spoken. When the speaking progress reaches a bookmark, a notification is sent to the client (NVDA). However, there is another kind of notification, called Word Boundary, that is sent to the client as each word is spoken, so some SAPI5 voices chose to support only Word Boundary events and not bookmark events.
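To make the bookmark mechanism concrete, here is a minimal sketch of how a client could interleave bookmark tags with the text it sends to a voice. The function name `build_sapi_xml` and the choice to number bookmarks from 1 are assumptions for illustration; this is not NVDA's actual driver code.

```python
from xml.sax.saxutils import escape


def build_sapi_xml(chunks):
    """Join text chunks, placing a numbered bookmark tag after each one.

    Hypothetical helper: the numbering scheme is an assumption.
    """
    parts = []
    for index, chunk in enumerate(chunks, start=1):
        # Escape &, < and > so the text cannot be mistaken for markup.
        parts.append(escape(chunk))
        # The voice is expected to notify the client when it reaches this tag.
        parts.append(f'<bookmark mark="{index}"/>')
    return "".join(parts)
```

A client that receives the notification for bookmark N knows that chunk N has been spoken and can queue the next piece of text. If the voice never sends that notification, the client has nothing to act on.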

One solution, of course, is to wait for those SAPI5 voices to be updated so that they implement the bookmark feature.

But theoretically, NVDA could also get the current speaking progress from Word Boundary events. Most SAPI5 voices should support Word Boundary events, so using them instead of bookmarks might solve this problem for some voices. (You can check whether a SAPI5 voice supports Word Boundary events by opening the Speech Properties dialog in the Control Panel and seeing whether the voice highlights the words during preview.)
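The idea of deriving progress from Word Boundary events can be sketched as follows. This is an illustrative simulation only: the class `WordBoundaryProgress`, its method names, and the assumption that each event carries the character offset of the word in the plain text are all hypothetical, and no real SAPI call is made.

```python
class WordBoundaryProgress:
    """Infer which bookmarks have been passed from word start offsets.

    Hypothetical sketch: bookmark_offsets is a list of
    (char_offset, bookmark_id) pairs, where char_offset is the position
    in the plain text at which the bookmark was inserted.
    """

    def __init__(self, bookmark_offsets):
        self._bookmarks = sorted(bookmark_offsets)
        self._next = 0  # index of the first bookmark not yet reported

    def on_word_boundary(self, char_position):
        """Return the ids of bookmarks passed once this word starts."""
        reached = []
        while (
            self._next < len(self._bookmarks)
            and self._bookmarks[self._next][0] <= char_position
        ):
            reached.append(self._bookmarks[self._next][1])
            self._next += 1
        return reached
```

A bookmark is treated as reached as soon as a word begins at or after its offset. A bookmark placed after the last word is never followed by another word, so this approach alone cannot report it.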

Even if the voice supports only the bare minimum of features, without even Word Boundary events, NVDA can still receive the End Stream notification, so it will know that the whole text has been spoken and can stop waiting for the non-existent bookmark event.
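A rough sketch of that End Stream fallback: keep a list of bookmarks that are still outstanding, and when the stream ends, treat everything left as reached. The class `PendingBookmarks` and its method names are hypothetical and only simulate the event handling; they are not part of NVDA or SAPI.

```python
class PendingBookmarks:
    """Track outstanding bookmarks and flush them when the stream ends."""

    def __init__(self, bookmark_ids):
        # Bookmarks in the order they appear in the text.
        self._pending = list(bookmark_ids)

    def on_bookmark(self, bookmark_id):
        """Handle a real bookmark event; return every id now reached."""
        if bookmark_id not in self._pending:
            return []
        cut = self._pending.index(bookmark_id) + 1
        # Earlier bookmarks that were never reported are implied as reached.
        reached = self._pending[:cut]
        del self._pending[:cut]
        return reached

    def on_end_stream(self):
        """The whole text was spoken, so nothing is left to wait for."""
        reached, self._pending = self._pending, []
        return reached
```

With a voice that sends no bookmark events at all, `on_end_stream` returns every bookmark at once, which is enough for the client to move on to the next piece of text instead of waiting indefinitely.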

The new Sapi 5 adapter for Microsoft Azure online voices

This has been fixed in the v0.2 release.

It only happened for online Edge voices, because the Edge voice server supports word boundaries but not bookmarks. So in v0.2, the adapter uses the received word boundary information to create the missing bookmark event notifications for Edge voices.