improve the responsiveness of onecore voices and sapi voices

king-dahmanus commented 2 years ago

Is your feature request related to a problem? Please describe.

I'm always frustrated when SAPI voices and OneCore voices are slow and unresponsive.

Describe the solution you'd like

The voices should be responsive so that they can be mixed with voices for other languages without an undesirable lag, e.g. by using some hacks to unify OneCore into SAPI. They could then be mixed, for example a Latin voice with a non-Latin voice, for optimal reading of both languages. Currently they are unnecessarily slow and unresponsive, which I kindly suggest that you fix.

Describe alternatives you've considered

Based on advice from a developer who has some experience with DSP: intercept the in-memory buffer that holds the audio, trim the silence at the beginning with a script that analyses the amount of silence and trims it accordingly, then feed it back to the audio device.

Additional context

Nothing specific. Contact me if I can clarify anything further. Please bear in mind that I'm not a programmer, I'm just a simple citizen. Thanks for your great help, NV Access! I'm sorry to say that I'm unable to support you monetarily. I hope this project keeps helping blind people around the world as it always has.
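
The silence-trimming alternative described above can be sketched roughly as follows. This is only an illustration, not NVDA's or any synthesizer's actual code: it assumes the intercepted buffer is 16-bit signed little-endian mono PCM, and the function name `trim_leading_silence` and the default `threshold` of 200 are hypothetical choices made for the example.

```python
import struct


def trim_leading_silence(pcm: bytes, threshold: int = 200) -> bytes:
    """Return `pcm` without its leading near-silent samples.

    Assumption: `pcm` holds 16-bit signed little-endian mono samples.
    A sample counts as silence when its absolute value is below `threshold`.
    """
    # Ignore a dangling odd byte so the buffer splits cleanly into samples.
    usable = len(pcm) - (len(pcm) % 2)
    for index, (sample,) in enumerate(struct.iter_unpack("<h", pcm[:usable])):
        if abs(sample) >= threshold:
            # First audible sample found: keep everything from here on.
            return pcm[index * 2:usable]
    # The whole buffer was below the threshold.
    return b""
```

Whether this helps depends on whether the delay really is silence in the buffer rather than synthesis time, and a threshold that is too high could clip quiet speech onsets.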

cary-rowen commented 2 years ago

Yes, Windows SAPI5 is noticeably more responsive in some other screen readers, e.g. ZDSR.

king-dahmanus commented 2 years ago

Yeah, if the silence could be trimmed, then sapi, or even one core would be as responsive as eloquence or espeak

mzanm commented 2 years ago

I agree, SAPI 5 and OneCore are somehow crazy fast on ZDSR.

king-dahmanus commented 2 years ago

I don't know what the bleeps they did, but it's possible that they too are trimming the silence from the beginning

LeonarddeR commented 2 years ago

While I'm an ESpeak user and not using Onecore very frequently, I find OneCore pretty responsive with NVDA. It would be helpful if findings about slow responsiveness are supported by measurable evidence.

king-dahmanus commented 2 years ago

I'm focusing here on sapi5. I mentioned one core because I used a program called sapi unifier to port the one core voices into sapi5

dpy013 commented 2 years ago

This is an audio recording from anyaudio; listen to it to get an idea of how fast ZDSR is with the SAPI5 speech synthesizer.

king-dahmanus commented 2 years ago

the link is broken

dpy013 commented 2 years ago

http://anyaudio.net/listen?audio=TWu4HZNSSH0NTk

Thanks for the reminder; the link above has been re-edited.

king-dahmanus commented 2 years ago

Yeah, it doesn't take me there for some reason. No matter, let's concentrate on NVDA, because this is what we're working with, right?

dpy013 commented 2 years ago

yes

king-dahmanus commented 2 years ago

hey, developers, please look at this!

Adriani90 commented 1 year ago

I did some tests and found the following after some minutes of use:

eSpeak: <1 ms to 5 ms between key press and reporting, very rarely 8 ms or more
OneCore: 5 to 10 ms between key press and reporting, usually 8 ms or more, rarely below 8 ms
SAPI5: at least 8 ms between key press and reporting, usually 10 to 15 ms, rarely less than 10 ms

Taking eSpeak as the reference, the expected behavior is for all synths to perform at the same level.

I tested with NVDA alpha-28179,345154a6 (2023.2.0.28179), WASAPI enabled, by using arrow keys in browse mode in Google Chrome 112, which is very responsive. My 64-bit Asus ROG Strix machine has the following configuration:

Processor: 12th Gen Intel(R) Core(TM) i9-12900H, 2500 MHz, 14 core(s), 20 logical threads
Installed physical RAM: 32.0 GB
Intel(R) Iris graphics card: total capacity 16 GB, VRAM = 128 MB
NVIDIA GeForce RTX 3070 Ti graphics card: total capacity 24 GB, VRAM = 8 GB

As you can see, even on this machine there is a noticeable performance difference, so speaking about low end machines, the performance degradation between synths might be much more obvious.

cc: @jcsteh, @michaelDCurran
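
To make comparisons like the one above easier to reproduce, per-synth timings could be summarised in a consistent way. The following is a minimal sketch: the helper name `summarize_latencies` and the returned keys are hypothetical, and it assumes you already have a list of key-press-to-speech times in milliseconds from whatever measurement method you use.

```python
from statistics import median


def summarize_latencies(samples_ms):
    """Summarise latency samples (in ms) as count, min, median and max.

    Assumption: `samples_ms` was collected externally, one value per key press.
    """
    if not samples_ms:
        raise ValueError("at least one latency sample is required")
    ordered = sorted(samples_ms)
    return {
        "count": len(ordered),
        "min_ms": ordered[0],
        "median_ms": median(ordered),
        "max_ms": ordered[-1],
    }
```

Reporting the median alongside the extremes makes it clearer whether a synth is consistently slower or only occasionally so.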

jcsteh commented 1 year ago

While it's possible there is some silence at the start of the audio buffer returned by these voices, it's also possible (I'd guess more likely) that these voices just take longer to synthesise speech. In that case, there's really nothing that can be done; the performance optimisation would need to happen in the voice itself.

For OneCore at least, if you already have a way to measure the time between key press and actual audio output, I'd suggest comparing with Narrator. That will give you an indication of whether this is something specific to NVDA or whether the voice itself is slow to respond.
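
One way to check the first possibility raised above, silence at the start of the audio, is to measure the leading silence in a recording of the voice's output. This is a rough sketch under stated assumptions: the recording is a 16-bit PCM WAV file, and the function name `leading_silence_ms` and the default `threshold` are hypothetical. It says nothing about synthesis time, only about how much near-silent audio precedes the first audible sample.

```python
import struct
import wave


def leading_silence_ms(wav_file, threshold: int = 200) -> float:
    """Return the duration in ms of near-silence at the start of a WAV file.

    `wav_file` may be a path or a binary file-like object.
    Assumption: the file contains 16-bit PCM; other widths are rejected.
    """
    with wave.open(wav_file, "rb") as wav:
        if wav.getsampwidth() != 2:
            raise ValueError("this sketch only handles 16-bit PCM")
        channels = wav.getnchannels()
        rate = wav.getframerate()
        frames = wav.readframes(wav.getnframes())

    # Samples are interleaved per channel; a frame is one sample per channel.
    for index, (sample,) in enumerate(struct.iter_unpack("<h", frames)):
        if abs(sample) >= threshold:
            return (index // channels) * 1000.0 / rate
    # No audible sample found: the whole recording counts as silence.
    return (len(frames) // (2 * channels)) * 1000.0 / rate
```

If this reports only a few milliseconds for a voice that still feels slow, the delay is more likely in synthesis or in the audio output path than in the buffer itself.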

cary-rowen commented 1 year ago

Narrator's performance is worse than NVDA's. I recommend comparing NVDA with ZDSR instead: ZDSR's response speed is significantly better than NVDA's, even when both use SAPI5.

jcsteh commented 1 year ago

Is that true for OneCore with ZDSR even with the latest responsiveness and WASAPI changes in alpha?

SAPI5 is a different case, as NVDA uses SAPI5's own audio output rather than NVDA's audio output. It's possible that switching to nvwave + WASAPI for SAPI5 might improve responsiveness, but I'm not sure.

seanbudd commented 12 months ago

Are there any responsiveness issues remaining now that NVDA uses WASAPI?

jcsteh commented 11 months ago

Note that NVDA still doesn't use nvwave for SAPI5, so there won't be a change for SAPI5 now in terms of audio. However, the other responsiveness changes in the last few months might have some impact.

cary-rowen commented 11 months ago

Frankly, there are no noticeable changes. I do think there's a lot of room for improvement in NVDA's responsiveness.

jcsteh commented 11 months ago

Given that there has been at least a measurable 10 to 30 ms improvement in responsiveness in NVDA in the last few months, not accounting for WASAPI, the fact that you're seeing "no noticeable changes" would suggest you're seeing a delay which is significantly larger than 30 ms with OneCore. That certainly doesn't match my experience, nor does it match https://github.com/nvaccess/nvda/issues/13284#issuecomment-1533373914. That further suggests that there is a significant difference on your system as compared to mine and others.

As it stands, this issue isn't actionable. To get any further here, we're going to need precise information about which OneCore voice you're using, the rate it's configured at, probably audio recordings demonstrating the performance issue you're seeing, etc.

beqabeqa473 commented 2 months ago

Hello. I can confirm that SAPI5 in NVDA is not as performant as in other places, and yes, this is because SAPI5 outputs the sound itself. I am sure this will improve if SAPI5 audio goes through NVDA itself.
