nvaccess / nvda

NVDA, the free and open source Screen Reader for Microsoft Windows
https://www.nvaccess.org/

ESpeak Voice Sounds Harsher in Master and Next Versions #5868

Closed dgoldfield closed 8 years ago

dgoldfield commented 8 years ago

The new ESpeak voices sound harsher, particularly with words with the letter V, such as "level." It sounds like it is saying "lebel," which becomes obvious when moving through headings on the Web and I hear items such as "heading lebel 1." Some words, such as "Internet," almost have a slight pop sound at the beginning, as though effects on my sound card were enabled. This is with English U.S. voices. This happened once before and it was addressed; I could try to locate the ticket if it would help.

derekriemer commented 8 years ago

Is this the same as the command line flag that was added a couple of years ago? I remember that coming across the list. I remember it had to do with the phoneme data being clipped at the end or something.


Brian1Gaff commented 8 years ago

Actually, I cannot hear this on the internal Realtek hardware, but a Behringer sound card on USB has the start and end clicks and a little more bass, which on some voices does make "level" sound a little like "label". My guess is that it's a sound card driver issue of some sort. It's not major, but at times it also tends to hide the end of a piece of speech with a click, so it sounds truncated. Brian


dgoldfield commented 8 years ago

I'm willing to see if there is a driver update for my sound card, but I can tell you that this is not occurring in 2016.1.


dgoldfield commented 8 years ago

#3860 is exactly the same issue as what I'm hearing now, if that is helpful.

jcsteh commented 8 years ago

Any ideas, @MichaelDCurran? Seems we have #3860 again. Reading briefly, that was apparently due to badly compiled phoneme data.

michaelDCurran commented 8 years ago

@dgoldfield What exact eSpeak settings are you using? I.e. rate, rate boost, variant, pitch. Also, what kind of sound card? Can you confirm that the issue is not seen in 2016.1?

dgoldfield commented 8 years ago

  1. I can confirm that the change I've reported is not at all present in 2016.1.
  2. Using the English United States voice. Variant: male3 (although it does not just happen with this variant). No rate boost. Rate: 50. Pitch:
  3. Inflection: 100. Volume: 100.
  4. The only thing listed under sound card in Device Manager is "high definition audio device." The same is listed in System Information. If you believe that locating a more current driver will help, I am willing to pursue this, but again, this issue has not existed for years and only surfaced in the newer master and next branches.

michaelDCurran commented 8 years ago

@dgoldfield Any possibility of getting a recording of both 2016.1 and next? Annoyingly, I cannot reproduce the issue yet on at least 2 machines. I remember the old bug, and that was caused by using an incorrect version of eSpeakEdit. However, eSpeak now has the ability to compile the phoneme data itself. It is very possible that there is a bug in the compilation code... but nothing can be done until it can be reproduced. What kind of an impact does this issue have? I.e. does it make eSpeak unusable for you?

dgoldfield commented 8 years ago

I can probably attach two audio samples, both from 2016.1 and a newer master. I wouldn't say it makes ESpeak unusable but it makes it unpleasant. I'll work on this but I won't be able to get to it until the weekend, unfortunately. Thanks for at least trying to track it down.

michaelDCurran commented 8 years ago

For me, if I do hear anything, it sounds as if next/master is slightly louder, and perhaps slightly compressed, compared to 2016.1. @dgoldfield Does the issue lessen if you decrease eSpeak's volume to, say, 95 or 90? For now I'm thinking that eSpeak is simply producing audio louder than it should, and some sound cards are then compressing the audio.
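The loudness hypothesis above could be checked on recordings rather than by ear. The following is only a rough sketch, not anything NVDA or eSpeak ships: it assumes 16-bit PCM WAV recordings of the same text from each build, and the function name `level_stats` is made up for illustration. It reports the peak level, the RMS level, and how many samples sit at full scale (a hint of clipping).

```python
import array
import math
import sys
import wave


def level_stats(path):
    """Return peak and RMS (as fractions of full scale) and a clipped-sample count.

    Hypothetical helper for comparing two recordings; handles 16-bit PCM only.
    """
    with wave.open(path, "rb") as wav:
        if wav.getsampwidth() != 2:
            raise ValueError("this sketch only handles 16-bit PCM")
        frames = wav.readframes(wav.getnframes())

    samples = array.array("h")
    samples.frombytes(frames)
    if sys.byteorder == "big":
        samples.byteswap()  # WAV sample data is little-endian

    if not samples:
        return {"peak": 0.0, "rms": 0.0, "clipped": 0}

    full_scale = 32768.0
    peak = max(abs(s) for s in samples) / full_scale
    rms = math.sqrt(sum(s * s for s in samples) / len(samples)) / full_scale
    # Samples pinned at the extreme values suggest the signal was clipped.
    clipped = sum(1 for s in samples if s >= 32767 or s <= -32768)
    return {"peak": peak, "rms": rms, "clipped": clipped}
```

Calling `level_stats` on a recording from each build and comparing the results would show whether one is measurably louder. A higher RMS with a similar peak would be consistent with compression, though the numbers only mean something if both recordings were captured the same way.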

michaelDCurran commented 8 years ago

Next/master certainly also has slightly different EQing. Less treble perhaps. Also some kind of low shelf.

rhdunn commented 8 years ago

It would be useful to track down what is causing this difference.

On my GitHub espeak branch, I have been able to compile espeak on Linux for a long time (all the tags should be buildable). They will likely require some work to get them to build on Windows, but that could be useful trying to track down the cause of the issue.

Some things I want to test are:

  1. the version of espeak NVDA 2016.1 is using compared with the 1.48.15 version (the latest release from Jonathan);
  2. espeak 1.48.15 compared with the espeak-ng build;
  3. building the phoneme data on Windows and on Linux;
  4. building the phoneme data from the command line vs the espeakedit application.

This should help isolate where the issue is being introduced.
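For the phoneme-data comparisons in that list, a quick first pass might be to check whether two builds produced byte-identical output. This is a minimal sketch under assumptions: the file names in `PHONEME_FILES` are the compiled phoneme files I would expect in an `espeak-data` directory, the directories passed in are whatever two builds are being compared, and `compare_phoneme_data` is an illustrative name, not part of espeak or NVDA.

```python
import hashlib
from pathlib import Path

# Assumed names of the compiled phoneme files inside an espeak-data directory.
PHONEME_FILES = ("phondata", "phonindex", "phontab", "intonations")


def file_digest(path):
    """Return the SHA-256 hex digest of a file's contents."""
    return hashlib.sha256(Path(path).read_bytes()).hexdigest()


def compare_phoneme_data(dir_a, dir_b, names=PHONEME_FILES):
    """Map each file name to 'same', 'different', or 'missing'."""
    result = {}
    for name in names:
        a = Path(dir_a) / name
        b = Path(dir_b) / name
        if not a.is_file() or not b.is_file():
            result[name] = "missing"  # absent from at least one build
        elif file_digest(a) == file_digest(b):
            result[name] = "same"
        else:
            result[name] = "different"
    return result
```

A "different" result does not by itself show a bug, since two builds can legitimately differ. It only narrows down which comparison (Windows vs Linux, command line vs espeakedit) is worth listening to more closely.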

dgoldfield commented 8 years ago

I have recorded two separate .wav files. This system will not allow me to upload them, saying this type of file is not accepted. Here is a Dropbox link you can use to get the files.

https://www.dropbox.com/sh/jx7d0kfac0pm2rh/AADyorRuZl3zCRf0Eiq8HQnXa?dl=0

The current build audio file is using 2016.1 and the master build audio file is using a master from April 21. In the file, I alt-tab into the Jarte text editor which contains the following sentence.

Welcome to heading level 1. I am testing this synthesizer as I dialog with all of you about the various NVDA issues.

I have NVDA read the file and then I navigate through some of it word by word and then character by character. I then alt-tab back into Audacity and stop the recording. Both builds are using U.S. English, the Male 3 variant, rate at 45, pitch 51, volume 100 and inflection 100. Decreasing the volume in the master build does not solve the problem.

michaelDCurran commented 8 years ago

Please try the latest NVDA Next snapshot (13300,9ab71476 or later). This contains the latest update to espeak-ng that apparently may fix the issue. On my system the difference I was hearing seems to have gone away.

dgoldfield commented 8 years ago

Congratulations to both NV Access and the ESpeak development team for this work. Yes, I believe it is fixed. At first, I wasn't sure as the two versions do have some differences but I think the differences I'm now hearing are changes to some of the phonemes. However, most of that harshness is gone and, in some ways, I think I'm even liking the new version a bit better. Thank you to all of you for your willingness to track this down. ESpeak is actually my preferred synthesizer when using NVDA and it's nice to know I won't need to switch to something else. Many thanks.