festvox / festival

Festival Speech Synthesis System

Clearly wrong pronunciation #72

Open ggilder opened 4 months ago

ggilder commented 4 months ago

(Please let me know if there's a better forum to report pronunciation issues)

I noticed that on the most recent version of Festival, there's a very odd phonemization for the word "hippos", using language en-us:

['hh', 'ih'], ['ow', 'z']

The p's have gone missing... "hippo" singular works fine, oddly.

ggilder commented 4 months ago

Here's another odd one that seems similar: "annualize" gets phonemized as:

["ae"], ["uw"], ["ax"], ["l", "ay", "z"]

Again the double consonant seems to get ignored.

ddavout commented 4 months ago

"Again the double consonant seems to get ignored": true enough. We can see a long list of words beginning with "ann" in festival/lib/dicts/cmu/cmudict_extensions.scm (there is even an entry for "annualized"). The header comment of that file reads:

;;; Extra items that have sufficient frequency and are pronounced
;;; wrongly that are to be added to the compiled version of
;;; the CMULEX lexicon

Just put your corrections at the top (no need to sort):

("annualize" nil (((ae) 1) ((n y uw) 0) ((w ax) 0) ((l ay z) 1)))

Then run make in festival/lib/dicts/cmu (using the Makefile there) to rebuild the lexicon with your additions.

As for the actual maintainers, maybe you could contact http://www.linkedin.com/in/kevinlenzo (the only name I found on http://www.speech.cs.cmu.edu/cgi-bin/cmudict).

ggilder commented 3 months ago

@ddavout interesting, I don't see the file you mention in the latest source on github. Perhaps you have a different version?

I contacted the Centre for Speech Technology Research at University of Edinburgh and the director told me that they no longer actively maintain Festival, so I'm not sure what the status of this repo is. Perhaps @lenzo-ka can speak to this as I see they committed the most recent merges to this repo?

awbcmu commented 3 months ago

It's me (@.***) who is the remaining maintainer of the whole festvox suite, including Festival and the Edinburgh Speech Tools, but as I'm retired I'm not as active.

The problem with the pronunciations above is the letter-to-sound rules. The model used is context-independent on the output; it really needs a language model over the predicted phonemes. The most typical consequence of this is double-letter consonants leading to no corresponding phoneme.

The real solution is a better model (which we have done before: a simple n-gram phone model and a Viterbi decoder), but that's a more complex model than we currently have. All more modern neural-based models effectively do the two parts (local and global) in the same model, but depending on the ever-changing neural models would make things significantly less portable.
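The difference between the two decoding strategies can be illustrated with a toy sketch. This is not Festival's code: the per-letter candidate probabilities, the bigram table, the floor value, and the function names `greedy_decode` and `viterbi_decode` are all made up for illustration. It only shows why picking each letter's best phone independently can drop both p's in "hippos", while a bigram phone model with a Viterbi search keeps one.

```python
import math

EPS = "_"  # epsilon: the letter produces no phoneme

# Made-up per-letter candidate probabilities for "hippos" (illustrative only).
LETTER_CANDIDATES = [
    {"hh": 0.9, EPS: 0.1},   # h
    {"ih": 0.9, "ay": 0.1},  # i
    {EPS: 0.55, "p": 0.45},  # p (epsilon narrowly wins in isolation)
    {EPS: 0.55, "p": 0.45},  # p
    {"ow": 0.9, "aa": 0.1},  # o
    {"z": 0.8, "s": 0.2},    # s
]

# Made-up bigram phone probabilities P(next | prev).
BIGRAM = {
    ("<s>", "hh"): 0.5,
    ("hh", "ih"): 0.5,
    ("ih", "p"): 0.4,
    ("ih", "ow"): 0.01,  # "ih ow" is an unlikely phone sequence
    ("p", "p"): 0.01,    # so is a doubled "p p"
    ("p", "ow"): 0.4,
    ("ow", "z"): 0.4,
    ("ow", "s"): 0.2,
}
FLOOR = 0.001  # assumed probability for phone pairs missing from BIGRAM


def greedy_decode(candidates):
    """Pick the most likely phone for each letter independently."""
    picks = [max(c, key=c.get) for c in candidates]
    return [p for p in picks if p != EPS]


def viterbi_decode(candidates, bigram, floor=FLOOR):
    """Best phone sequence under letter scores plus a bigram phone model."""
    # best[last_phone] = (log score, phones emitted so far)
    best = {"<s>": (0.0, [])}
    for cands in candidates:
        new_best = {}
        for prev, (score, phones) in best.items():
            for phone, p_emit in cands.items():
                s = score + math.log(p_emit)
                if phone == EPS:
                    # No phone emitted: the bigram context stays the same.
                    state, seq = prev, phones
                else:
                    s += math.log(bigram.get((prev, phone), floor))
                    state, seq = phone, phones + [phone]
                if state not in new_best or s > new_best[state][0]:
                    new_best[state] = (s, seq)
        best = new_best
    return max(best.values(), key=lambda t: t[0])[1]


print(greedy_decode(LETTER_CANDIDATES))           # ['hh', 'ih', 'ow', 'z']
print(viterbi_decode(LETTER_CANDIDATES, BIGRAM))  # ['hh', 'ih', 'p', 'ow', 'z']
```

With these invented numbers, each "p" letter on its own slightly prefers epsilon, so the independent decoder loses both. The bigram model penalizes "ih ow" heavily enough that the global search keeps exactly one "p".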

The short answer is to add the failing words to the lexicon itself. You can do this with

(lex.add.entry '("hippos" nil (((hh ih) 1) ((p ow z) 0))))

You'll see a whole bunch of these additions in the file festival/lib/dicts/cmu/cmulex.scm

Note you have to do the lex.add.entry after you've selected the lexicon/voice.
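If there are many failing words, the entries can be generated rather than typed by hand. The following is a minimal sketch, assuming you drive Festival from Python and write out a Scheme script to load. The helper names `format_lex_entry` and `build_script` are hypothetical, and `(voice_kal_diphone)` is only an example voice selection that assumes that voice is installed. The output format simply mirrors the `lex.add.entry` call shown above.

```python
def format_lex_entry(word, syllables, pos="nil"):
    """Return a Festival lex.add.entry expression as a string.

    syllables is a list of (phones, stress) pairs, e.g. (["hh", "ih"], 1).
    Words containing double quotes or backslashes are not escaped here.
    """
    sylls = " ".join(
        "(({}) {})".format(" ".join(phones), stress)
        for phones, stress in syllables
    )
    return "(lex.add.entry '(\"{}\" {} ({})))".format(word, pos, sylls)


def build_script(voice_command, entries):
    """Put the voice selection first, then the lexicon additions."""
    return "\n".join([voice_command] + entries)


entries = [
    format_lex_entry("hippos", [(["hh", "ih"], 1), (["p", "ow", "z"], 0)]),
    format_lex_entry(
        "annualize",
        [(["ae"], 1), (["n", "y", "uw"], 0), (["w", "ax"], 0), (["l", "ay", "z"], 1)],
    ),
]
# Assumption: the kal_diphone voice is installed; substitute your own voice.
print(build_script("(voice_kal_diphone)", entries))
```

The voice command is emitted first because, as noted, the additions only take effect when made after the lexicon/voice has been selected.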

Alan

ddavout commented 3 months ago

wget http://www.festvox.org/packed/festival/2.5/festlex_CMU.tar.gz

You can find this link inside the script default_voices.sh in festival/src/scripts. (Thanks Maud :) )

ddavout commented 3 months ago

@awbcmu I was wondering... Could you enable GitHub Discussions (free) for this project? (https://github.com/features/discussions)