Open ggilder opened 4 months ago
Here's another odd one that seems similar: "annualize" gets phonemized as:
["ae"], ["uw"], ["ax"], ["l", "ay", "z"]
Again the double consonant seems to get ignored.
True enough: there is a long list of words beginning with "ann" in festival/lib/dicts/cmu/cmudict_extensions.scm (there is even an entry for "annualized"). The documentation at the top of this file reads:
;;; Extra items that have sufficient frequency and are pronounced
;;; wrongly that are to be added to the compiled version of
;;; the CMULEX lexicon
Just put your corrections at the top (no need to sort):
("annualize" nil (((ae) 1) ((n y uw) 0) ((w ax) 0) ((l ay z) 1)))
Then rebuild the lexicon using festival/lib/dicts/cmu/Makefile to pick up your additions.
As for the actual maintainers, maybe you could contact http://www.linkedin.com/in/kevinlenzo (the only name I found on http://www.speech.cs.cmu.edu/cgi-bin/cmudict).
@ddavout interesting, I don't see the file you mention in the latest source on github. Perhaps you have a different version?
I contacted the Centre for Speech Technology Research at University of Edinburgh and the director told me that they no longer actively maintain Festival, so I'm not sure what the status of this repo is. Perhaps @lenzo-ka can speak to this as I see they committed the most recent merges to this repo?
It's me (@.***) who is the remaining maintainer of the whole festvox suite, including Festival and the Edinburgh Speech Tools, but as I'm retired I'm not as active.
The problem with the pronunciations above is the letter-to-sound rules. The model used is context-independent on the output; it really needs a language model over the predicted phonemes. The most typical consequence of this is double-letter consonants ending up with no corresponding phoneme.
The real solution is a better model (which we have done before: a simple n-gram phone model and a Viterbi decoder), but that's a more complex model than we currently have. More modern neural models effectively do the two parts (local and global) in the same model, but depending on ever-changing neural models would make things significantly less portable.
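To make the local-versus-global point concrete, here is a toy Python sketch. It is not Festival's letter-to-sound code: the per-letter scores in `LOCAL`, the phone-pair weights in `BIGRAM`, and the function names are all invented for illustration. It only shows how picking each letter's best phone independently can drop both p's in "hippos", and how a bigram phone model with a Viterbi search could recover one.

```python
import math

EPS = "_"  # "no phone" label that a letter may map to

# Invented per-letter scores for "hippos"; each letter is scored on its own.
LOCAL = [
    {"hh": 0.9, EPS: 0.1},  # h
    {"ih": 0.9, EPS: 0.1},  # i
    {EPS: 0.6, "p": 0.4},   # first p: "the other p will carry the sound"
    {EPS: 0.6, "p": 0.4},   # second p: same reasoning, so both go silent
    {"ow": 0.9, EPS: 0.1},  # o
    {"z": 0.9, EPS: 0.1},   # s
]

# Invented phone-bigram weights; pairs not listed are treated as neutral (1.0).
BIGRAM = {("ih", "ow"): 0.05, ("p", "p"): 0.01}


def greedy(local):
    """Context-independent output: take each letter's best label on its own."""
    picks = [max(dist, key=dist.get) for dist in local]
    return [p for p in picks if p != EPS]


def viterbi(local, bigram):
    """Best phone sequence under local scores times phone-bigram weights.

    The search state is the last phone emitted so far; EPS keeps the state.
    """
    best = {"<s>": (0.0, [])}  # state -> (log score, phones so far)
    for dist in local:
        nxt = {}
        for prev, (score, seq) in best.items():
            for phone, prob in dist.items():
                if phone == EPS:
                    state, s, out = prev, score + math.log(prob), seq
                else:
                    weight = bigram.get((prev, phone), 1.0)
                    state = phone
                    s = score + math.log(prob) + math.log(weight)
                    out = seq + [phone]
                if state not in nxt or s > nxt[state][0]:
                    nxt[state] = (s, out)
        best = nxt
    return max(best.values(), key=lambda item: item[0])[1]


print(greedy(LOCAL))           # ['hh', 'ih', 'ow', 'z']
print(viterbi(LOCAL, BIGRAM))  # ['hh', 'ih', 'p', 'ow', 'z']
```

With these made-up numbers the greedy pass reproduces the reported output, while the Viterbi pass keeps exactly one p because the unlikely "ih ow" and "p p" transitions are penalized globally.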
The short answer is to add the failing words to the lexicon itself. You can do this with:
(lex.add.entry '("hippos" nil (((hh ih) 1) ((p ow z) 0))))
You'll see a whole bunch of these additions in the file festival/lib/dicts/cmu/cmulex.scm
Note that you have to call lex.add.entry after you've selected the lexicon/voice.
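If you collect several corrections, a small script can generate the entries for you. The sketch below is Python, and the helper names `format_entry` and `format_add_entry` are hypothetical; it assumes only the entry layout shown in this thread (word, `nil` part of speech, then a list of syllables with a stress digit each) and does not talk to Festival itself.

```python
def format_entry(word, syllables, pos="nil"):
    """Build a lexicon entry string in the layout used in this thread.

    syllables is a list of (phones, stress) pairs,
    e.g. [(["hh", "ih"], 1), (["p", "ow", "z"], 0)].
    """
    sylls = " ".join(
        "((%s) %d)" % (" ".join(phones), stress) for phones, stress in syllables
    )
    return '("%s" %s (%s))' % (word, pos, sylls)


def format_add_entry(word, syllables, pos="nil"):
    """Wrap the entry in a lex.add.entry call to paste in after voice selection."""
    return "(lex.add.entry '%s)" % format_entry(word, syllables, pos)


print(format_add_entry("hippos", [(["hh", "ih"], 1), (["p", "ow", "z"], 0)]))
print(format_entry(
    "annualize",
    [(["ae"], 1), (["n", "y", "uw"], 0), (["w", "ax"], 0), (["l", "ay", "z"], 1)],
))
```

The first line printed matches the `lex.add.entry` call above, and the second matches the bare "annualize" entry suggested earlier for the extensions file.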
Alan
wget http://www.festvox.org/packed/festival/2.5/festlex_CMU.tar.gz
You can find this link inside the script default_voices.sh in festival/src/scripts. (Thanks Maud :) )
@awbcmu I was wondering... Could you enable GitHub Discussions for this project? (https://github.com/features/discussions)
(Please let me know if there's a better forum to report pronunciation issues)
I noticed that in the most recent version of Festival, there's a very odd phonemization for the word "hippos", using language en-us:
['hh', 'ih'], ['ow', 'z']
The p's have gone missing... "hippo" singular works fine, oddly.