NeuralVox / OpenPhonemizer

An espeak-compatible, permissively-licensed IPA phonemizer (G2P) based on DeepPhonemizer. Usable as a drop-in replacement for espeak's GPL phonemizer.
https://neuralvox.github.io/OpenPhonemizer
BSD 3-Clause Clear License

Incorrect handling of some proper names #6

geneing closed this issue 4 months ago

geneing commented 6 months ago

For a test phrase:

I visited Bologna and ate a bologna sandwich.

Phonemizer correctly handles Bologna vs bologna:

aɪ vˈɪzɪɾᵻd bəlˈoʊnʲə ænd ˈeɪt ɐ bəlˈoʊni sˈændwɪtʃ .

OpenPhonemizer:

ˈaɪ vˈɪzɪɾᵻd bəlˈoʊni ænd ˈeɪt ɐ bəlˈoʊni sˈændwɪtʃ .
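To make the difference easier to spot, here is a minimal sketch that compares the two transcriptions above token by token. The strings are copied verbatim from this report; the helper name `token_diff` is made up for illustration and is not part of either library.

```python
# Outputs copied verbatim from the report above.
reference = "aɪ vˈɪzɪɾᵻd bəlˈoʊnʲə ænd ˈeɪt ɐ bəlˈoʊni sˈændwɪtʃ ."   # Phonemizer
candidate = "ˈaɪ vˈɪzɪɾᵻd bəlˈoʊni ænd ˈeɪt ɐ bəlˈoʊni sˈændwɪtʃ ."   # OpenPhonemizer
words = "I visited Bologna and ate a bologna sandwich .".split()


def token_diff(ref, cand):
    """Return (index, ref_token, cand_token) for each position where the tokens differ."""
    ref_tokens, cand_tokens = ref.split(), cand.split()
    if len(ref_tokens) != len(cand_tokens):
        # A positional comparison only makes sense when both sides align one-to-one.
        raise ValueError("token counts differ; a positional diff is not meaningful")
    return [
        (i, r, c)
        for i, (r, c) in enumerate(zip(ref_tokens, cand_tokens))
        if r != c
    ]


mismatches = token_diff(reference, candidate)
for i, r, c in mismatches:
    print(f"{words[i]!r}: {r} vs {c}")
```

Besides the capitalized "Bologna", the comparison also flags the first token, where only the stress mark on "I" differs.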

fakerybakery commented 6 months ago

Hi, unfortunately DeepPhonemizer doesn’t use a rule-based system; it trains a model on many words. That makes this difficult to fix other than by adding the word to the training dataset. I’m planning to train a multilingual model soon, and I’ll add the word then.
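As a rough sketch of what adding the word could involve, the snippet below builds two case-sensitive entries and checks for words that differ only by case yet need different phonemes. It assumes a (language, word, phonemes) tuple layout for training data and an `en_us` language code. The helper `case_collisions` is hypothetical, and the target transcriptions are taken from the Phonemizer output in this report. If the training pipeline lowercases its input, such pairs would collapse into one conflicting example, so the check is worth running before retraining.

```python
from collections import defaultdict

# Hypothetical extra training entries: (language, word, phonemes).
# The tuple layout and the "en_us" code are assumptions for illustration;
# the phonemes are the Phonemizer transcriptions quoted in this issue.
extra_entries = [
    ("en_us", "Bologna", "bəlˈoʊnʲə"),
    ("en_us", "bologna", "bəlˈoʊni"),
]


def case_collisions(dataset):
    """Find words that differ only by case but map to different phonemes."""
    by_key = defaultdict(set)
    for lang, word, phonemes in dataset:
        by_key[(lang, word.lower())].add((word, phonemes))
    return {
        key: sorted(variants)
        for key, variants in by_key.items()
        if len({phonemes for _, phonemes in variants}) > 1
    }


collisions = case_collisions(extra_entries)
for (lang, lowered), variants in collisions.items():
    print(lang, lowered, variants)
```

Any key reported here marks a word the model can only get right if it sees the original casing at both training and inference time.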