hadware / voxpopuli

Python wrapper for Espeak and Mbrola, for simple local TTS
MIT License

Can't generate English phonemes from some words containing letter N #5

Closed klvbdmh closed 7 years ago

klvbdmh commented 7 years ago

Code that reproduces the issue

from voxpopuli import Voice

voice = Voice(lang="en")
print(voice.to_phonems("second").phonemes_str)

Expected behavior

The phonemes list is printed

Observed behavior

IndexError is raised:

Traceback (most recent call last):
  File "D:/Dev/voicesynth/speakeasy.py", line 6, in <module>
    print(voice.to_phonems("second").phonemes_str)
  File "D:\Dev\voicesynth\voxpopuli\main.py", line 187, in to_phonems
    return self._str_to_phonems(quote(text))
  File "D:\Dev\voicesynth\voxpopuli\main.py", line 156, in _str_to_phonems
    .decode("utf-8")
  File "D:\Dev\voicesynth\voxpopuli\phonems.py", line 40, in __init__
    super().__init__([Phonem.from_str(pho_str) for pho_str in pho_str_list.split("\n") if pho_str])
  File "D:\Dev\voicesynth\voxpopuli\phonems.py", line 40, in <listcomp>
    super().__init__([Phonem.from_str(pho_str) for pho_str in pho_str_list.split("\n") if pho_str])
  File "D:\Dev\voicesynth\voxpopuli\phonems.py", line 27, in from_str
    name = split_pho.pop(0)  # type:str
IndexError: pop from empty list

Comments

Interestingly, it works when I change language to French or when I try the word corner in English.

klvbdmh commented 7 years ago

BTW, shouldn't it be to_phonemes instead of to_phonems? The English singular is a phoneme.

hadware commented 7 years ago

I'll investigate the bug, but you're right about the "phonemes" spelling (https://en.wiktionary.org/wiki/phoneme#Noun). I'll change it right away.

Thanks for the level of professionalism of your bug reports, by the way. I really appreciate it!

klvbdmh commented 7 years ago

Hah, thanks! I picked it up from Joel on Software.

Thanks for fixing the spelling. Only one small correction is left (see #6).

hadware commented 7 years ago

Merged. Regarding the original bug, could you print out the string of phonemes returned by espeak, before it's "parsed" by the PhonemList class? I might have an idea.

klvbdmh commented 7 years ago

Here:

s       107
e       53       0 105 80 81 100 81
k       110
@       37       0 84 80 78 100 78
n       91      100 73

d       70
_       350
_       1

The blank line is causing the problems.

klvbdmh commented 7 years ago

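The problem is once again Windows-specific. Line breaks on Windows are CRLF, so splitting the output on \n leaves a trailing \r on every line. The "blank" line is therefore not empty but a lone \r, which slips past the `if pho_str` check and then yields no fields to parse. Stripping each line before parsing fixes the problem.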
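
For anyone hitting the same thing, here is a minimal sketch of the failure and one possible fix. It is not voxpopuli's actual code: `parse_phoneme_line`, `parse_naive`, and `parse_robust` are hypothetical names, the sample string is the espeak output quoted above with CRLF line endings added, and the tab separators are an assumption (the parser splits on any whitespace, so that does not affect the result).

```python
# Hypothetical stand-in for the parsing step; not the library's real implementation.
# Sample: the espeak output quoted in this thread, with Windows (CRLF) line endings.
RAW_WINDOWS_OUTPUT = (
    "s\t107\r\n"
    "e\t53\t0 105 80 81 100 81\r\n"
    "k\t110\r\n"
    "@\t37\t0 84 80 78 100 78\r\n"
    "n\t91\t100 73\r\n"
    "\r\n"
    "d\t70\r\n"
    "_\t350\r\n"
    "_\t1\r\n"
)


def parse_phoneme_line(line):
    """Split one line into (name, duration, pitch modifiers)."""
    fields = line.split()
    name = fields.pop(0)  # raises IndexError if the line held only whitespace
    duration = int(fields.pop(0))
    pitch_mods = [int(value) for value in fields]
    return name, duration, pitch_mods


def parse_naive(raw):
    # Same filter as in the traceback: a lone "\r" is truthy, so it is not skipped.
    return [parse_phoneme_line(line) for line in raw.split("\n") if line]


def parse_robust(raw):
    # splitlines() handles both LF and CRLF; strip() rejects whitespace-only lines.
    return [parse_phoneme_line(line) for line in raw.splitlines() if line.strip()]


if __name__ == "__main__":
    try:
        parse_naive(RAW_WINDOWS_OUTPUT)
    except IndexError as error:
        print("naive parser failed:", error)
    for phoneme in parse_robust(RAW_WINDOWS_OUTPUT):
        print(phoneme)
```

Using `splitlines()` rather than `split("\n")` keeps the parser independent of the platform's line endings, and filtering on `line.strip()` also covers lines that contain only spaces or tabs.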