When phonemizing a text which has more than 100k utterances, it always gives a "RuntimeError"

bootphon / phonemizer

Simple text to phones converter for multiple languages

https://bootphon.github.io/phonemizer/

GNU General Public License v3.0

When phonemizing a text which has more than 100k utterances, it always gives a "RuntimeError" #94

Closed marianasignal closed 2 years ago

marianasignal commented 2 years ago

Describe the bug

When phonemizing a text which has more than 100k utterances, it always raises a "RuntimeError" at around 900 utterances. The error messages include "espeak not installed on your system", "failed to find espeak library" and "invalid voice code 'cmn'".

Phonemizer version

phonemizer-3.0
available backends: espeak-ng-1.49.2, espeak-mbrola, festival-2.5.0, segments-2.2.0

System

cat /proc/version: Linux version 4.15.0-106-generic (buildd@lcy01-amd64-016) (gcc version 7.5.0 (Ubuntu 7.5.0-3ubuntu1~18.04))

python: Python 3.9.1 (default, Dec 11 2020, 14:32:07) [GCC 7.3.0] :: Anaconda, Inc. on linux

To reproduce

from phonemizer import phonemize
from phonemizer.separator import Separator

txtdict = txt2dict(text_path)

with open(scp_path) as f:
    for line in f.readlines():
        txt = txtdict.get(line[0])
        # phonemize() is called once per utterance
        phone = phonemize(txt, backend='espeak', language='cmn',
                          separator=Separator(word='/', phone=' ', syllable="-"))
        rows.append([wav, new_wav, txt, phone, new_phone])

Expected behavior

mmmaat commented 2 years ago

Does it work better if you change your code sample to the following?

from phonemizer.backend import EspeakBackend
from phonemizer.separator import Separator
backend = EspeakBackend('cmn')
separator = Separator(word='/', phone=' ', syllable="-")
with open(scp_path) as f:
    for line in f.readlines():
        txt = txtdict.get(line[0])
        phone = backend.phonemize(txt, separator=separator)
        rows.append([wav, new_wav, txt, phone, new_phone])

marianasignal commented 2 years ago

It works, thanks for your suggestion!

marianasignal commented 2 years ago

When I try your suggestion, I get a different result. I want to get the first result. How can I get it by modifying the configuration?

marianasignal commented 2 years ago

Maybe something went wrong?

mmmaat commented 2 years ago

OK, I see: in backend.phonemize the text must be a list of str, not a str. I modified the code so that it now raises an exception in that case (see https://github.com/bootphon/phonemizer/commit/db89aafa7ca4f590630480c825c85a5bde78e57b). Just call backend.phonemize([text], separator=separator) and it should work.
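
For reference, the fix described in this thread can be sketched as below: create the backend once, pass each utterance as a one-element list, and take the single result back out. This is only a sketch under assumptions. The helper names `phonemize_utterances` and `make_espeak_backend` are hypothetical and are not part of phonemizer. It also assumes every utterance is non-empty, so that the returned list has exactly one entry.

```python
from typing import List, Sequence


def phonemize_utterances(backend, texts: Sequence[str], separator) -> List[str]:
    """Phonemize each utterance with one shared backend instance.

    `backend` is any object exposing phonemize(list_of_str, separator=...),
    for example phonemizer's EspeakBackend. Hypothetical helper.
    """
    if isinstance(texts, str):
        # A bare str would be iterated character by character, so reject it.
        raise TypeError("texts must be a list of str, not a str")
    results = []
    for text in texts:
        # Wrap the utterance in a list, then unwrap the single result.
        phones = backend.phonemize([text], separator=separator)
        results.append(phones[0])
    return results


def make_espeak_backend():
    """Build the backend and separator used in this thread (needs espeak)."""
    # Imported lazily so the helper above can be used without espeak installed.
    from phonemizer.backend import EspeakBackend
    from phonemizer.separator import Separator

    backend = EspeakBackend('cmn')  # created once, reused for every utterance
    separator = Separator(word='/', phone=' ', syllable='-')
    return backend, separator
```

With espeak installed, usage would look like `backend, separator = make_espeak_backend()` followed by `phonemize_utterances(backend, list_of_texts, separator)`, where `list_of_texts` stands for your own list of utterances.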

marianasignal commented 2 years ago

That's great.