bootphon / phonemizer

Simple text to phones converter for multiple languages
https://bootphon.github.io/phonemizer/
GNU General Public License v3.0

Can't use multiple EspeakBackend objects with njobs=1 #116

Open eeishaan opened 2 years ago

Describe the bug

Instantiating multiple EspeakBackend objects does not appear to be handled correctly: every object starts operating with the language of the most recently instantiated one. See the example below.

Phonemizer version 3.0.1

System macOS 11.6.4, Python 3.8.9 [Clang 13.0.0 (clang-1300.0.29.30)] on darwin

To reproduce

from phonemizer.backend import EspeakBackend

en_backend = EspeakBackend(
    "en-us",
    preserve_punctuation=True,
    with_stress=True,
    language_switch="remove-flags",
    words_mismatch="ignore",
)
en_sentence = ["I love to eat pizza everyday"]
print(en_backend.phonemize(en_sentence, njobs=1, strip=True)) # ['aɪ lˈʌv tʊ ˈiːt pˈiːtsə ˈɛvɹɪdˌeɪ']

de_backend = EspeakBackend(
    "de",
    preserve_punctuation=True,
    with_stress=True,
    language_switch="remove-flags",
    words_mismatch="ignore",
)
de_sentence = ["ich esse jeden tag gerne pizza."]
print(de_backend.phonemize(de_sentence, njobs=1, strip=True)) # ['ɪç ˈɛsə jˈeːdən tˈɑːk ɡˈɛɾnə pˈɪtsɑː.']

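# Once de_backend exists, phonemize the English sentence with both backends.
incorrect_en = en_backend.phonemize(en_sentence, njobs=1, strip=True)
en_with_de = de_backend.phonemize(en_sentence, njobs=1, strip=True)

# This assertion passes, which shows the bug: en_backend now behaves like de_backend.
assert en_with_de == incorrect_en
print(incorrect_en, en_with_de)
# ['ˈiː lˈoːvə tˈoː eːˈɑːt pˈɪtsɑː ˈeːveːrˌyːdɛɪ']
# ['ˈiː lˈoːvə tˈoː eːˈɑːt pˈɪtsɑː ˈeːveːrˌyːdɛɪ']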

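Expected behavior

en_backend should keep producing the en-us phonemization after de_backend is created. Instead, incorrect_en is equal to en_with_de and differs from the English output printed at the start of the example.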

Additional context

The problem occurs only with njobs=1; it does not appear with njobs>1.
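
To compare the njobs=1 and njobs>1 cases more systematically, the check from the reproduction can be wrapped in a small helper. This is only a sketch: `output_is_stable` is a hypothetical name, not part of phonemizer, and it assumes nothing beyond the `phonemize(text, njobs=..., strip=...)` call used in the example above. The second backend is passed as a factory so that it is created only after the first backend has produced its baseline output, as in the report.

```python
def output_is_stable(first_backend, make_second_backend,
                     first_sentences, second_sentences, njobs=1):
    """Return True if first_backend gives the same output before and
    after a second backend has been created and used.

    Hypothetical helper for illustration; the backends only need a
    phonemize(text, njobs=..., strip=...) method.
    """
    # Baseline output, taken before the second backend exists.
    before = first_backend.phonemize(first_sentences, njobs=njobs, strip=True)

    # Create and use the second backend, as in the reproduction.
    second_backend = make_second_backend()
    second_backend.phonemize(second_sentences, njobs=njobs, strip=True)

    # The first backend should be unaffected by the second one.
    after = first_backend.phonemize(first_sentences, njobs=njobs, strip=True)
    return before == after


# Possible usage with the backends from the report
# (requires phonemizer and espeak to be installed):
#
# from phonemizer.backend import EspeakBackend
# en_backend = EspeakBackend("en-us")
# for njobs in (1, 2):
#     print(njobs, output_is_stable(
#         en_backend,
#         lambda: EspeakBackend("de"),
#         ["I love to eat pizza everyday"],
#         ["ich esse jeden tag gerne pizza."],
#         njobs=njobs,
#     ))
```

If the behavior described in this report holds, the helper would be expected to return False for njobs=1 and True for njobs>1.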