Open reubenadams opened 3 weeks ago
Thanks for the PR. As you can see, this is breaking some tests (in fact it completely hides the mbrola voices and breaks the mbrola backend on Mac and Linux).
Can you please try the following and let me know whether the tests pass (just run pytest from the root phonemizer directory) and whether Japanese works as expected?
- Revert your changes to EspeakWrapper.available_voices
- Replace EspeakBackend.supported_languages
by the following:

    @classmethod
    def supported_languages(cls):
        return {
            voice.language: voice.name
            for voice in EspeakWrapper().available_voices()
            # ignore mbrola voices causing a bug on windows (see #146)
            if 'mb/' not in voice.identifier}

For me (on Linux) all the tests pass, and it should fix the bug on Windows by ignoring mbrola voices when using the espeak backend.
I've done as you requested. I get three errors, all because I don't have festival installed:
I have not installed festival because I found the installation page linked from the documentation confusing: http://www.festvox.org/docs/manual-2.4.0/festival_6.html#Installation.
Unfortunately the error for Japanese has returned: RuntimeError: failed to load voice "ja"
I thought this might be because you have a forward slash in your suggested addition if 'mb/' not in voice.identifier, but changing it to a backslash (or omitting the slash) does not resolve the issue. When I run it in debug mode I see that voice_code = 'ja' as expected, but voice_name = 'mb\mb-jp1', which is weird because I thought we had excluded the mbrola voices. Note this is the case even if I reverse or omit the slash.
I'm afraid I don't know what to try next. Any ideas?
Okay, I think I may have figured out why your solution didn't fix RuntimeError: failed to load voice "ja". The traceback points to self._espeak.set_voice(language) in the EspeakBackend. Now EspeakBackend and EspeakMbrolaBackend both inherit from BaseEspeakBackend, the init method of which sets self._espeak = EspeakWrapper(). But the set_voice method of EspeakWrapper is not sensitive to whether the backend is an EspeakBackend or an EspeakMbrolaBackend:

    def set_voice(self, voice_code):
        """Setup the voice to use for phonemization

        Parameters
        ----------
        voice_code (str) : Must be a valid language code that is actually
            supported by espeak

        Raises
        ------
        RuntimeError if the required voice cannot be initialized

        """
        if 'mb' in voice_code:
            # this is an mbrola voice code. Select the voice by using
            # identifier in the format 'mb/{voice_code}'
            available = {
                voice.identifier[3:]: voice.identifier
                for voice in self.available_voices('mbrola')}
        else:
            # this are espeak voices. Select the voice using it's attached
            # language code. Consider only the first voice of a given code as
            # they are sorted by relevancy
            available = {}
            for voice in self.available_voices():
                if voice.language not in available:
                    available[voice.language] = voice.identifier

        try:
            voice_name = available[voice_code]
        except KeyError:
            raise RuntimeError(f'invalid voice code "{voice_code}"') from None

        if self._espeak.set_voice_by_name(voice_name.encode('utf8')) != 0:
            raise RuntimeError(  # pragma: nocover
                f'failed to load voice "{voice_code}"')

        voice = self._get_voice()
        if not voice:  # pragma: nocover
            raise RuntimeError(f'failed to load voice "{voice_code}"')
        self._voice = voice

So when I run

    import phonemizer
    print(phonemizer.phonemize("ほたる", language="ja", backend="espeak"))

it goes into the else block and picks out the first voice in self.available_voices() for each language, which for language="ja" is 'ja': 'mb\\mb-jp1'. This then triggers

    if self._espeak.set_voice_by_name(voice_name.encode('utf8')) != 0:
        raise RuntimeError(  # pragma: nocover
            f'failed to load voice "{voice_code}"')

in the set_voice method above.
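
To make the selection behaviour concrete, here is a minimal sketch that mirrors only the else branch of set_voice, outside of phonemizer. The Voice tuple, the voice names, and the 'jpx\\ja' identifier are hypothetical stand-ins for what available_voices() might return on a Windows machine; only the 'mb\\mb-jp1' identifier comes from the debugging session described above.

```python
from collections import namedtuple

# Hypothetical stand-in for the voice records returned by
# EspeakWrapper.available_voices(); only the fields used here are modelled.
Voice = namedtuple("Voice", ["name", "language", "identifier"])

# Assumed ordering: the mbrola voice for Japanese is listed before
# the plain espeak voice (names and the 'jpx' identifier are illustrative).
voices = [
    Voice("japanese-mbrola-1", "ja", "mb\\mb-jp1"),
    Voice("Japanese", "ja", "jpx\\ja"),
]


def first_voice_per_language(voices):
    """Keep only the first identifier seen for each language code."""
    available = {}
    for voice in voices:
        if voice.language not in available:
            available[voice.language] = voice.identifier
    return available


available = first_voice_per_language(voices)
print(available["ja"])  # the mbrola identifier wins because it comes first
```

Under these assumptions the lookup for 'ja' returns the mbrola identifier, which is consistent with the failure in set_voice_by_name when mbrola is not installed.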
I can think of three approaches to solving this:

1. In EspeakBackend and EspeakMbrolaBackend, pass the name of the backend (mbrola or espeak) to the parent class so the set_voice method above knows to exclude mbrola voices if it was passed backend=espeak. I could do this, but I suspect that on Linux EspeakBackend should collect mbrola voices if mbrola is installed? I'm a bit confused about the relationship between mbrola and espeak voices; are mbrola voices a subset of espeak voices?
2. In the else block, exclude mbrola voices if mbrola is not installed. I'm afraid I don't know how to do this.
3. Change the available_voices method of the EspeakWrapper, perhaps checking there whether mbrola is installed.

If the first approach makes sense then I can do it, but I suspect it's still on the wrong track. Otherwise I think I will have to leave this PR as I'm not confident I can finish it without help!
Summary: Implements the fix suggested in issue #146 by skipping voices starting with 'mb' in the EspeakWrapper.
As far as I understand it, the phonemizer uses the first voice that matches the IETF language tag, which for e.g. Japanese is an MBROLA voice. For a Windows user who cannot install MBROLA and therefore does not have the espeak-mbrola backend, this leads to the error RuntimeError: failed to load voice "ja".
The fix in issue #146 suggests simply skipping any voices starting with 'mb' in the available_voices method of the EspeakWrapper class, which is what I've done.

This is my first PR, so I've probably not implemented this in the ideal way. Sorry about that!