hakan-demirli closed this issue 2 years ago
Hi, a very quick answer. To keep the ˈ (this is the stress mark, i.e. phone accentuation) in the phonemizer output, use the with_stress=True parameter.

For the o <-> ˈə stuff I really don't know... Are you sure you are using the same library on both sides?
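As a minimal sketch of that suggestion, the helper below passes with_stress=True to phonemizer's phonemize function. The helper name phonemize_with_stress, the en-us language choice, and strip=True are assumptions for illustration and do not come from the thread; the import is done lazily so that defining the helper does not itself require phonemizer or espeak-ng to be installed.

```python
def phonemize_with_stress(text: str, language: str = "en-us") -> str:
    """Phonemize `text` with the espeak backend, keeping stress marks.

    Hypothetical helper: the name, the default language and strip=True
    are illustrative assumptions, not taken from the thread.
    """
    # Lazy import: phonemizer (and an espeak-ng install) is only needed
    # when the function is actually called.
    from phonemizer import phonemize

    return phonemize(
        text,
        language=language,
        backend="espeak",
        with_stress=True,  # keep the stress marks in the output
        strip=True,        # drop the trailing separator
    )

# Usage (requires phonemizer and espeak-ng):
# print(phonemize_with_stress("hello world"))
```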
I am sure the libraries are the same. The C++ build uses a CMakeLists file with an absolute path to the *.so files. I followed phonemizer in the debugger up to this point: https://github.com/bootphon/phonemizer/blob/8cbdb5ade3dfaf931c63ecb3c6abf38a5b335d2f/phonemizer/backend/espeak/api.py#L223 and verified the library dependency by removing all espeak libraries and running the script again.
Thank you for explaining the accentuation. Now the outputs are looking quite close, and my project is working as expected. The o <-> ə difference is not critical for me since this is a hobby project.
I have been using Phonemizer in a hobby project that I am currently trying to port to C++. I have read the code, and apart from input sanitization and basic pre/post-processing, there are only Espeak-ng system library bindings that access the same files the C++ code uses. So I can't see a reason why the outputs are different.
Phonemizer: həloʊ wɜːld
Espeak-ng: həlˈəʊ wˈɜːld
I don't know much about phonetics, but the only differences I see are o <-> ˈə and ɜ <-> ˈɜ. Can I just map all those differences and call it a day?

Phonemizer: Latest version
Espeak-ng: Latest version
OS: Pop!_OS 22.04 LTS
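Before mapping anything by hand, it may help to see which differences are left once the stress marks are taken out. The sketch below uses only the Python standard library and the two output strings quoted in this issue; the helper names strip_stress and remaining_differences are hypothetical.

```python
from difflib import SequenceMatcher

# IPA primary (ˈ) and secondary (ˌ) stress marks.
STRESS_MARKS = "ˈˌ"


def strip_stress(ipa: str) -> str:
    """Remove stress marks from an IPA string."""
    return ipa.translate({ord(mark): None for mark in STRESS_MARKS})


def remaining_differences(a: str, b: str) -> list[tuple[str, str]]:
    """Return (a_chunk, b_chunk) pairs for every span where a and b differ."""
    matcher = SequenceMatcher(None, a, b, autojunk=False)
    return [
        (a[i1:i2], b[j1:j2])
        for tag, i1, i2, j1, j2 in matcher.get_opcodes()
        if tag != "equal"
    ]


phonemizer_out = "həloʊ wɜːld"   # Phonemizer output quoted in this issue
espeak_out = "həlˈəʊ wˈɜːld"     # Espeak-ng output quoted in this issue

stripped = strip_stress(espeak_out)
print(stripped)                                         # hələʊ wɜːld
print(remaining_differences(phonemizer_out, stripped))  # [('o', 'ə')]
```

With the stress marks removed, only the o <-> ə difference is left. One possible explanation, not confirmed in this thread, is that the two sides select different espeak-ng voices (for example en-us versus en-gb), so checking the voice may be worth it before hard-coding a mapping.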
Python Code:
CPP code: