bootphon / phonemizer

Simple text to phones converter for multiple languages
https://bootphon.github.io/phonemizer/
GNU General Public License v3.0

The punctuation mark is not separated correctly when using space as the phone separator in v0.3.1 #104

Closed kan-bayashi closed 1 year ago

kan-bayashi commented 2 years ago

Thank you for developing a great tool. It is very useful for developing our speech processing toolkit. I found unexpected behavior in v0.3.1.

Describe the bug

The punctuation mark is not separated correctly when using space as the phone separator.

Phonemizer version

System

To reproduce

from phonemizer import phonemize
from phonemizer.separator import Separator

text = "Hello, world."
separator = Separator(
    word=None,
    syllable=None,
    phone=" ",
)
phn = phonemize(text, separator=separator, preserve_punctuation=True)
print(phn)

Expected behavior

In phonemizer==0.3 (expected)

$ python test_phonemizer.py
h ə l oʊ , w ɜː l d .

In phonemizer==0.3.1 (seems wrong)

$ python test_phonemizer.py
h ə l oʊ, w ɜː l d.

mmmaat commented 1 year ago

In phonemizer-3.2.1 this has been fixed by #119. The result is now "h ə l oʊ ,w ɜː l d .", which is what is expected.
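The three outputs quoted in this thread place the spaces around punctuation differently. For downstream code that needs every punctuation mark as its own space-separated token regardless of the installed version, a post-processing step along the following lines might help. This is only a sketch: `split_punctuation` and `PUNCTUATION` are hypothetical names, not part of phonemizer, and the set of marks is an assumption.

```python
import re

# Hypothetical helper, not part of phonemizer.
# Assumed set of marks to isolate; ':' is left out on purpose so that an
# ASCII colon typed as a length mark is never split from its vowel.
PUNCTUATION = ",.;!?"

_PUNCT_RE = re.compile(r"\s*([" + re.escape(PUNCTUATION) + r"])\s*")


def split_punctuation(phonemized: str) -> str:
    """Return the string with each punctuation mark as its own token."""
    # Surround every mark with single spaces, swallowing any spaces
    # that were already next to it.
    spaced = _PUNCT_RE.sub(r" \1 ", phonemized)
    # Collapse the repeated spaces created above and trim both ends.
    return " ".join(spaced.split())


if __name__ == "__main__":
    for raw in ("h ə l oʊ, w ɜː l d.", "h ə l oʊ ,w ɜː l d ."):
        print(split_punctuation(raw))
```

The helper only rearranges spaces, so it is suitable when the word separator is None as in the reproduction script; with a visible word separator the pattern would need to be adapted.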