bootphon / phonemizer

Simple text to phones converter for multiple languages
https://bootphon.github.io/phonemizer/
GNU General Public License v3.0

--preserve-punctuation doesn't preserve parentheses #107

EtienneAb3d closed 1 year ago

EtienneAb3d commented 2 years ago

Describe the bug

Commas and some other punctuation marks are preserved, but parentheses, brackets, and braces are not.

Phonemizer version

The output of phonemize --version from the command line, very helpful!

phonemizer-3.0.1
available backends: espeak-ng-1.50, espeak-mbrola, segments-2.2.0
uninstalled backends: festival

System

Ubuntu 21.10, Python 3.9.7

To reproduce

echo "I would like a (big) steack, in a [large] hamburger {yes}!" | phonemize -l en-gb --preserve-punctuation

aɪ wʊd laɪk ɐ bɪɡ stiːk, ɪn ɐ lɑːdʒ hambɜːɡə jɛs!

Expected behavior

Parentheses, brackets, braces, and other kinds of punctuation should be preserved in the output.

hadware commented 2 years ago

@mmmaat & @jncasey do you think we should add these punctuation marks, (){}, to the _DEFAULT_MARKS constant?

jncasey commented 2 years ago

Before the feature to define punctuation with a regex, I was typically using ;:,.!?¡¿—…"“”-()‘’*[], so it would make sense to me to add (){}[] to the defaults.

mmmaat commented 2 years ago

Sure, I agree, why not!
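
For readers following the thread, here is a rough, standard-library-only sketch of why the set of marks matters for the sentence in the report. It is not phonemizer's implementation: ASSUMED_DEFAULT_MARKS is an abbreviated stand-in rather than the library's actual _DEFAULT_MARKS value, and preserved_marks is a hypothetical helper that only lists which characters a given mark set would recognize as punctuation.

```python
import re

# Assumption: an abbreviated stand-in for the defaults, NOT copied from phonemizer.
ASSUMED_DEFAULT_MARKS = ';:,.!?'
# The proposal in this thread: also treat (), {}, and [] as punctuation.
EXTENDED_MARKS = ASSUMED_DEFAULT_MARKS + '(){}[]'


def preserved_marks(text, marks):
    """Return the characters of `text` that belong to `marks`, in order.

    Hypothetical helper: it builds a character class from `marks`
    (escaped so that ']' or '-' cannot break the class) and collects matches.
    """
    pattern = '[' + re.escape(marks) + ']'
    return re.findall(pattern, text)


# The sentence from the bug report, spelling kept as in the original command.
SENTENCE = 'I would like a (big) steack, in a [large] hamburger {yes}!'

print(preserved_marks(SENTENCE, ASSUMED_DEFAULT_MARKS))  # [',', '!']
print(preserved_marks(SENTENCE, EXTENDED_MARKS))
# ['(', ')', ',', '[', ']', '{', '}', '!']
```

With the shorter set, only the comma and the exclamation mark count as punctuation, which matches the output shown in the report. Once the bracket characters are added, all eight marks are recognized.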