bootphon / phonemizer

Simple text to phones converter for multiple languages
https://bootphon.github.io/phonemizer/
GNU General Public License v3.0

Inconsistent punctuation processing #26

Closed by mmmaat 4 years ago

mmmaat commented 4 years ago

When a period separates two sentences in a single line of text, the output is a single utterance. But when the period is replaced by a comma, a semicolon, or a question mark, the output is two utterances. See below:

$ echo 'a comma a point.' | phonemize
ɐ kɑːmə ɐ pɔɪnt 
$ echo 'a comma. a point.' | phonemize
ɐ kɑːmə ɐ pɔɪnt 
$ echo 'a comma; a point.' | phonemize
ɐ kɑːmə 
ɐ pɔɪnt 
$ echo 'a comma, a point.' | phonemize
ɐ kɑːmə 
ɐ pɔɪnt 
$ echo 'a comma? a point!' | phonemize
ɐ kɑːmə 
ɐ pɔɪnt 

The expected behavior would be to ignore punctuation consistently, so that each of the lines above yields a single utterance.
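To make the expected behavior concrete, here is a minimal sketch in plain Python. It is not phonemizer's actual code: the `strip_punctuation` helper and the `PUNCTUATION` set are hypothetical names introduced for illustration only. It assumes punctuation is dropped from each line before phonemization, so that no mark can split the line into several utterances.

```python
import re

# Hypothetical set of marks to ignore; an assumption, not phonemizer's own list.
PUNCTUATION = ';:,.!?'

# Matches one or more consecutive punctuation marks.
_PUNCT_RE = re.compile('[' + re.escape(PUNCTUATION) + ']+')


def strip_punctuation(line: str) -> str:
    """Replace each run of punctuation with a space, then collapse whitespace."""
    return ' '.join(_PUNCT_RE.sub(' ', line).split())


if __name__ == '__main__':
    examples = [
        'a comma a point.',
        'a comma. a point.',
        'a comma; a point.',
        'a comma, a point.',
        'a comma? a point!',
    ]
    for text in examples:
        # Every example reduces to the same single line of text.
        print(strip_punctuation(text))
```

With this preprocessing, all five inputs from the report reduce to the same text, `a comma a point`, which would then be phonemized as one utterance.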