Kyubyong / g2p

g2p: English Grapheme To Phoneme Conversion
Apache License 2.0

Mid-word hyphens are removed, should be treated similar to spaces #31

Open Sobsz opened 2 years ago

Sobsz commented 2 years ago

For example, running G2P on "text-to-speech" returns ['T', 'EH1', 'K', 'S', 'T', 'S', 'P', 'EH2', 'K'], the same as for "texttospeech". It should instead return something closer to ['T', 'EH1', 'K', 'S', 'T', ' ', 'T', 'UW1', ' ', 'S', 'P', 'IY1', 'CH'], which is the result for "text to speech" (though the stress could use some adjustment).

Simple workaround for now: use .replace("-", " ") on the input being passed in.
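
A slightly narrower version of that workaround, as a rough sketch: it only replaces hyphens that sit between two word characters, so a standalone dash or a leading minus sign is left alone. The names split_hyphenated and phonemize are made up for this example and are not part of g2p; the g2p argument is assumed to be any callable that takes a string, such as an instance of the library's G2p class.

```python
import re

# Hypothetical helper, not part of the g2p library.
# Matches a hyphen only when it has a word character on both sides.
_MID_WORD_HYPHEN = re.compile(r"(?<=\w)-(?=\w)")


def split_hyphenated(text: str) -> str:
    """Replace mid-word hyphens with spaces; leave other hyphens untouched."""
    return _MID_WORD_HYPHEN.sub(" ", text)


def phonemize(text, g2p):
    """Preprocess the text, then hand it to any G2P callable (assumed interface)."""
    return g2p(split_hyphenated(text))
```

With the library installed, this would presumably be called as phonemize("text-to-speech", G2p()), which passes "text to speech" to the model. Note that \w also matches digits, so "5-10" becomes "5 10"; whether that is desirable depends on the input.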