bootphon / phonemizer

Simple text to phones converter for multiple languages
https://bootphon.github.io/phonemizer/
GNU General Public License v3.0

phonemize joins phonemes from text list when preserve_punctuation=True #128

Closed alexdemartos closed 2 years ago

alexdemartos commented 2 years ago

Describe the bug: EspeakBackend.phonemize joins the phonemes of every string in the input list into one string.

Phonemizer version: It happens with versions 3.1.1 and 3.2.0. I've tested 2.2.2 and it works as expected.

System: Ubuntu 20.04, Python 3.9

Expected behavior: It should return a list with one phonemized string per input string, instead of a single-element list containing the joined phonemes of all input strings.

Input: ['worked david ford i started in deloitte and i was immediately', 'an offer of price waterhouse cooper and here i take may', 'we are now as maximum plan for a customer time and', "they're going to meet all the xvin so great it", 'you fucked at first but you have to endure you have to', 'endure and promote you to manager and then you can go', "fifty k and you're the fucking master you just sent me a mercedes", 'for the renting of the company', 'today is friday so they vacated after work although then', 'the end touches me pringar hands it to me and a thousand aunts come super', "arranged here you go crazy or i'll de concentrate what", 'says a tailor who told me my boss mola the cut', 'what a thousand turkeys bru forgive that i have a tail now blah me']

WRONG Output (v3.1.1 and beyond): ['wˈɜːkt dˈeɪvɪd fˈoːɹd ˈaɪ stˈɑːɹɾᵻd ɪn dᵻlˈɔɪt ænd ˈaɪ wʌz ɪmˈiːdɪətli ɐn ˈɔfɚɹ ʌv pɹˈaɪs wˈɔːɾɚhˌaʊs kˈuːpɚ ænd hˈɪɹ ˈaɪ tˈeɪk mˈeɪ wiː ɑːɹ nˈaʊ æz mˈæksɪməm plˈæn fɚɹə kˈʌstəmɚ tˈaɪm ænd ðeɪɚ ɡˌoʊɪŋ tə mˈiːt ˈɔːl ðɪ ˈɛksvˈɪn sˌoʊ ɡɹˈeɪt ɪt juː fˈʌkt æt fˈɜːst bˌʌt juː hæv tʊ ɛndˈʊɹ juː hˈæv tuː ɛndˈʊɹ ænd pɹəmˈoʊt juː tə mˈænɪdʒɚ ænd ðˈɛn juː kæn ɡˈoʊ fˈɪfti kˈeɪ ænd jʊɹ ðə fˈʌkɪŋ mˈæstɚ juː dʒˈʌst sˈɛnt mˌiː ɐ mɜːsˈeɪdiːz fɚðə ɹˈɛntɪŋ ʌvðə kˈʌmpəni tədˈeɪ ɪz fɹˈaɪdeɪ sˌoʊ ðeɪ vˈeɪkeɪɾᵻd ˈæftɚ wˈɜːk ɔːlðˈoʊ ðˈɛn ðɪ ˈɛnd tˈʌtʃᵻz mˌiː pɹˈɪŋɡɚ hˈændz ɪt tə mˌiː ænd ɐ θˈaʊzənd ˈænts kˈʌm sˈuːpɚ ɚɹˈeɪndʒd hˈɪɹ juː ɡˌoʊ kɹˈeɪzi ɔːɹ aɪl də kˈɑːnsəntɹˌeɪt wˈʌt sˈɛz ɐ tˈeɪlɚ hˌuː tˈoʊld mˌiː maɪ bˈɔs mˈoʊlə ðə kˈʌt wˌʌt ɐ θˈaʊzənd tˈɜːkiz bɹˈuː fɚɡˈɪv ðæt ˈaɪ hæv ɐ tˈeɪl nˈaʊ blˈɑː mˌiː ']

CORRECT Output (v2.2.2): ['wˈɜːkt dˈeɪvɪd fˈoːɹd ˈaɪ stˈɑːɹɾᵻd ɪn dᵻlˈɔɪt ænd ˈaɪ wʌz ɪmˈiːdɪətli ', 'ɐn ˈɔfɚɹ ʌv pɹˈaɪs wˈɔːɾɚhˌaʊs kˈuːpɚ ænd hˈɪɹ ˈaɪ tˈeɪk mˈeɪ ', 'wiː ɑːɹ nˈaʊ æz mˈæksɪməm plˈæn fɚɹə kˈʌstəmɚ tˈaɪm ænd ', 'ðeɪɚ ɡˌoʊɪŋ tə mˈiːt ˈɔːl ðɪ ˈɛksvˈɪn sˌoʊ ɡɹˈeɪt ɪt ', 'juː fˈʌkt æt fˈɜːst bˌʌt juː hæv tʊ ɛndˈʊɹ juː hˈæv tuː ', 'ɛndˈʊɹ ænd pɹəmˈoʊt juː tə mˈænɪdʒɚ ænd ðˈɛn juː kæn ɡˈoʊ ', 'fˈɪfti kˈeɪ ænd jʊɹ ðə fˈʌkɪŋ mˈæstɚ juː dʒˈʌst sˈɛnt mˌiː ɐ mɜːsˈeɪdiːz ', 'fɚðə ɹˈɛntɪŋ ʌvðə kˈʌmpəni ', 'tədˈeɪ ɪz fɹˈaɪdeɪ sˌoʊ ðeɪ vˈeɪkeɪɾᵻd ˈæftɚ wˈɜːk ɔːlðˈoʊ ðˈɛn ', 'ðɪ ˈɛnd tˈʌtʃᵻz mˌiː pɹˈɪŋɡɚ hˈændz ɪt tə mˌiː ænd ɐ θˈaʊzənd ˈænts kˈʌm sˈuːpɚ ', 'ɚɹˈeɪndʒd hˈɪɹ juː ɡˌoʊ kɹˈeɪzi ɔːɹ aɪl də kˈɑːnsəntɹˌeɪt wˈʌt ', 'sˈɛz ɐ tˈeɪlɚ hˌuː tˈoʊld mˌiː maɪ bˈɔs mˈoʊlə ðə kˈʌt ', 'wˌʌt ɐ θˈaʊzənd tˈɜːkiz bɹˈuː fɚɡˈɪv ðæt ˈaɪ hæv ɐ tˈeɪl nˈaʊ blˈɑː mˌiː ']

mmmaat commented 2 years ago

Hi, I can't reproduce your bug with the 3.2.0 version:

#!/usr/bin/env python

from phonemizer import phonemize, __version__

assert __version__ == '3.2.0'

text = [
    'worked david ford i started in deloitte and i was immediately',
    'an offer of price waterhouse cooper and here i take may',
    'we are now as maximum plan for a customer time and',
    "they're going to meet all the xvin so great it"]

phn = phonemize(text, language='en-us', backend='espeak')

assert phn == [
    'wɜːkt deɪvɪd foːɹd aɪ stɑːɹɾᵻd ɪn dᵻlɔɪt ænd aɪ wʌz ɪmiːdɪətli ',
    'ɐn ɔfɚɹ ʌv pɹaɪs wɔːɾɚhaʊs kuːpɚ ænd hɪɹ aɪ teɪk meɪ ',
    'wiː ɑːɹ naʊ æz mæksɪməm plæn fɚɹə kʌstəmɚ taɪm ænd ',
    'ðeɪɚ ɡoʊɪŋ tə miːt ɔːl ðɪ ɛksvɪn soʊ ɡɹeɪt ɪt ']

Can you give us the exact command and options you used?

alexdemartos commented 2 years ago

#!/usr/bin/env python

from phonemizer import __version__
from phonemizer.backend import EspeakBackend

assert __version__ == '3.2.0'

text = [
    'worked david ford i started in deloitte and i was immediately',
    'an offer of price waterhouse cooper and here i take may',
    'we are now as maximum plan for a customer time and',
    "they're going to meet all the xvin so great it"]

eb = EspeakBackend(
    language='en-us',
    punctuation_marks=';:,.!?¡¿—…"«»“”()',
    preserve_punctuation=True,
    with_stress=True,
    language_switch='remove-flags')

phn = eb.phonemize(text)

# Buggy result: a single-element list with all four inputs joined.
assert phn == ['wˈɜːkt dˈeɪvɪd fˈoːɹd ˈaɪ stˈɑːɹɾᵻd ɪn dᵻlˈɔɪt ænd ˈaɪ wʌz ɪmˈiːdɪətli ɐn ˈɔfɚɹ ʌv pɹˈaɪs wˈɔːɾɚhˌaʊs kˈuːpɚ ænd hˈɪɹ ˈaɪ tˈeɪk mˈeɪ wiː ɑːɹ nˈaʊ æz mˈæksɪməm plˈæn fɚɹə kˈʌstəmɚ tˈaɪm ænd ðeɪɚ ɡˌoʊɪŋ tə mˈiːt ˈɔːl ðɪ ˈɛksvˈɪn sˌoʊ ɡɹˈeɪt ɪt ']

mmmaat commented 2 years ago

OK, thanks. This is a bug with preserve_punctuation=True.

jncasey commented 2 years ago

I see the issue – with preserve_punctuation=True, the restore method merges all remaining text once the final punctuation mark is restored (in this case, all text since there was no punctuation to begin with). I can work on a fix soon.
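The failure mode described above can be sketched with a toy model. This is not phonemizer's actual code: `strip_marks`, `restore`, and the `merge_tail` flag are hypothetical names made up for illustration, and the phonemization step itself is left out, so the chunks are restored unchanged.

```python
import re

# Hypothetical, reduced set of punctuation marks for this sketch.
MARKS = re.compile(r"[;:,.!?]")


def strip_marks(lines):
    """Split lines into punctuation-free chunks and record (line_index, mark)."""
    chunks, marks = [], []
    for index, line in enumerate(lines):
        chunks.extend(MARKS.split(line))
        marks.extend((index, mark) for mark in MARKS.findall(line))
    return chunks, marks


def restore(chunks, marks, merge_tail):
    """Rebuild one string per line by re-inserting the recorded marks.

    merge_tail=True mimics the reported bug: once no marks are left,
    everything that remains is joined into a single string.
    """
    chunks, marks = list(chunks), list(marks)
    restored, line = [], 0
    while chunks:
        if merge_tail and not marks:
            restored.append("".join(chunks))  # line boundaries are lost here
            break
        current = chunks.pop(0)
        # Re-attach every mark that belongs to the current line.
        while marks and marks[0][0] == line:
            current += marks.pop(0)[1] + chunks.pop(0)
        restored.append(current)
        line += 1
    return restored


lines = ["hello, world ", "no marks here ", "still none "]
chunks, marks = strip_marks(lines)
buggy = restore(chunks, marks, merge_tail=True)
fixed = restore(chunks, marks, merge_tail=False)
```

In this model `buggy` has two elements, because the last two lines are merged after the comma is restored, while `fixed` keeps all three. With input that contains no punctuation at all, as in the report, the buggy variant collapses everything into a single element.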

jncasey commented 2 years ago

I was noticing a strange issue with batches of text being returned with fewer lines, but I hadn't tracked down the precise issue yet. This is almost definitely what was happening to me, too.
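Until a fix is released, a guard along these lines might catch batches that come back with fewer lines. It is only a sketch: `phonemize_keeping_lines` is a hypothetical helper, and `phonemize_fn` stands for any callable that takes a list of strings and returns a list of strings (for example a backend's `phonemize` method). It assumes that a single-line call is not affected by the merge.

```python
from typing import Callable, List


def phonemize_keeping_lines(
    phonemize_fn: Callable[[List[str]], List[str]],
    lines: List[str],
) -> List[str]:
    """Return one output per input line, even if the batch call merges lines."""
    batch = phonemize_fn(lines)
    if len(batch) == len(lines):
        return batch
    # Fallback: one call per line, so no other line is available to merge with.
    return ["".join(phonemize_fn([line])) for line in lines]


def fake_buggy_phonemize(lines: List[str]) -> List[str]:
    """Stand-in for the buggy behaviour: joins all inputs into one element."""
    return ["".join(line.upper() + " " for line in lines)]


result = phonemize_keeping_lines(fake_buggy_phonemize, ["ab", "cd", "ef"])
```

The fallback is slower, since it makes one call per line, so it only makes sense as a stopgap.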