Describe the bug
I initially tried having it say simple things like "Hello. Hello? I did that. I did? No? No. No?!" to hear variation in how text is spoken, for example a rise toward a slightly higher pitch at the end of a sentence. However, I heard practically no variation in the default en_UK/apope_low voice no matter what I tried, so I assumed it just couldn't make different sounds for the same words.
However, after discovering that the non-default voices speak slightly differently every time I have them say the same thing, I started doing more testing. I found that even with --noise-scale 0 --noise-w 0 I could get these voices to produce things like that rising pitch at the end if I unambiguously worded the sentence as a question to begin with.
This is most consistent with the en_US/ljspeech_low voice. The others often do sound like they're asking a question, but it's ambiguous. This works, but not well.
To Reproduce
Compare the output audio for the following commands:
mimic3 -v en_US/ljspeech_low --noise-scale 0 --noise-w 0 'Where was it?'
mimic3 -v en_US/ljspeech_low --noise-scale 0 --noise-w 0 'Where was it.'
mimic3 -v en_US/ljspeech_low --noise-scale 0 --noise-w 0 'That was it?'
mimic3 -v en_US/ljspeech_low --noise-scale 0 --noise-w 0 'That was it.'
Expected behavior
The 'was it' at the end of commands 1 and 3 above should be spoken as a question. The 'was it' at the end of commands 2 and 4 should be spoken as a statement.
Instead, it speaks 1 and 2 completely identically, as if both of them are questions. It speaks 3 and 4 identically as well, but as if they are both statements.
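For anyone who wants to check this without relying on their ears, here is a rough sketch that runs the four commands and compares the resulting audio by hash. It assumes mimic3 writes WAV data to stdout when the output is redirected, and that output is deterministic with both noise settings at 0; the helper names and the cmdN.wav file names are made up for illustration. Matching hashes would mean the punctuation had no effect at all, while differing hashes only show that something changed, not that the intonation is right.

```python
import hashlib
import subprocess

VOICE = "en_US/ljspeech_low"
SENTENCES = ["Where was it?", "Where was it.", "That was it?", "That was it."]


def build_command(text, voice=VOICE):
    """Return the mimic3 invocation from this report as an argument list."""
    return ["mimic3", "-v", voice, "--noise-scale", "0", "--noise-w", "0", text]


def digest(data):
    """Hex SHA-256 of raw audio bytes, used to test for byte-identical output."""
    return hashlib.sha256(data).hexdigest()


def synthesize(text, out_path):
    """Run mimic3 and save its output to out_path.

    Assumption: mimic3 emits WAV data on stdout when stdout is redirected.
    """
    with open(out_path, "wb") as wav:
        subprocess.run(build_command(text), stdout=wav, check=True)


if __name__ == "__main__":
    digests = []
    for number, sentence in enumerate(SENTENCES, start=1):
        out_path = f"cmd{number}.wav"  # hypothetical output file name
        synthesize(sentence, out_path)
        with open(out_path, "rb") as wav:
            digests.append(digest(wav.read()))
    print("1 vs 2 byte-identical:", digests[0] == digests[1])
    print("3 vs 4 byte-identical:", digests[2] == digests[3])
```

The files are left on disk so the pairs can also be compared by listening.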
Environment (please complete the following information):
Device type: Desktop
OS: KDE Neon (based on Ubuntu 22.04)
Mycroft-core version: I only have mycroft-mimic3-tts, version 0.2.4.