MycroftAI / mimic1

Mycroft's TTS engine, based on CMU's Flite (Festival Lite)
https://mimic.mycroft.ai

Mimic doesn't read whole text #18

Closed testman42 closed 8 years ago

testman42 commented 8 years ago

On the command line I tried running ./mimic -t "[string]", where the string was about 1800 characters long, but Mimic didn't read the whole thing. It got about one third of the way through the text and then stopped mid-word at around character 460. All characters were letters, numbers or punctuation, combined into proper grammatical sentences. I tried other long texts and got the same result.

I use an up-to-date Fedora 23 x86_64 with Bash as the command-line interpreter.

forslund commented 8 years ago

Tried it just now with a text 1869 characters long and it worked as far as I can determine. (The last sentence spoken/printed matched the last sentence of the text)

Can you run ./mimic -pw -t "[long text]" | wc -m and check how many characters are printed by mimic?

testman42 commented 8 years ago

Result differs depending on text.

For example, your command returns 789 for this text:

We're no strangers to love You know the rules and so do I A full commitment's what I'm thinking of You wouldn't get this from any other guy I just want to tell you how I'm feeling Gotta make you understand Never gonna give you up, never gonna let you down Never gonna run around and desert you Never gonna make you cry, never gonna say goodbye Never gonna tell a lie and hurt you We've known each other for so long Your heart's been aching but you're too shy to say it Inside we both know what's been going on We know the game and we're gonna play it And if you ask me how I'm feeling Don't tell me you're too blind to see Never gonna give you up, never gonna let you down Never gonna run around and desert you Never gonna make you cry, never gonna say goodbye Never gonna tell a lie and hurt you

Which seems to be the right output. And I removed "wc -m" and yes, Mimic does print out everything it was supposed to say. However, for me Mimic stopped talking at "And if you ask me how I'm feeling Don't tell me" Video as proof: https://drive.google.com/file/d/0B5M_1gBxnWnhcFA1S1c4YTJvWnM/view?usp=sharing

This returns 1255 and gets to "GNU system every day with", which is again, as in my first post, at around character 460:

I'd just like to interject for a moment. What you're referring to as Linux, is in fact, GNU/Linux, or as I've recently taken to calling it, GNU plus Linux. Linux is not an operating system unto itself, but rather another free component of a fully functioning GNU system made useful by the GNU corelibs, shell utilities and vital system components comprising a full OS as defined by POSIX. Many computer users run a modified version of the GNU system every day, without realizing it. Through a peculiar turn of events, the version of GNU which is widely used today is often called "Linux", and many of its users are not aware that it is basically the GNU system, developed by the GNU Project. There really is a Linux, and these people are using it, but it is just a part of the system they use. Linux is the kernel: the program in the system that allocates the machine's resources to the other programs that you run. The kernel is an essential part of an operating system, but useless by itself; it can only function in the context of a complete operating system. Linux is normally used in combination with the GNU operating system: the whole system is basically GNU with Linux added, or GNU/Linux. All the so-called "Linux" distributions are really distributions of GNU/Linux.

https://drive.google.com/file/d/0B5M_1gBxnWnha09rd0hPUVZoZWc/view?usp=sharing

This gets to 644 but stops mid word at "twigs, bark, leaves or flowers are mime":

Mimicry is related to camouflage, in which a species resembles its surroundings or is otherwise difficult to detect. In particular, mimesis, in which the mimic takes on the properties of a specific object or organism, but one to which the dupe is indifferent, is an area of overlap between camouflage and mimicry. For example, animals such as flower mantises, planthoppers and geometer moth caterpillars that resemble twigs, bark, leaves or flowers are mimetic. The difficulty is sometimes avoided by choosing a different term; crypsis (in the broad sense) is sometimes used to encompass all forms of avoiding detection, such as mimicry, camouflage and hiding.

forslund commented 8 years ago

Excellent! With these strings I reproduce the issue exactly as you describe it. I'll do some debugging tonight and see if I can track down why syllables are produced but the voice is cut.

forslund commented 8 years ago

@testman42 If you like you could try the branch long-text-fix. It's a work-around and I still feel the root cause should be determined before merging with the development branch.

testman42 commented 8 years ago

I downloaded and compiled the branch, but sadly I still encounter this bug.

forslund commented 8 years ago

Hmmm for me it works with the mimicry and GNU/Linux text...will dig more...

forslund commented 8 years ago

Ok, the first text works if I add punctuation. There might be some limit to the length of sentences that can be synthesized.

testman42 commented 8 years ago

Oh, yes, you are right. It works if text has proper grammar. It was my mistake for testing long-text-fix branch with improper text.

forslund commented 8 years ago

Not your fault, it's still a failure and should not occur. It should at least produce a warning...

forslund commented 8 years ago

Small status update on this issue: I tried to work on this on the train today and didn't seem to be able to reproduce it on my eeePC with Debian. This made me think that it might be the alsa driver that's cutting off somehow. I've now tested with pulseaudio on my work laptop (where alsa output cuts off) and it reads the entire text. So it seems it's some size limit on alsa on some distributions or maybe hardware? I'll see if I can isolate the issue further and perhaps figure out a workaround.

zeehio commented 8 years ago

Could you please add a warning here in your ALSA tests and check if it pops out? https://github.com/MycroftAI/mimic/blob/master/src/audio/au_alsa.c#L223

The warning message can be similar to: https://github.com/zeehio/festival_suite/blob/master/speech_tools/audio/linux_sound.cc#L622

This message may help finding out where the issue comes from.

forslund commented 8 years ago

Thanks for the suggestion, however the warning I added never showed.

It seems as if snd_pcm_drain() in audio_close_alsa() returns too early. I've checked the return value and it seems ok (0).

If I add (what I think is) a manual check instead of drain

while(snd_pcm_avail(pcm_handle) > 0)
    usleep(10000);

it plays the entire waveform. Not sure what this tells us about the issue though...
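A hedged sketch of the polling idea above, written so it can be tried without sound hardware: the wait loop asks a caller-supplied callback how many frames are still queued, sleeps between checks, and gives up after a timeout instead of looping forever. The names `pending_fn`, `wait_until_played` and `fake_pending` are hypothetical and are not part of mimic or ALSA. With real ALSA the callback might wrap a call such as `snd_pcm_delay()`, but that is an assumption and not something tested in this thread.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <stddef.h>
#include <time.h>

/* Hypothetical callback: returns the number of frames still queued for
 * playback, or a negative value on error. */
typedef long (*pending_fn)(void *ctx);

/* Sleep for the given number of milliseconds using POSIX nanosleep(). */
static void sleep_ms(long ms)
{
    struct timespec ts;
    ts.tv_sec = ms / 1000;
    ts.tv_nsec = (ms % 1000) * 1000000L;
    nanosleep(&ts, NULL);
}

/* Poll until nothing is pending.
 * Returns 0 when drained, -1 on a callback error, -2 on timeout. */
static int wait_until_played(pending_fn pending, void *ctx,
                             long poll_ms, long timeout_ms)
{
    long waited = 0;

    assert(pending != NULL);
    assert(poll_ms > 0);

    for (;;) {
        long left = pending(ctx);
        if (left < 0)
            return -1;              /* the query itself failed */
        if (left == 0)
            return 0;               /* everything has been played */
        if (waited >= timeout_ms)
            return -2;              /* give up rather than hang */
        sleep_ms(poll_ms);
        waited += poll_ms;
    }
}

/* Stand-in for a sound device: every query "plays" up to 100 frames and
 * reports how many were queued before that. */
static long fake_pending(void *ctx)
{
    long *frames = (long *)ctx;
    long now = *frames;
    *frames = (now > 100) ? now - 100 : 0;
    return now;
}
```

Calling `wait_until_played(fake_pending, &frames, 10, 5000)` with `frames` set to some positive count returns 0 once the fake device reports nothing left. The timeout is a design choice for the sketch: a close routine that can block indefinitely would be harder to debug than one that reports it gave up.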

zeehio commented 8 years ago

What about adding a snd_pcm_prepare after drain and before close here (or a call to audio_flush_alsa)?

Sorry for not being able to test by myself :-/

forslund commented 8 years ago

No need to be sorry, I'm on a train and can't do much else than test =) (although I'm getting tired of hearing "Never gonna give you up" over and over again)

None of your suggestions worked but I tried using snd_pcm_wait(pcm_handle, -1) instead. And this worked nicely. Do you think it's a good enough solution or can other issues pop up because of it?

forslund commented 8 years ago

Ignore previous success...

zeehio commented 8 years ago

If you have PulseAudio, then test with

pasuspender -- mimic ...

Just to discard a PulseAudio issue

forslund commented 8 years ago

It might be a pulseaudio issue, my eeePC doesn't have the problem and I'm not running pulseaudio on that one.

Using pasuspender I get no sound output at all. Timing the commands I get no real difference (~0.05 seconds), so I think the pasuspended mimic cuts off at the same point as when running without it.

I think I'll have to read up on how snd_pcm_drain() works and how it is different from the working snd_pcm_avail().
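For the timing comparison above, a wall-clock measurement says more when it is set against how long the audio ought to take. Below is a minimal sketch of that arithmetic, assuming the number of frames and the sample rate of the synthesized waveform are known. The function names, and any rate such as 16000 Hz used when calling them, are illustrative assumptions and not values taken from mimic.

```c
#include <assert.h>

/* Expected playback time in milliseconds for `frames` frames played at
 * `rate_hz` frames per second (integer division, rounds down). */
static long long expected_playback_ms(long long frames, long long rate_hz)
{
    assert(frames >= 0);
    assert(rate_hz > 0);
    return (frames * 1000LL) / rate_hz;
}

/* Returns 1 if the measured run time is shorter than the expected playback
 * time by more than `tolerance_ms`, which hints at truncated output. */
static int looks_truncated(long long measured_ms, long long frames,
                           long long rate_hz, long long tolerance_ms)
{
    return measured_ms + tolerance_ms < expected_playback_ms(frames, rate_hz);
}
```

For example, 480000 frames at an assumed 16000 Hz should take 30 seconds, so a run that finishes after 10 seconds would be flagged, while one that takes 29.8 seconds with a 500 ms tolerance would not.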

zeehio commented 8 years ago

@forslund, in audio_close_alsa, add snd_pcm_nonblock(pcm_handle, 0) before snd_pcm_drain(). That should hopefully do it...

forslund commented 8 years ago

@zeehio, tried it earlier but no difference. I'll recheck after dinner just in case I did something wrong

forslund commented 8 years ago

@zeehio, retried it and I'm afraid it doesn't make any difference =(

zeehio commented 8 years ago

Feel free to try the pull request. On my machine it works on all the given examples.

forslund commented 8 years ago

Closed by #23