MycroftAI / mimic1

Mycroft's TTS engine, based on CMU's Flite (Festival Lite)
https://mimic.mycroft.ai
Other
826 stars 153 forks source link

Missing coefficients in Alan Pope voice #132

Open zeehio opened 7 years ago

zeehio commented 7 years ago

The mycroft voice should provide 48 mixed excitation coefficients according to: https://github.com/MycroftAI/mimic/blob/development/lang/vid_gb_ap/vid_gb_ap_cg.c#L598

However the vid_gb_ap_me_h variable only lists 47 of them. Is it possible that when vocalid generated that model they accidentally skipped the last coefficient?

In simpler words: In each of these code lines there are 47 numbers, and I believe there should be 48.

https://github.com/MycroftAI/mimic/blob/development/lang/vid_gb_ap/vid_gb_ap_cg.c#L548 https://github.com/MycroftAI/mimic/blob/development/lang/vid_gb_ap/vid_gb_ap_cg.c#L551 https://github.com/MycroftAI/mimic/blob/development/lang/vid_gb_ap/vid_gb_ap_cg.c#L554 https://github.com/MycroftAI/mimic/blob/development/lang/vid_gb_ap/vid_gb_ap_cg.c#L557 https://github.com/MycroftAI/mimic/blob/development/lang/vid_gb_ap/vid_gb_ap_cg.c#L560

Pinging @m-toman because he is working and vocalid and he might help with this? Pinging @aatchison and @forslund because they submitted the voice on the first place

For now I will set those missing numbers to zero, but if any of you can help to find the actual values I would appreciate it.

[Irrelevant background story] Finding this was one of hardest bugs I have faced. It all started because I was building the mycroft voice with both autotools and the meson build system (see https://github.com/zeehio/mimic-vid_gb_ap/tree/development ). Both build systems compiled without warnings, but only autotools was working. With meson I just noise. I was going crazy to understand why. In the end I found out it only happened with the mycroft voice, so I dumped a .flitevox voice from the mycroft voice with both build systems, and I compared them byte per byte. In the meson case, I found a weird "vid_gb_a" string that raised my "garbage-is-being-written alarm" because of the missing "p" (usually is "vid_gb_ap").

forslund commented 7 years ago

I wonder if the issue we had on Raspberry Pi back in January/February was really caused by this? I could faintly hear the words behind a ton of noise. I'll see if I can find the original delivery and see any differences. I know I renamed a lot of things but I never touched the data

forslund commented 7 years ago

I found the original delivery and it's only got 47 coefficients (and there should be 48).

m-toman commented 7 years ago

Hi,

usually the order of the coefficients is (coeff count - 1)... but actually we have completely different numbers in our scripts... I'll look into this. I've parameterized that huge mess of Festvox, because all those coefficient orders are hardcoded originally. Perhaps I missed a critical occurrence... but still wouldn't explain why it would produce 47 instead of 48.

EDIT: At least I found where it seems to come from: https://github.com/MycroftAI/mimic/blob/development/tools/make_cg.scm#L248 and https://github.com/MycroftAI/mimic/blob/development/tools/make_cg.scm#L326

Oh my..

EDIT2: @zeehio So seems the buffer overflow happens here: https://github.com/MycroftAI/mimic/blob/development/src/cg/cst_mlsa.c#L275 But I wonder why this wouldn't happen with other clustergen voices, because the order=48 and < o 46 have been hardcoded in flite itself already.

EDIT3: Will try to fix it and run a few experiments. Will take some time and then I'll get back to you.

zeehio commented 7 years ago

Thanks for the fast response!

@forslund, you are right... I had forgotten about that issue :-)

@m-toman The reason why the other voices are not affected is that the bug you have found at make_cg.scm is present on flite-2.0.0 but it is not present on flite-1.4 (I just checked). The other voices were probably built with Flite 1.4 or before.

Looking forward to your findings :-)

zeehio commented 7 years ago

By the way, none of the voice files for the mycroft voice have a copyright assigned nor license terms. I asked @penrods a while ago about them and he told me he had to check that out. Do you (@m-toman) know the license terms?

m-toman commented 7 years ago

I wonder why they changed that from 47 to 46 in the new version. Just re-exporting would probably be enough to quickly test it, but to be sure I do a few re-trainings with our current scripts and a parameterized version of make_cg.

No, unfortunately I have no idea about the licensing and the details of the deal between VocaliD and Mycroft. Update: Rupal said she will contact Joshua, and you can also contact her directly regarding licensing (rupal@vocalid.co).