MycroftAI / mimic1

Mycroft's TTS engine, based on CMU's Flite (Festival Lite)
https://mimic.mycroft.ai

Voice list management #33

Open m-toman opened 8 years ago

m-toman commented 8 years ago

When a voice is not compiled into the system but loaded at run time (e.g. with cst_cg_load_voice), the global list mimic_voice_list does not behave intuitively.

Voice loading can be seen in mimic_main.c:

    if (mimic_voice_list == NULL)
        mimic_set_voice_list(voicedir);
    if (desired_voice == 0)
        desired_voice = mimic_voice_select(NULL);

mimic_voice_select searches the voice list and, if the requested voice is not found there, tries to load it from a file. It does not do this if the list is NULL/empty in the first place; in that case mimic_voice_select returns NULL. That is why the example tool mimic_main.c accesses the global mimic_voice_list directly.

A possible workaround to load the voice is to ignore the lists and use:

    mimic_init();
    mimic_add_lang("eng", usenglish_init, cmu_lex_init);
    ...
    voice = mimic_voice_load(path);

mimic_voice_select could probably be modified so that it also works with empty lists. There are also no functions to remove a voice from a list and delete it.
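
As a rough illustration of that idea, here is a minimal sketch of a select function that still falls back to loading from a file when the list is empty. It is a toy model, not mimic code: `toy_voice`, `toy_node`, `toy_voice_list`, `toy_load_from_file`, `toy_add_voice` and `toy_voice_select` are all hypothetical names, and the "loader" simply treats any name containing a `/` as a loadable path. The real function works on mimic's own list and voice types.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

/* Toy stand-ins, NOT the real mimic types. */
typedef struct toy_voice {
    char name[64];
} toy_voice;

typedef struct toy_node {
    toy_voice *voice;
    struct toy_node *next;
} toy_node;

/* Hypothetical global list playing the role of mimic_voice_list. */
static toy_node *toy_voice_list = NULL;

/* Hypothetical loader: pretends any name containing '/' is a voice file. */
static toy_voice *toy_load_from_file(const char *path)
{
    toy_voice *v;

    assert(path != NULL);
    if (strchr(path, '/') == NULL)
        return NULL;
    v = malloc(sizeof *v);
    if (v == NULL)
        return NULL;
    strncpy(v->name, path, sizeof v->name - 1);
    v->name[sizeof v->name - 1] = '\0';
    return v;
}

/* Prepend a voice to the global list; returns 0 on success. */
static int toy_add_voice(toy_voice *v)
{
    toy_node *n = malloc(sizeof *n);

    if (n == NULL)
        return -1;
    n->voice = v;
    n->next = toy_voice_list;
    toy_voice_list = n;
    return 0;
}

/* Select by name. An empty list is not an early exit: the loop is
 * simply skipped and the file fallback still runs. */
toy_voice *toy_voice_select(const char *name)
{
    toy_node *n;
    toy_voice *v;

    if (name == NULL) /* default voice: first entry, if there is one */
        return toy_voice_list ? toy_voice_list->voice : NULL;

    for (n = toy_voice_list; n != NULL; n = n->next)
        if (strcmp(name, n->voice->name) == 0)
            return n->voice;

    /* Not in the list (possibly because the list is empty): try a file. */
    v = toy_load_from_file(name);
    if (v != NULL && toy_add_voice(v) != 0) {
        free(v);
        v = NULL;
    }
    return v;
}
```

With this shape, the first call with a path loads the voice and adds it to the list, so later calls with the same name, or with NULL, return the same voice without loading it again.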

forslund commented 8 years ago

Disclaimer: I still don't quite understand all the ins and outs of the system, so I might be 100% wrong.

Firstly, you are right that mimic_voice_select() should handle an empty list and fall back to a file. Modifying it to allow an empty voice list shouldn't take much effort at all.

The main reason (as far as I can see) for the set_voice_list() call is to build the different executables with preselected voices. I assume this is to make the software easy to use on smaller systems (read: embedded devices).

As the voices are linked into the executable (as far as I understand), it doesn't make much sense to remove a voice from the list, since it's already compiled and included in the binary. I guess this is the reason there are no remove/delete methods. Do you have a concrete example of when removing a voice might be beneficial?

Should we add an executable called mimic_none which contains no voice list and requires the -voice parameter to work?

m-toman commented 8 years ago

I don't understand it completely yet either, but let's try :)

I just took a look at tools/make_voice_list: you can use this tool to create a static voice list from the voice names passed as parameters.

Looking at cmu_us_kal: there is register_cmu_us_kal, which sets the global cst_voice* cmu_us_kal_diphone and also returns it. There is also unregister_cmu_us_kal, which calls delete_voice, which in turn frees the memory and unsets cmu_us_kal_diphone. Neither seems to touch the voice list.

Who calls register for the voices in the list? And when you unregister a voice, access using mimic_voice_select() will result in... ? Unregistering is probably useful when you load a lot of voices from files and your application runs for a long time (e.g. as a smartphone app).

I don't know whether anyone needs a mimic_none executable; personally, I have always integrated flite (+hts_engine) directly into other applications and retrieved the waveforms directly.

You are right, it would probably be the easiest solution to just have mimic_voice_select() deal with empty lists.

forslund commented 8 years ago

Yes, learning is fun!

I see what you mean. I never looked far enough to see that the unregister functions are actually used and are not just there for academic purposes. I'm still not sure about their usefulness: since most of the voice data is precompiled, unregistering a voice won't make much difference (I think).

But as long as there are use cases for unregistering, the voice list should also be updated.
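
To make "the voice list should also be updated" concrete, here is a small sketch of an unregister that unlinks the voice from the list before freeing it, so the list never points at freed memory. Again this is a toy model with hypothetical names (`toy_voice`, `toy_node`, `toy_voice_list`, `toy_register_voice`, `toy_unregister_voice`, `toy_voice_count`). In particular, having register add the voice to the list is an assumption made here for symmetry, not how the existing register functions behave.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

/* Toy stand-ins, NOT the real mimic types. */
typedef struct toy_voice {
    char name[32];
} toy_voice;

typedef struct toy_node {
    toy_voice *voice;
    struct toy_node *next;
} toy_node;

/* Hypothetical global list playing the role of mimic_voice_list. */
static toy_node *toy_voice_list = NULL;

/* Create a voice and prepend it to the list; returns NULL on failure. */
toy_voice *toy_register_voice(const char *name)
{
    toy_voice *v;
    toy_node *n;

    assert(name != NULL);
    v = malloc(sizeof *v);
    n = malloc(sizeof *n);
    if (v == NULL || n == NULL) {
        free(v);
        free(n);
        return NULL;
    }
    strncpy(v->name, name, sizeof v->name - 1);
    v->name[sizeof v->name - 1] = '\0';
    n->voice = v;
    n->next = toy_voice_list;
    toy_voice_list = n;
    return v;
}

/* Remove the named voice from the list and free it.
 * Returns 1 if a voice was removed, 0 if no such voice was listed. */
int toy_unregister_voice(const char *name)
{
    toy_node **link = &toy_voice_list;

    assert(name != NULL);
    while (*link != NULL) {
        toy_node *n = *link;
        if (strcmp(n->voice->name, name) == 0) {
            *link = n->next; /* unlink first, then free */
            free(n->voice);
            free(n);
            return 1;
        }
        link = &n->next;
    }
    return 0;
}

/* Number of voices currently in the list. */
size_t toy_voice_count(void)
{
    size_t count = 0;
    const toy_node *n;

    for (n = toy_voice_list; n != NULL; n = n->next)
        count++;
    return count;
}
```

After an unregister, a later select-by-name on this list would simply not find the voice, instead of returning a pointer to freed memory.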

For the mimic executable, the register functions are called from main() via mimic_set_voice_list() in main/mimic_voice_list.c (generated by tools/make_voice_list).

Requiring a voice list with at least one voice is terrible if you use the software as a library and want to load voices from files!

I say this results in two tasks:

  1. allow empty voice lists
  2. handle removal of voice in a good way

forslund commented 8 years ago

I did some basic work to allow empty voice lists; it can be reviewed and tested in the no-voice-list branch. I currently don't have a good example where this is used, but I've updated the unit test and it seems to behave as expected.