Unable to add Custom TTS model (i.e Coqui TTS)

ghost commented 3 months ago

I was unable to add a custom TTS model (i.e. Coqui TTS). I tried adding the model information to model.json, but it doesn't seem to work; maybe I am doing it wrong. What is the procedure for adding a custom TTS model to the Speech Note application? Thanks for making this great app for Linux :)

mkiol commented 3 months ago

Hi. Thanks for the report.

As you probably know, you need to edit the ~/.var/app/net.mkiol.SpeechNote/data/net.mkiol/dsnote/models.json file and add a new entry with the model configuration.

This entry should be similar to the one below.

        {
            "name": "New cool voice",
            "model_id": "en_coqui_new_cool_model",
            "engine": "tts_coqui",
            "lang_id": "en",
            "checksum": "8bc7e85b",
            "checksum_quick": "50984d2b",
            "comp": "dir",
            "urls": [
                "file:///path/to/model/config.json",
                "file:///path/to/model/model.pth"
            ],
            "size": "100827994"
        },

A few important remarks:

model_id has to be unique.
If the model files are located on your local drive, use the file:// URL type.
Add a URL for every file the model needs (config.json and model.pth are just examples).
If your model uses a custom vocoder, add it in the sups sub-object (for an example, see es_coqui_tacotron_mai in models.json).
To generate checksum and checksum_quick, use the --gen-checksums command line option: put empty strings in both checksum and checksum_quick, save the file, and run Speech Note with the --verbose --gen-checksums options:

        flatpak run net.mkiol.SpeechNote --verbose --gen-checksums

The model will be downloaded automatically and the checksums should appear in the terminal.

[D] 18:15:52.802230735.802 0x7709dea87d00 () - all checksums were generated
models checksums:

"model_id": "fr_coqui_css100_vits",
"checksum": "a7671b81",
"checksum_quick": "7d7531cf",
"size": "100821187",
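
For anyone who prefers to script this, here is a minimal Python sketch of the steps above. It is not part of Speech Note: the helper names draft_entry and duplicate_ids are hypothetical, and the field set simply mirrors the example entry. It builds an entry with file:// URLs and empty checksum fields (to be filled in from the --gen-checksums output) and checks that no model_id is used twice. The "size" field is left out here on the assumption that it gets copied from the generated output together with the checksums.

```python
import json
from pathlib import Path


def draft_entry(name, model_id, lang_id, files, engine="tts_coqui"):
    """Build a models.json-style entry for local model files (hypothetical helper).

    Both checksum fields are left empty so they can be filled in from
    the output of a --gen-checksums run.
    """
    return {
        "name": name,
        "model_id": model_id,
        "engine": engine,
        "lang_id": lang_id,
        "checksum": "",
        "checksum_quick": "",
        "comp": "dir",
        # as_uri() requires an absolute path and returns a file:// URL.
        "urls": [Path(f).absolute().as_uri() for f in files],
    }


def duplicate_ids(entries):
    """Return the model_id values that occur more than once."""
    seen, dupes = set(), set()
    for entry in entries:
        model_id = entry["model_id"]
        if model_id in seen:
            dupes.add(model_id)
        seen.add(model_id)
    return sorted(dupes)


entry = draft_entry(
    "New cool voice",
    "en_coqui_new_cool_model",
    "en",
    ["/path/to/model/config.json", "/path/to/model/model.pth"],
)
print(json.dumps(entry, indent=4))
```

The printed JSON can be pasted into models.json by hand. Running duplicate_ids over the existing entries plus the new one is a quick way to confirm that the new model_id is really unique before restarting the app.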

Let me know if any of this was helpful.

ghost commented 3 months ago

Thanks, this did work. But what about adding a custom multi-language model, i.e. a fine-tuned XTTS model? Do I have to add multiple model IDs, one for each language the XTTS model supports?

mkiol commented 3 months ago

XTTS? Nice :)

custom multi-language model

For multilingual models you may use "model aliases". An alias is a copy of a model entry with some properties changed (the language, for instance). To create an alias, define a new model entry with the model_alias_of parameter. Look at the example below.

The model multilang_coqui_xtts203 is the base model. It is hidden from the user thanks to "hidden": true. This base model is used by the en_coqui_xtts203 and pt_coqui_br_xtts203 aliases.

        {
            "name": "Multilingual (Coqui XTTS-v2.0.3)",
            "model_id": "multilang_coqui_xtts203",
            "engine": "tts_coqui",
            "lang_id": "multilang",
            "checksum": "ae3c9981",
            "checksum_quick": "ce376c5d",
            "options": "xs",
            "features": [
                "tts_voice_cloning"
            ],
            "license": {
                "id": "CPML",
                "name": "Coqui Public Model License 1.0.0",
                "url": "https://coqui.ai/cpml.txt",
                "accept_required": true
            },
            "comp": "dir",
            "urls": [
                "https://huggingface.co/coqui/XTTS-v2/resolve/69d4f754575c4b72d991f105b4775d270438ef33/model.pth",
                "https://huggingface.co/coqui/XTTS-v2/resolve/69d4f754575c4b72d991f105b4775d270438ef33/config.json",
                "https://huggingface.co/coqui/XTTS-v2/resolve/69d4f754575c4b72d991f105b4775d270438ef33/vocab.json"
            ],
            "size": "1868302897",
            "hidden": true
        },
        {
            "name": "English (Coqui XTTS-v2.0.3)",
            "model_id": "en_coqui_xtts203",
            "model_alias_of": "multilang_coqui_xtts203",
            "lang_id": "en"
        },
        {
            "name": "Português brasileiro (Coqui XTTS-v2.0.3)",
            "model_id": "pt_coqui_br_xtts203",
            "model_alias_of": "multilang_coqui_xtts203",
            "lang_id": "pt"
        },
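
If the model supports many languages, the alias entries can be generated instead of typed by hand. The following is only a sketch under assumptions: make_aliases is a hypothetical helper, not something Speech Note provides, and it emits nothing beyond the four fields shown in the alias examples above. For a fine-tuned model stored locally, the base entry would presumably use file:// URLs as described earlier in the thread.

```python
import json


def make_aliases(base_id, langs):
    """Return one alias entry per (lang_id, display name, model_id) tuple.

    Hypothetical helper: it only mirrors the fields used in the alias
    examples (name, model_id, model_alias_of, lang_id).
    """
    return [
        {
            "name": name,
            "model_id": model_id,
            "model_alias_of": base_id,
            "lang_id": lang_id,
        }
        for lang_id, name, model_id in langs
    ]


aliases = make_aliases(
    "multilang_coqui_xtts203",
    [
        ("en", "English (Coqui XTTS-v2.0.3)", "en_coqui_xtts203"),
        ("pt", "Português brasileiro (Coqui XTTS-v2.0.3)", "pt_coqui_br_xtts203"),
    ],
)
# ensure_ascii=False keeps non-ASCII names such as "Português" readable.
print(json.dumps(aliases, indent=4, ensure_ascii=False))
```

Each alias still needs its own unique model_id, so the IDs are passed in explicitly rather than derived from the language code.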

mkiol / dsnote

Unable to add Custom TTS model (i.e Coqui TTS) #123