coqui-ai / TTS

🐸💬 - a deep learning toolkit for Text-to-Speech, battle-tested in research and production
http://coqui.ai
Mozilla Public License 2.0

[Bug] YourTTS alignment is weird #3736

Open thivux opened 1 month ago

thivux commented 1 month ago

Describe the bug

I am training a YourTTS model on VCTK using the recipe provided in this repo, but the alignment looks off and the synthesized audio is poor: it often skips words.

Below is the alignment plot. I tried both initializing from a VITS checkpoint trained on LJSpeech for 1M steps (as in the paper) and training from scratch, but the alignment is the same in both cases. Can someone confirm whether the current YourTTS training code works? Thanks. [image: alignment plot]

Eval losses: [image: eval loss curves]
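One way to put a number on "skips over words" is to count how many frames each text token receives in the alignment and flag tokens that get none. This is only a minimal sketch: it assumes the alignment is available as a NumPy array of shape (num_tokens, num_frames), and `skipped_tokens` is a hypothetical helper, not part of the TTS library.

```python
import numpy as np

def skipped_tokens(alignment, min_frames=1):
    """Return (indices of under-covered tokens, per-token frame counts).

    `alignment` is assumed to be a (num_tokens, num_frames) array in which
    each column holds the alignment weights of one spectrogram frame.
    """
    alignment = np.asarray(alignment)
    # Hard-assign each frame to the token with the highest weight.
    frame_to_token = alignment.argmax(axis=0)
    # Count frames per token, i.e. each token's duration in frames.
    durations = np.bincount(frame_to_token, minlength=alignment.shape[0])
    return np.flatnonzero(durations < min_frames), durations

# Toy example (made-up numbers): 4 tokens, 6 frames; token 2 never wins a frame.
toy = np.array([
    [0.9, 0.8, 0.1, 0.0, 0.0, 0.0],
    [0.1, 0.2, 0.8, 0.6, 0.1, 0.0],
    [0.0, 0.0, 0.1, 0.3, 0.2, 0.1],
    [0.0, 0.0, 0.0, 0.1, 0.7, 0.9],
])
skipped, durations = skipped_tokens(toy)
print("skipped token indices:", skipped.tolist())
print("frames per token:", durations.tolist())
```

A healthy alignment should give every token at least one frame, so a long list of skipped indices on evaluation sentences would match the audible word skipping described above.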

To Reproduce

Expected behavior

Logs

No response

Environment

{
    "CUDA": {
        "GPU": [
            "A100-SXM4-40GB"
        ],
        "available": true,
        "version": "11.8"
    },
    "Packages": {
        "PyTorch_debug": false,
        "PyTorch_version": "2.1.1",
        "TTS": "0.22.0",
        "numpy": "1.22.0"
    },
    "System": {
        "OS": "Linux",
        "architecture": [
            "64bit",
            "ELF"
        ],
        "processor": "x86_64",
        "python": "3.9.18",
        "version": "#152-Ubuntu SMP Wed Nov 23 20:19:22 UTC 2022"
    }
}

Additional context

No response

stale[bot] commented 1 week ago

This issue has been automatically marked as stale because it has not had recent activity. It will be closed if no further activity occurs. Thank you for your contributions. You might also look at our discussion channels.