coqui-ai / TTS

🐸💬 - a deep learning toolkit for Text-to-Speech, battle-tested in research and production
http://coqui.ai
Mozilla Public License 2.0

[Bug] Compute attention masks broken #2287

Closed shivammehta25 closed 1 year ago

shivammehta25 commented 1 year ago

Describe the bug

Hello!

To train FastSpeech and FastPitch models we need external aligners, which can be obtained by generating attention masks from an already trained model. However, the script fails with import errors at the lines below.

https://github.com/coqui-ai/TTS/blob/14d45b53470d862d4df1966d3984ef883077aa5c/TTS/bin/compute_attention_masks.py#L12

https://github.com/coqui-ai/TTS/blob/14d45b53470d862d4df1966d3984ef883077aa5c/TTS/bin/compute_attention_masks.py#L14

The same failures occur wherever these imports are used later in the file. These errors are not caught by CI because the script is only referenced, commented out, in the FastSpeech and FastPitch training scripts instead of having its own dedicated tests.

To Reproduce

python TTS/bin/compute_attention_masks.py --model_path {model_path} --config_path {config_path} --dataset ljspeech --dataset_metafile metadata.csv --data_path data/LJSpeech-1.1/ --use_cuda true

Expected behavior

It should compute the alignments

Logs

No response

Environment

{
    "CUDA": {
        "GPU": [
            "NVIDIA GeForce RTX 3090"
        ],
        "available": true,
        "version": "11.7"
    },
    "Packages": {
        "PyTorch_debug": false,
        "PyTorch_version": "1.13.0+cu117",
        "TTS": "0.10.2",
        "numpy": "1.21.6"
    },
    "System": {
        "OS": "Linux",
        "architecture": [
            "64bit",
            "ELF"
        ],
        "processor": "x86_64",
        "python": "3.9.15",
        "version": "#112-Ubuntu SMP Thu Feb 3 13:50:55 UTC 2022"
    }
}

Additional context

The script relies on some old, outdated functions; maybe we could also add dedicated tests for this file itself?

erogol commented 1 year ago

Yeah, we don't test it and haven't touched it for a while. Since we mostly train the aligner jointly with the model, we don't need it much.

shivammehta25 commented 1 year ago

Right, that makes sense then. I thought the alignments were generated with this script and then used for training, but I was wrong. Should I close this, or make a pull request to fix it?

erogol commented 1 year ago

It was mostly for debugging purposes. If it is useful for you, feel free to send a PR or leave it be for the next victim :)

stale[bot] commented 1 year ago

This issue has been automatically marked as stale because it has not had recent activity. It will be closed if no further activity occurs. Thank you for your contributions. You might also take a look at our discussion channels.