coqui-ai / TTS

🐸💬 - a deep learning toolkit for Text-to-Speech, battle-tested in research and production
http://coqui.ai
Mozilla Public License 2.0
34.7k stars 4.21k forks

[Bug] data loader initialization hangs when training vits with more than ~22k wav files. #2256

Closed padmalcom closed 1 year ago

padmalcom commented 1 year ago

Describe the bug

When I train a custom vits model with more than approximately 22,000 training files, the data loader initialization step hangs (or takes extremely long). With about 20,000 files the loader initializes in a few seconds; with 30,000 files it appears to freeze (I waited about 30 hours). I tested this behavior with 800k, 50k, 30k and 25k files.

Is this behavior known? What could be the issue here and how can I fix it?

> DataLoader initialization
| > Tokenizer:
    | > add_blank: True
    | > use_eos_bos: False
    | > use_phonemes: True
    | > phonemizer:
        | > phoneme language: de-de
        | > phoneme backend: gruut
    | > 10 not found characters:
    | > ̯
    | > ͡
    | > “
    | > õ
    | > ã
    | > …
    | > ̃
    | > «
    | > »
    | > ”
| > Number of instances : 29700

To Reproduce

Train a vits model on an LJSpeech formatted dataset with more than 22k wav files.
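
To narrow down where the slowdown begins, one option is to train against truncated copies of the dataset's metadata file. The following is a minimal sketch, not part of the TTS library: it assumes the usual LJSpeech layout of one `id|text|normalized text` line per clip in a `metadata.csv`, and the helper name `write_metadata_subset` is hypothetical.

```python
from pathlib import Path


def write_metadata_subset(src: Path, dst: Path, max_rows: int) -> int:
    """Copy at most max_rows non-empty lines from src to dst.

    Hypothetical helper: assumes src is an LJSpeech-style metadata file
    with one pipe-separated entry per line. Returns the number of lines
    actually written.
    """
    written = 0
    with src.open(encoding="utf-8") as fin, dst.open("w", encoding="utf-8") as fout:
        for line in fin:
            if written >= max_rows:
                break
            if not line.strip():
                # Skip blank lines so the count reflects real entries.
                continue
            # Make sure every copied entry ends with a newline.
            fout.write(line if line.endswith("\n") else line + "\n")
            written += 1
    return written
```

Calling it with, for example, 20,000, 22,000 and 25,000 rows and pointing the dataset config at each resulting file would give a repeatable way to bisect the file count at which initialization stops finishing in seconds.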

Expected behavior

No response

Logs

No response

Environment

{
    "CUDA": {
        "GPU": [
            "Quadro RTX 6000",
            "Quadro RTX 6000",
            "Quadro RTX 6000",
            "Quadro RTX 6000",
            "Quadro RTX 6000",
            "Quadro RTX 6000"
        ],
        "available": true,
        "version": "11.3"
    },
    "Packages": {
        "PyTorch_debug": false,
        "PyTorch_version": "1.12.1+cu113",
        "TTS": "0.10.1",
        "numpy": "1.21.6"
    },
    "System": {
        "OS": "Linux",
        "architecture": [
            "64bit",
            "ELF"
        ],
        "processor": "x86_64",
        "python": "3.9.15",
        "version": "#62~20.04.1-Ubuntu SMP Tue Nov 22 21:24:20 UTC 2022"
    }
}

Additional context

No response

erogol commented 1 year ago

We don't use a multi-GPU setup. Unfortunately, someone from the community should deal with that.

stale[bot] commented 1 year ago

This issue has been automatically marked as stale because it has not had recent activity. It will be closed if no further activity occurs. Thank you for your contributions. You might also look at our discussion channels.

padmalcom commented 1 year ago

The answer by @erogol clarifies why the data loader does not do what I expected it to do.