coqui-ai / TTS

🐸💬 - a deep learning toolkit for Text-to-Speech, battle-tested in research and production
http://coqui.ai
Mozilla Public License 2.0

[Bug] speedy_speech erroring on short inputs #3554

Open jp-x-g opened 8 months ago

jp-x-g commented 8 months ago

Describe the bug

Something in how input is passed to the speedy_speech model (tts_models/en/ljspeech/speedy-speech) is broken: short inputs cause an error, as if the model requires a minimum input length. I've only tested this with single-sentence input; I don't know how it behaves with other types of input.

To Reproduce


tts --text "Test" --model_name tts_models/en/ljspeech/speedy-speech --out_path speedy-speech-test.wav

This returns a truly giant stacktrace, terminating in /.local/lib/python3.11/site-packages/torch/nn/modules/conv.py, which gives:

RuntimeError: Calculated padded input size per channel: (4). Kernel size: (7). Kernel size can't be greater than actual input size

Changing the input string to "Testino" (7 characters) or "Testy westy" (11 characters) gives the same stacktrace, ending in:

RuntimeError: Calculated padded input size per channel: (11). Kernel size: (13). Kernel size can't be greater than actual input size

Adding two more characters fixes this, but they can't be whitespace.
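To make the failing condition concrete, here is a minimal pure-Python sketch of the size check the error message describes: a 1-D convolution needs its padded input to be at least as long as its kernel. The function name and the padding and dilation parameters are hypothetical, and treating the reported kernel size as the effective (dilated) size is an assumption about PyTorch's check, not something confirmed against the TTS code.

```python
def conv1d_input_fits(input_len: int, kernel_size: int,
                      padding: int = 0, dilation: int = 1) -> bool:
    """Return True if a 1-D convolution could run on an input of this length.

    Hypothetical helper: mirrors the condition named in the error message,
    not the actual speedy_speech or PyTorch code path.
    """
    # Length of the input after zero padding is added on both sides.
    padded_len = input_len + 2 * padding
    # Span covered by the kernel once dilation is taken into account.
    effective_kernel = dilation * (kernel_size - 1) + 1
    return padded_len >= effective_kernel


# Numbers taken directly from the two error messages in this report.
print(conv1d_input_fits(4, 7))    # False: padded size 4, kernel size 7
print(conv1d_input_fits(11, 13))  # False: padded size 11, kernel size 13
print(conv1d_input_fits(13, 13))  # True: matches the observed minimum of 13
```

If this reading is right, any input whose padded length is below the largest kernel in the model will fail, which would explain why 13 looks like the minimum.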

Expected behavior

I am not sure what the optimal fix is. Without any knowledge of what's going on under the hood, what I'd do is figure out the minimum string length it wants (13 seems to be the minimum) and pad all shorter speedy_speech inputs with ellipses up to that length. This is probably a bad idea, and there's probably a better way of doing it.
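The padding workaround described above could be sketched roughly like this. It is an illustration, not a proposed patch: pad_short_text is a hypothetical helper, the default of 13 is only the minimum observed in this report, and it assumes the length that matters is the character count of the text after surrounding whitespace is stripped, which may not match what the model actually sees after phonemization.

```python
def pad_short_text(text: str, min_len: int = 13, pad_char: str = ".") -> str:
    """Pad short input with a non-whitespace character up to min_len.

    Hypothetical workaround: min_len=13 is the value observed in this
    report, not a documented constant of the speedy_speech model.
    """
    if len(pad_char) != 1 or pad_char.isspace():
        # Whitespace padding did not help in the report, so reject it.
        raise ValueError("pad_char must be a single non-whitespace character")

    stripped = text.strip()
    missing = min_len - len(stripped)
    if missing <= 0:
        # Already long enough: leave the input untouched.
        return text
    return stripped + pad_char * missing


print(pad_short_text("Test"))         # Test.........
print(pad_short_text("Testy westy"))  # Testy westy..
```

The padded string could then be passed to the tts command via --text in place of the original. A fix inside the model or its input pipeline would be preferable, since trailing periods may change how the utterance is spoken.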

Logs

No response

Environment

{
    "CUDA": {
        "GPU": [],
        "available": false,
        "version": "12.1"
    },
    "Packages": {
        "PyTorch_debug": false,
        "PyTorch_version": "2.1.2+cu121",
        "TTS": "0.22.0",
        "numpy": "1.24.4"
    },
    "System": {
        "OS": "Linux",
        "architecture": [
            "64bit",
            "ELF"
        ],
        "processor": "",
        "python": "3.11.6",
        "version": "#1 SMP PREEMPT_DYNAMIC Wed Oct 11 04:07:58 UTC 2023"
    }
}

Additional context

No response

stale[bot] commented 7 months ago

This issue has been automatically marked as stale because it has not had recent activity. It will be closed if no further activity occurs. Thank you for your contributions. You might also look at our discussion channels.