coqui-ai / TTS

🐸💬 - a deep learning toolkit for Text-to-Speech, battle-tested in research and production
http://coqui.ai
Mozilla Public License 2.0
35.59k stars 4.35k forks

[Bug?] TTS of "10. 9. 8. 7. 6. 5. 4. 3. 2. 1. Finished" seems to clog the system #3972

Open thomasf1 opened 3 months ago

thomasf1 commented 3 months ago

Describe the bug

I'm trying to get TTS to speak a countdown, but the command appears to run forever, while a similar prompt finishes in a reasonable time.

Works as expected: tts --text "How is the weather today?" --model_name "tts_models/en/ek1/tacotron2" --out_path test2.wav

Runs forever on my system: tts --text "10. 9. 8. 7. 6. 5. 4. 3. 2. 1. Finished" --model_name "tts_models/en/ek1/tacotron2" --out_path test3.wav

To Reproduce

Run: tts --text "10. 9. 8. 7. 6. 5. 4. 3. 2. 1. Finished" --model_name "tts_models/en/ek1/tacotron2" --out_path test3.wav
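To make "runs forever" measurable, the same command can be wrapped in a timeout. This is a minimal sketch, assuming the `tts` CLI is on the PATH; `build_tts_command`, `run_with_timeout` and the 120-second limit are illustrative choices, not part of TTS.

```python
import subprocess

# Model name taken from the commands in this report.
MODEL = "tts_models/en/ek1/tacotron2"


def build_tts_command(text, out_path, model_name=MODEL):
    """Build the argument list for the tts CLI call used in this report."""
    return ["tts", "--text", text, "--model_name", model_name, "--out_path", out_path]


def run_with_timeout(cmd, timeout_s=120.0):
    """Run cmd; return True if it exits successfully within timeout_s seconds.

    Returns False if the timeout expires (the child process is killed).
    A non-zero exit code raises subprocess.CalledProcessError.
    """
    try:
        subprocess.run(cmd, check=True, timeout=timeout_s)
        return True
    except subprocess.TimeoutExpired:
        return False
```

For example, `run_with_timeout(build_tts_command("10. 9. 8. 7. 6. 5. 4. 3. 2. 1. Finished", "test3.wav"))` returning `False` would confirm that the countdown prompt did not finish within the limit, while the weather prompt should return `True`.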

Expected behavior

A reasonable execution time, comparable to the working example above.

Logs

tts --text "10. 9. 8. 7. 6. 5. 4. 3. 2. 1. Finished" --model_name "tts_models/en/ek1/tacotron2" --out_path test3.wav

 > tts_models/en/ek1/tacotron2 is already downloaded.
 > vocoder_models/en/ek1/wavegrad is already downloaded.
 > Using model: Tacotron2
 > Setting up Audio Processor...
 | > sample_rate:22050
 | > resample:False
 | > num_mels:80
 | > log_func:np.log10
 | > min_level_db:-10
 | > frame_shift_ms:None
 | > frame_length_ms:None
 | > ref_level_db:0
 | > fft_size:1024
 | > power:1.8
 | > preemphasis:0.99
 | > griffin_lim_iters:60
 | > signal_norm:True
 | > symmetric_norm:True
 | > mel_fmin:0
 | > mel_fmax:8000.0
 | > pitch_fmin:1.0
 | > pitch_fmax:640.0
 | > spec_gain:1.0
 | > stft_pad_mode:reflect
 | > max_norm:4.0
 | > clip_norm:True
 | > do_trim_silence:True
 | > trim_db:60
 | > do_sound_norm:False
 | > do_amp_to_db_linear:True
 | > do_amp_to_db_mel:True
 | > do_rms_norm:False
 | > db_level:None
 | > stats_path:None
 | > base:10
 | > hop_length:256
 | > win_length:1024
 > Model's reduction rate `r` is set to: 2
 > Vocoder Model: wavegrad
 > Text: 10. 9. 8. 7. 6. 5. 4. 3. 2. 1. Finished
 > Text splitted to sentences.
['10. 9.', '8.', '7.', '6.', '5.', '4.', '3.', '2.', '1.', 'Finished']
(still running)
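The splitter output above shows that nearly every "sentence" is a single number followed by a period. Whether those very short inputs are what stalls synthesis is not confirmed here. One way to test the idea is to spell the numbers out and join them with commas, so the text is no longer split at each number. A rough sketch, where `spell_out_countdown` and its word table are hypothetical helpers and not part of TTS:

```python
import re

# Small hand-written table; enough for a 10-to-0 countdown.
_NUMBER_WORDS = {
    0: "zero", 1: "one", 2: "two", 3: "three", 4: "four", 5: "five",
    6: "six", 7: "seven", 8: "eight", 9: "nine", 10: "ten",
}

# A one- or two-digit number followed by a period, then whitespace or end of text.
_COUNT_TOKEN = re.compile(r"\b(\d{1,2})\.(?=\s|$)")


def spell_out_countdown(text):
    """Replace tokens such as '10.' with 'ten,' and leave everything else unchanged."""
    def replace(match):
        word = _NUMBER_WORDS.get(int(match.group(1)))
        # Numbers outside the table are kept exactly as written.
        return word + "," if word is not None else match.group(0)

    return _COUNT_TOKEN.sub(replace, text)


print(spell_out_countdown("10. 9. 8. 7. 6. 5. 4. 3. 2. 1. Finished"))
# prints: ten, nine, eight, seven, six, five, four, three, two, one, Finished
```

Passing the rewritten string to `--text` and comparing the run time with the original prompt would show whether the numeric sentences are involved.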

Environment

{
    "CUDA": {
        "GPU": [],
        "available": false,
        "version": null
    },
    "Packages": {
        "PyTorch_debug": false,
        "PyTorch_version": "2.1.2",
        "TTS": "0.22.0",
        "numpy": "1.22.0"
    },
    "System": {
        "OS": "Darwin",
        "architecture": [
            "64bit",
            ""
        ],
        "processor": "arm",
        "python": "3.10.14",
        "version": "Darwin Kernel Version 23.5.0: Wed May  1 20:16:51 PDT 2024; root:xnu-10063.121.3~5/RELEASE_ARM64_T8103"
    }
}

Additional context

No response

thomasf1 commented 3 months ago

Not sure if it's a bug, but it sure as hell seems strange.

stale[bot] commented 2 months ago

This issue has been automatically marked as stale because it has not had recent activity. It will be closed if no further activity occurs. Thank you for your contributions. You might also look at our discussion channels.