[Bug] Can't run any of the xtts models using the TTS Command Line Interface (CLI)

coqui-ai / TTS

🐸💬 - a deep learning toolkit for Text-to-Speech, battle-tested in research and production

Mozilla Public License 2.0

Describe the bug

Hello, I just started playing with the TTS library and am running tests with the TTS Command Line Interface (CLI). I was able to run capacitron, vits (English and Portuguese) and tacotron2 successfully. But when I try any of the xtts models, I get the same error, which suggests that no language option has been set.

To Reproduce

Running either of the following commands produces the error:

tts --text "Welcome. This is a TTS test." --model_name "tts_models/multilingual/multi-dataset/xtts_v2" --language en --out_path TTS_english_test_xtts_output2.wav

tts --text "Welcome. This is a TTS test." --model_name "tts_models/multilingual/multi-dataset/xtts_v1.1" --language en --out_path TTS_english_test_xtts_output2.wav

I tried these commands on multiple systems and get the same error each time:

AssertionError: ❗ Language None is not supported. Supported languages are ['en', 'es', 'fr', 'de', 'it', 'pt', 'pl', 'tr', 'ru', 'nl', 'cs', 'ar', 'zh-cn', 'hu', 'ko', 'ja']
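To make the message easier to read: it reports `Language None`, not `Language en`, so the value that reached the language check was apparently empty rather than wrong. The sketch below is not the TTS library's code. It is a hypothetical stand-in (the names `SUPPORTED_LANGUAGES` and `check_language` are made up for illustration) showing how a check of this kind would fail with this message if the CLI value never reached it.

```python
# Hypothetical stand-in for a model-side language check.
# This is NOT the TTS library's implementation; names are illustrative only.
SUPPORTED_LANGUAGES = ["en", "es", "fr", "de", "it", "pt", "pl", "tr", "ru",
                       "nl", "cs", "ar", "zh-cn", "hu", "ko", "ja"]


def check_language(language):
    """Return the language if supported, else raise AssertionError."""
    assert language in SUPPORTED_LANGUAGES, (
        f"Language {language} is not supported. "
        f"Supported languages are {SUPPORTED_LANGUAGES}"
    )
    return language


# A valid code passes through unchanged.
check_language("en")

# A value that was never set (None) fails with a "Language None" message,
# which matches the shape of the error reported above.
try:
    check_language(None)
except AssertionError as err:
    print(err)
```

If that reading is right, the thing to look for is which CLI option the xtts models actually take their language from.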

Expected behavior

No response

Logs

No response

Environment

- TTS installed from pip install TTS
- Linux OS

Additional context

My guess is that --language en is ignored, or perhaps the xtts_v2 and xtts_v1.1 models can only be run from Python? I wanted to try a multilingual model through the command line interface (CLI). Are there any steps I am missing here?

I was able to run bark using:

tts --text "Welcome. This is a TTS test." --model_name "tts_models/multilingual/multi-dataset/bark" --language en --out_path TTS_english_test_bark_output2.wav

~$ tts --help
usage: tts [-h] [--list_models [LIST_MODELS]]
           [--model_info_by_idx MODEL_INFO_BY_IDX]
           [--model_info_by_name MODEL_INFO_BY_NAME] [--text TEXT]
           [--model_name MODEL_NAME] [--vocoder_name VOCODER_NAME]
           [--config_path CONFIG_PATH] [--model_path MODEL_PATH]
           [--out_path OUT_PATH] [--use_cuda USE_CUDA] [--device DEVICE]
           [--vocoder_path VOCODER_PATH]
           [--vocoder_config_path VOCODER_CONFIG_PATH]
           [--encoder_path ENCODER_PATH]
           [--encoder_config_path ENCODER_CONFIG_PATH] [--cs_model CS_MODEL]
           [--emotion EMOTION] [--language LANGUAGE] [--pipe_out [PIPE_OUT]]
           [--speed SPEED] [--speakers_file_path SPEAKERS_FILE_PATH]
           [--language_ids_file_path LANGUAGE_IDS_FILE_PATH]
           [--speaker_idx SPEAKER_IDX] [--language_idx LANGUAGE_IDX]
           [--speaker_wav SPEAKER_WAV [SPEAKER_WAV ...]]
           [--gst_style GST_STYLE]
           [--capacitron_style_wav CAPACITRON_STYLE_WAV]
           [--capacitron_style_text CAPACITRON_STYLE_TEXT]
           [--list_speaker_idxs [LIST_SPEAKER_IDXS]]
           [--list_language_idxs [LIST_LANGUAGE_IDXS]]
           [--save_spectogram SAVE_SPECTOGRAM]
           [--reference_wav REFERENCE_WAV]
           [--reference_speaker_idx REFERENCE_SPEAKER_IDX]
           [--progress_bar PROGRESS_BAR] [--source_wav SOURCE_WAV]
           [--target_wav TARGET_WAV] [--voice_dir VOICE_DIR]
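The usage text above lists both `--language` and `--language_idx`, so one thing that may be worth trying is passing the language code through `--language_idx` instead. The sketch below only assembles such a command from Python using flags that appear in the help output; the helper name `build_tts_command` is made up for illustration. Whether the xtts models read `--language_idx`, and whether they also need a reference recording via `--speaker_wav`, are assumptions to verify, not confirmed behavior.

```python
import shlex


def build_tts_command(text, model_name, out_path,
                      language_idx=None, speaker_wav=None):
    """Assemble a `tts` CLI invocation as an argument list.

    Only flags listed in the `tts --help` output are used. Treating
    --language_idx as the option xtts reads is an assumption.
    """
    cmd = ["tts", "--text", text,
           "--model_name", model_name,
           "--out_path", out_path]
    if language_idx is not None:
        cmd += ["--language_idx", language_idx]
    if speaker_wav is not None:
        # Optional reference recording; whether xtts requires it is an assumption.
        cmd += ["--speaker_wav", speaker_wav]
    return cmd


cmd = build_tts_command(
    text="Welcome. This is a TTS test.",
    model_name="tts_models/multilingual/multi-dataset/xtts_v2",
    out_path="TTS_english_test_xtts_output2.wav",
    language_idx="en",
)

# Print a shell-quoted version that can be pasted into a terminal.
print(shlex.join(cmd))
```

To run the command directly from Python, the list can be handed to `subprocess.run(cmd, check=True)` on a machine where `tts` is installed.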
