Describe the bug
Hi,
I'm attempting to use the tortoise-v2 model for inference.
Here is everything I've done so far:
git clone https://github.com/coqui-ai/TTS
cd TTS
pip install -e .[all]
tts --text "Text for TTS" --model_name "tts_models/en/multi-dataset/tortoise-v2" --out_path speech.wav
I got this error on the first attempt:
> Downloading model to /home/vscode/.local/share/tts/tts_models--en--multi-dataset--tortoise-v2
100%|██████████████████████████████████████████████████████████████████████████████████████████████████████████████████████| 1.72G/1.72G [00:37<00:00, 45.7MiB/s]
100%|████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████| 976M/976M [00:21<00:00, 45.8MiB/s]
100%|████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████| 151M/151M [00:04<00:00, 32.1MiB/s]
100%|██████████████████████████████████████████████████████████████████████████████████████████████████████████████████████| 1.17G/1.17G [00:25<00:00, 46.0MiB/s]
100%|██████████████████████████████████████████████████████████████████████████████████████████████████████████████████████| 25.2M/25.2M [00:01<00:00, 17.0MiB/s]
100%|████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████| 101M/101M [00:03<00:00, 28.2MiB/s]
100%|████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████| 391M/391M [00:09<00:00, 41.7MiB/s]
100%|████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████| 1.07k/1.07k [00:01<00:00, 868iB/s]
100%|███████████████████████████████████████████████████████████████████████████████████████████████████████████████████████| 4.40k/4.40k [00:00<00:00, 700kiB/s]
> Model's license - apache 2.0
> Check https://choosealicense.com/licenses/apache-2.0/ for more info.
> Using model: tortoise
Downloading (…)lve/main/config.json: 100%|██████████████████████████████████████████████████████████████████████████████████| 2.11k/2.11k [00:00<00:00, 20.7MB/s]
Downloading pytorch_model.bin: 100%|████████████████████████████████████████████████████████████████████████████████████████| 1.26G/1.26G [00:27<00:00, 46.2MB/s]
Traceback (most recent call last):
File "/usr/local/python/3.10.11/lib/python3.10/site-packages/huggingface_hub/utils/_errors.py", line 259, in hf_raise_for_status
response.raise_for_status()
File "/home/vscode/.local/lib/python3.10/site-packages/requests/models.py", line 1021, in raise_for_status
raise HTTPError(http_error_msg, response=self)
requests.exceptions.HTTPError: 502 Server Error: Bad Gateway for url: https://huggingface.co/facebook/wav2vec2-large-960h/resolve/main/preprocessor_config.json
The above exception was the direct cause of the following exception:
Traceback (most recent call last):
File "/usr/local/python/3.10.11/lib/python3.10/site-packages/transformers/utils/hub.py", line 417, in cached_file
resolved_file = hf_hub_download(
File "/usr/local/python/3.10.11/lib/python3.10/site-packages/huggingface_hub/utils/_validators.py", line 120, in _inner_fn
return fn(*args, **kwargs)
File "/usr/local/python/3.10.11/lib/python3.10/site-packages/huggingface_hub/file_download.py", line 1195, in hf_hub_download
metadata = get_hf_file_metadata(
File "/usr/local/python/3.10.11/lib/python3.10/site-packages/huggingface_hub/utils/_validators.py", line 120, in _inner_fn
return fn(*args, **kwargs)
File "/usr/local/python/3.10.11/lib/python3.10/site-packages/huggingface_hub/file_download.py", line 1541, in get_hf_file_metadata
hf_raise_for_status(r)
File "/usr/local/python/3.10.11/lib/python3.10/site-packages/huggingface_hub/utils/_errors.py", line 301, in hf_raise_for_status
raise HfHubHTTPError(str(e), response=response) from e
huggingface_hub.utils._errors.HfHubHTTPError: 502 Server Error: Bad Gateway for url: https://huggingface.co/facebook/wav2vec2-large-960h/resolve/main/preprocessor_config.json
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
File "/usr/local/python/current/bin/tts", line 8, in <module>
sys.exit(main())
File "/workspaces/ai/TTS/TTS/bin/synthesize.py", line 385, in main
synthesizer = Synthesizer(
File "/workspaces/ai/TTS/TTS/utils/synthesizer.py", line 101, in __init__
self._load_tts_from_dir(model_dir, use_cuda)
File "/workspaces/ai/TTS/TTS/utils/synthesizer.py", line 144, in _load_tts_from_dir
self.tts_model = setup_tts_model(config)
File "/workspaces/ai/TTS/TTS/tts/models/__init__.py", line 13, in setup_model
model = MyModel.init_from_config(config=config, samples=samples)
File "/workspaces/ai/TTS/TTS/tts/models/tortoise.py", line 838, in init_from_config
return Tortoise(config)
File "/workspaces/ai/TTS/TTS/tts/models/tortoise.py", line 336, in __init__
self.aligner = Wav2VecAlignment()
File "/workspaces/ai/TTS/TTS/tts/layers/tortoise/wav2vec_alignment.py", line 51, in __init__
self.feature_extractor = Wav2Vec2FeatureExtractor.from_pretrained("facebook/wav2vec2-large-960h")
File "/usr/local/python/3.10.11/lib/python3.10/site-packages/transformers/feature_extraction_utils.py", line 330, in from_pretrained
feature_extractor_dict, kwargs = cls.get_feature_extractor_dict(pretrained_model_name_or_path, **kwargs)
File "/usr/local/python/3.10.11/lib/python3.10/site-packages/transformers/feature_extraction_utils.py", line 430, in get_feature_extractor_dict
resolved_feature_extractor_file = cached_file(
File "/usr/local/python/3.10.11/lib/python3.10/site-packages/transformers/utils/hub.py", line 475, in cached_file
raise EnvironmentError(f"There was a specific connection error when trying to load {path_or_repo_id}:\n{err}")
OSError: There was a specific connection error when trying to load facebook/wav2vec2-large-960h:
502 Server Error: Bad Gateway for url: https://huggingface.co/facebook/wav2vec2-large-960h/resolve/main/preprocessor_config.json
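A 502 from huggingface.co is often a transient server-side failure, so one possible workaround is simply to retry the load. The following is a minimal sketch using only the standard library. `retry` is a hypothetical helper (not part of 🐸TTS or transformers), and the attempt count and delay are arbitrary assumptions. It catches `OSError` because that is the exception type at the bottom of the traceback above.

```python
import time


def retry(fn, attempts=3, delay=1.0, exceptions=(OSError,)):
    """Call fn() until it succeeds or the attempts run out.

    Hypothetical helper for illustration: the last exception is re-raised
    unchanged, so the original traceback is still visible on final failure.
    """
    if attempts < 1:
        raise ValueError("attempts must be at least 1")
    for attempt in range(1, attempts + 1):
        try:
            return fn()
        except exceptions:
            if attempt == attempts:
                raise  # out of attempts: surface the original error
            time.sleep(delay * attempt)  # simple linear backoff


# Possible usage (assumes transformers is installed and the network is reachable):
# from transformers import Wav2Vec2FeatureExtractor
# extractor = retry(
#     lambda: Wav2Vec2FeatureExtractor.from_pretrained("facebook/wav2vec2-large-960h")
# )
```

This only helps if the failure really is temporary. A repeated 502 would still end in the same `OSError`.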
Could you please help me get this working?
Thanks
To Reproduce
I ran the command again to see whether it would download preprocessor_config.json successfully. That download now succeeds, but the command fails with a different error:
vscode ➜ /workspaces/ai/TTS (dev) $ tts --text "Text for TTS" --model_name "tts_models/en/multi-dataset/tortoise-v2" --out_path speech.wav
> tts_models/en/multi-dataset/tortoise-v2 is already downloaded.
> Model's license - apache 2.0
> Check https://choosealicense.com/licenses/apache-2.0/ for more info.
> Using model: tortoise
Downloading (…)rocessor_config.json: 100%|██████████████████████████████████████████████████████████████████████████████████████| 159/159 [00:00<00:00, 1.65MB/s]
Downloading (…)cial_tokens_map.json: 100%|█████████████████████████████████████████████████████████████████████████████████████| 85.0/85.0 [00:00<00:00, 859kB/s]
Traceback (most recent call last):
File "/usr/local/python/current/bin/tts", line 8, in <module>
sys.exit(main())
File "/workspaces/ai/TTS/TTS/bin/synthesize.py", line 385, in main
synthesizer = Synthesizer(
File "/workspaces/ai/TTS/TTS/utils/synthesizer.py", line 101, in __init__
self._load_tts_from_dir(model_dir, use_cuda)
File "/workspaces/ai/TTS/TTS/utils/synthesizer.py", line 144, in _load_tts_from_dir
self.tts_model = setup_tts_model(config)
File "/workspaces/ai/TTS/TTS/tts/models/__init__.py", line 13, in setup_model
model = MyModel.init_from_config(config=config, samples=samples)
File "/workspaces/ai/TTS/TTS/tts/models/tortoise.py", line 838, in init_from_config
return Tortoise(config)
File "/workspaces/ai/TTS/TTS/tts/models/tortoise.py", line 336, in __init__
self.aligner = Wav2VecAlignment()
File "/workspaces/ai/TTS/TTS/tts/layers/tortoise/wav2vec_alignment.py", line 52, in __init__
self.tokenizer = Wav2Vec2CTCTokenizer.from_pretrained("jbetker/tacotron-symbols")
File "/usr/local/python/3.10.11/lib/python3.10/site-packages/transformers/tokenization_utils_base.py", line 1812, in from_pretrained
return cls._from_pretrained(
File "/usr/local/python/3.10.11/lib/python3.10/site-packages/transformers/tokenization_utils_base.py", line 1975, in _from_pretrained
tokenizer = cls(*init_inputs, **init_kwargs)
File "/usr/local/python/3.10.11/lib/python3.10/site-packages/transformers/models/wav2vec2/tokenization_wav2vec2.py", line 189, in __init__
with open(vocab_file, encoding="utf-8") as vocab_handle:
TypeError: expected str, bytes or os.PathLike object, not NoneType
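The `TypeError` means the tokenizer received `vocab_file=None`, which suggests that the vocabulary file for `jbetker/tacotron-symbols` was never resolved. The log above shows only `preprocessor_config.json` and `special_tokens_map.json` being downloaded. A quick way to check what actually landed in the local cache is sketched below. It assumes the default Hugging Face hub cache location (`~/.cache/huggingface/hub`) and the `models--<org>--<name>/snapshots/<revision>/` layout, either of which may differ if cache-related environment variables are set. `cached_repo_files` is a hypothetical helper name, and `vocab.json` is assumed to be the file name the tokenizer looks for.

```python
from pathlib import Path


def cached_repo_files(repo_id, cache_root=None):
    """List file names found in the local hub cache for repo_id.

    Assumes the default cache layout; pass cache_root if the cache has
    been relocated. Returns an empty list when nothing is cached.
    """
    if cache_root is None:
        # Assumed default location; environment variables can override it.
        cache_root = Path.home() / ".cache" / "huggingface" / "hub"
    repo_dir = Path(cache_root) / ("models--" + repo_id.replace("/", "--"))
    snapshots = repo_dir / "snapshots"
    if not snapshots.is_dir():
        return []
    # Snapshot entries are typically symlinks; skip any that are broken.
    return sorted({p.name for p in snapshots.glob("*/*") if p.exists()})


# Possible usage:
# files = cached_repo_files("jbetker/tacotron-symbols")
# print(files)
# print("vocab.json cached:", "vocab.json" in files)
```

If `vocab.json` is missing from the list, removing the cached `models--jbetker--tacotron-symbols` directory and rerunning the command may force a clean download, although that is untested here.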
Expected behavior
No response
Logs
No response
Environment
- 🐸TTS version v0.14.0
- No GPU
- Running inside an Ubuntu 22.04 container on Docker Engine.
Additional context
No response