When I run text-to-speech with an English model, it runs out of memory at the point where TTS tries to write the .wav file. I'm running on CPU only, and my machine has ~14 GB of available RAM.
I ran the code on around 20 pages of text. Everything worked up to the tts.tts_to_file call, which then threw RuntimeError: bad allocation. During inference the model appeared to be swapping chunks in and out of memory successfully, and it looked like it ran out of memory when trying to write the file, although the traceback below ends in conv_transpose1d inside the HiFi-GAN generator, which suggests the allocation fails while the waveform is still being generated.
The same code works fine on a few paragraphs.
To Reproduce
from TTS.api import TTS
# set device
device = "cpu"
txt_20_pages = "copyrighted text, substitute with 500*20 words"
# Init TTS with the target model name
tts = TTS(model_name="tts_models/de/thorsten/tacotron2-DDC", progress_bar=False).to(device)
# Run TTS
tts.tts_to_file(text=txt_20_pages, file_path="long_voice.wav")
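The placeholder string above stands in for the original text, which can't be shared. A minimal sketch of one way to fill it with about 500*20 words of filler is shown here. `make_placeholder_text`, its vocabulary, and the sentence length are hypothetical choices, not part of TTS, and it has not been confirmed that filler text triggers the same failure as the original book text.

```python
import random


def make_placeholder_text(n_words=500 * 20, words_per_sentence=15, seed=0):
    """Build n_words of filler text, split into short sentences."""
    rng = random.Random(seed)
    # Arbitrary vocabulary (an assumption); swap in words that suit the model's language.
    vocab = ["the", "quick", "brown", "fox", "jumps", "over", "lazy", "dog", "and", "runs"]
    sentences = []
    remaining = n_words
    while remaining > 0:
        k = min(words_per_sentence, remaining)
        words = [rng.choice(vocab) for _ in range(k)]
        # Capitalize the first word and end with a period so sentence splitting has boundaries.
        sentences.append(" ".join(words).capitalize() + ".")
        remaining -= k
    return " ".join(sentences)


txt_20_pages = make_placeholder_text()
```

The result can be assigned to txt_20_pages in place of the literal string before calling tts.tts_to_file.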
Expected behavior
Writing the .wav file successfully
Logs
Traceback (most recent call last):
  File "C:\Users\Zapi\Documents\spe2.py", line 284, in <module>
    tts.tts_to_file(text=txt2, file_path="gard_book_ich1.wav")
  File "C:\Users\Zapi\AppData\Local\Programs\Python\Python39\lib\site-packages\TTS\api.py", line 334, in tts_to_file
    wav = self.tts(
  File "C:\Users\Zapi\AppData\Local\Programs\Python\Python39\lib\site-packages\TTS\api.py", line 276, in tts
    wav = self.synthesizer.tts(
  File "C:\Users\Zapi\AppData\Local\Programs\Python\Python39\lib\site-packages\TTS\utils\synthesizer.py", line 398, in tts
    outputs = synthesis(
  File "C:\Users\Zapi\AppData\Local\Programs\Python\Python39\lib\site-packages\TTS\tts\utils\synthesis.py", line 221, in synthesis
    outputs = run_model_torch(
  File "C:\Users\Zapi\AppData\Local\Programs\Python\Python39\lib\site-packages\TTS\tts\utils\synthesis.py", line 53, in run_model_torch
    outputs = _func(
  File "C:\Users\Zapi\AppData\Local\Programs\Python\Python39\lib\site-packages\torch\utils\_contextlib.py", line 115, in decorate_context
    return func(*args, **kwargs)
  File "C:\Users\Zapi\AppData\Local\Programs\Python\Python39\lib\site-packages\TTS\tts\models\vits.py", line 1161, in inference
    o = self.waveform_decoder((z * y_mask)[:, :, : self.max_inference_len], g=g)
  File "C:\Users\Zapi\AppData\Local\Programs\Python\Python39\lib\site-packages\torch\nn\modules\module.py", line 1532, in _wrapped_call_impl
    return self._call_impl(*args, **kwargs)
  File "C:\Users\Zapi\AppData\Local\Programs\Python\Python39\lib\site-packages\torch\nn\modules\module.py", line 1541, in _call_impl
    return forward_call(*args, **kwargs)
  File "C:\Users\Zapi\AppData\Local\Programs\Python\Python39\lib\site-packages\TTS\vocoder\models\hifigan_generator.py", line 254, in forward
    o = self.ups[i](o)
  File "C:\Users\Zapi\AppData\Local\Programs\Python\Python39\lib\site-packages\torch\nn\modules\module.py", line 1532, in _wrapped_call_impl
    return self._call_impl(*args, **kwargs)
  File "C:\Users\Zapi\AppData\Local\Programs\Python\Python39\lib\site-packages\torch\nn\modules\module.py", line 1541, in _call_impl
    return forward_call(*args, **kwargs)
  File "C:\Users\Zapi\AppData\Local\Programs\Python\Python39\lib\site-packages\torch\nn\modules\conv.py", line 797, in forward
    return F.conv_transpose1d(
RuntimeError: bad allocation
Environment
Additional context
This happens on a machine with 16 GB of RAM, so it may not reproduce if you test with more. Limiting the memory available to a VM should be enough to trigger it.
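Since short inputs work and only the 20-page input fails, one possible workaround is to synthesize the text in smaller pieces and append each piece to the output file, so the full waveform never has to sit in memory at once. The sketch below is untested against this exact failure. `split_into_chunks` and `synthesize_to_wav` are hypothetical helpers, the 1000-character limit is an arbitrary assumption, and `synthesize` is any callable that returns float samples in [-1.0, 1.0] for a piece of text.

```python
import re
import sys
import wave
from array import array


def split_into_chunks(text, max_chars=1000):
    """Group whole sentences into chunks of at most about max_chars characters."""
    sentences = re.split(r"(?<=[.!?])\s+", text.strip())
    chunks, current = [], ""
    for sentence in sentences:
        if current and len(current) + 1 + len(sentence) > max_chars:
            chunks.append(current)
            current = sentence
        else:
            current = f"{current} {sentence}".strip()
    if current:
        chunks.append(current)
    # Note: a single sentence longer than max_chars is kept whole as its own chunk.
    return chunks


def synthesize_to_wav(text, synthesize, sample_rate, file_path, max_chars=1000):
    """Synthesize text chunk by chunk, appending 16-bit mono PCM to file_path."""
    with wave.open(file_path, "wb") as out:
        out.setnchannels(1)
        out.setsampwidth(2)  # 2 bytes per sample = 16-bit PCM
        out.setframerate(sample_rate)
        for chunk in split_into_chunks(text, max_chars):
            samples = synthesize(chunk)
            # Clip to [-1.0, 1.0] and scale to the int16 range.
            pcm = array("h", (int(max(-1.0, min(1.0, s)) * 32767) for s in samples))
            if sys.byteorder == "big":
                pcm.byteswap()  # WAV data is little-endian
            out.writeframes(pcm.tobytes())


# Hypothetical wiring to the script above (assumes tts.tts returns float samples
# and that tts.synthesizer.output_sample_rate holds the model's sample rate):
# synthesize_to_wav(txt_20_pages, lambda t: tts.tts(text=t),
#                   tts.synthesizer.output_sample_rate, "long_voice.wav")
```

Only one chunk's audio is held in memory at a time, at the cost of possible small discontinuities at chunk boundaries.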