When using the text_to_audio_playback.py script in the Examples folder, I get this error:
Failed to load the S2A model:
Traceback (most recent call last):
File "C:\PATH\Scripts\WhisperSpeech_working\Lib\site-packages\whisperspeech\pipeline.py", line 59, in __init__
self.s2a = SADelARTransformer.load_model(**args, device=device) # use obtained compute device
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "C:\PATH\Scripts\WhisperSpeech_working\Lib\site-packages\whisperspeech\s2a_delar_mup_wds_mlang.py", line 415, in load_model
model = cls(**spec['config'], tunables=Tunables(**Tunables.upgrade(spec['tunables'])))
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
TypeError: Tunables.__init__() got an unexpected keyword argument 'force_hidden_to_emb'
C:\PATH\Scripts\WhisperSpeech_working\Lib\site-packages\torch\nn\utils\weight_norm.py:30: UserWarning: torch.nn.utils.weight_norm is deprecated in favor of torch.nn.utils.parametrizations.weight_norm.
warnings.warn("torch.nn.utils.weight_norm is deprecated in favor of torch.nn.utils.parametrizations.weight_norm.")
Exception in thread Thread-1 (process_text_to_audio):
Traceback (most recent call last):
File "C:\Users\Airflow\AppData\Local\Programs\Python\Python311\Lib\threading.py", line 1045, in _bootstrap_inner
self.run()
File "C:\Users\Airflow\AppData\Local\Programs\Python\Python311\Lib\threading.py", line 982, in run
self._target(*self._args, **self._kwargs)
File "C:\PATH\Scripts\WhisperSpeech_working\text_to_audio_playback.py", line 45, in process_text_to_audio
audio_tensor = pipe.generate(sentence) # Generate audio tensor for the sentence
^^^^^^^^^^^^^^^^^^^^^^^
File "C:\PATH\Scripts\WhisperSpeech_working\Lib\site-packages\whisperspeech\pipeline.py", line 100, in generate
return self.vocoder.decode(self.generate_atoks(text, speaker, lang=lang, cps=cps, step_callback=step_callback))
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "C:\PATH\Scripts\WhisperSpeech_working\Lib\site-packages\whisperspeech\pipeline.py", line 96, in generate_atoks
atoks = self.s2a.generate(stoks, speaker.unsqueeze(0), step=step_callback)
The strange thing is that when I use the "q4-base-en" and other models I don't get this error. I tried it with PyTorch 2.1.2 and 2.2.0.
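For what it's worth, the TypeError suggests the checkpoint's saved 'tunables' dict contains a key (force_hidden_to_emb) that the installed version of the Tunables dataclass doesn't define, i.e. the model file was saved by a newer WhisperSpeech than the one installed. A minimal sketch of a workaround, assuming Tunables is a dataclass (the Tunables stand-in and the filter_known helper below are hypothetical, not the library's actual code), is to drop unknown keys before constructing it:

```python
from dataclasses import dataclass, fields

@dataclass
class Tunables:
    # Hypothetical stand-in for whisperspeech's Tunables;
    # the real class has many more fields.
    lr0: float = 1e-3
    causal_encoder: bool = False

def filter_known(cls, spec: dict) -> dict:
    """Drop keys the dataclass doesn't accept (e.g. 'force_hidden_to_emb'
    saved by a newer library version than the one installed)."""
    known = {f.name for f in fields(cls)}
    return {k: v for k, v in spec.items() if k in known}

# Simulated config loaded from a newer checkpoint:
saved = {"lr0": 5e-4, "force_hidden_to_emb": True}

# Tunables(**saved) would raise the TypeError from the traceback;
# filtering first avoids it (at the cost of ignoring the new setting).
tunables = Tunables(**filter_known(Tunables, saved))
```

Upgrading the whisperspeech package to match the checkpoint would be the cleaner fix, which may also explain why the "q4-base-en" models (presumably saved by an older version) load fine.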