Closed (kvrban closed this issue 1 week ago)
Hey @kvrban, thanks for pointing it out; we'll take a look and update when it's fixed.
Hi @kvrban, can you try with a T4 GPU on Colab? This appears to be a CPU-only bug with autocast. We'll have a fix for CPU-only inference and a few other CPU-friendly fixes soon.
Yeah, on a T4 GPU it does run.
https://github.com/Camb-ai/MARS5-TTS/assets/33060804/48e66002-a3e7-47a2-8843-d610fbedf918
But the synthesized output audio is just a hum sound. (I had to convert the original wav to an mp4 to upload it here.)
Try a different set of params:
cfg = config_class(deep_clone=deep_clone, rep_penalty_window=100, top_p=0.8, temperature=1.0, freq_penalty=3)
The issue:
RuntimeError                              Traceback (most recent call last)
<ipython-input-7-2d05018561f0> in <cell line: 6>()
      4     top_k=100, temperature=0.7, freq_penalty=3)
      5
----> 6 ar_codes, wav_out = mars5.tts("The quick brown rat.", wav,
      7     ref_transcript,
      8     cfg=cfg)

13 frames
~/.cache/torch/hub/Camb-ai_mars5-tts_master/mars5/nn_future.py in forward(self, x, freqs_cis, positions, mask, cache)
    249     scatter_pos = (positions[-self.sliding_window:] % self.sliding_window)[None, :, None, None]
    250     scatter_pos = scatter_pos.repeat(bsz, 1, self.n_kv_heads, self.args.head_dim)
--> 251     cache.cache_k[:bsz].scatter_(dim=1, index=scatter_pos, src=xk[:, -self.sliding_window:])
    252     cache.cache_v[:bsz].scatter_(dim=1, index=scatter_pos, src=xv[:, -self.sliding_window:])
    253

RuntimeError: scatter(): Expected self.dtype to be equal to src.dtype
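For anyone trying to understand the error itself, here is a minimal sketch that reproduces the same `scatter_` dtype mismatch outside of MARS5. The tensor names mirror the traceback, but the shapes, the dtypes (a float32 cache and bfloat16 keys), and the cast used as a workaround are all assumptions for illustration. This is not necessarily how the bug arises under autocast, nor the fix that was merged.

```python
import torch

# Hypothetical stand-ins for the KV cache and the new keys (illustrative sizes).
bsz, window, n_kv_heads, head_dim = 1, 4, 2, 3
cache_k = torch.zeros(bsz, window, n_kv_heads, head_dim, dtype=torch.float32)
# Assumption: the keys arrive in a lower-precision dtype than the cache.
xk = torch.ones(bsz, window, n_kv_heads, head_dim, dtype=torch.bfloat16)

# Build the scatter index the same way the traceback does.
positions = torch.arange(window)
scatter_pos = (positions % window)[None, :, None, None]
scatter_pos = scatter_pos.repeat(bsz, 1, n_kv_heads, head_dim)

# scatter_ requires the destination and src tensors to share a dtype.
try:
    cache_k.scatter_(dim=1, index=scatter_pos, src=xk)
    mismatch_raised = False
except RuntimeError as err:
    mismatch_raised = True
    print(f"scatter_ failed: {err}")

# One possible workaround (an assumption, not the merged fix):
# cast src to the cache dtype before scattering.
cache_k.scatter_(dim=1, index=scatter_pos, src=xk.to(cache_k.dtype))
```

If you hit this on your own fork, printing `cache.cache_k.dtype` and `xk.dtype` just before the failing line should confirm whether the two tensors disagree.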
Hi @kvrban, I'm facing a similar issue on a MacBook M3 Pro. Is there a fix for this?
@kvrban @origin-s20
We've merged a fix for this. For it to take effect, you may need to delete your torch hub cache before trying it again, e.g.:
rm -rf ~/.cache/torch/hub/Camb-ai_mars5-tts_master
Or simply add force_reload=True to the torch.hub.load call.
Please note that this is a CPU-only bug and that inference on CPU will be quite slow.
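If you would rather clear the cache from Python (for example inside a notebook), a small sketch along these lines should do the same thing as the `rm -rf` command. The helper name `clear_hub_cache` is hypothetical, and the default hub location `~/.cache/torch/hub` is taken from the path above; your hub directory may differ if you have changed it.

```python
import shutil
from pathlib import Path
from typing import Optional


def clear_hub_cache(repo_dir_name: str = "Camb-ai_mars5-tts_master",
                    hub_dir: Optional[Path] = None) -> bool:
    """Delete one cached torch hub repo. Returns True if something was removed."""
    # Assumption: the default hub cache lives under ~/.cache/torch/hub.
    if hub_dir is None:
        hub_dir = Path.home() / ".cache" / "torch" / "hub"
    target = Path(hub_dir) / repo_dir_name
    if target.is_dir():
        shutil.rmtree(target)
        return True
    return False
```

Call `clear_hub_cache()` once before loading the model again. Passing `force_reload=True` to `torch.hub.load` is the simpler route when you don't mind re-downloading on that call.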
@pieterscholtz thanks it worked!