nicolabortignon opened 1 year ago
Have you tried with GPU instead of CPU?
For me specifically, I'm running on an M1 Ultra, so GPU (CUDA) isn't an option. I'm trying to get MPS to work for this codebase, but haven't succeeded just yet.
I'd like to use Tortoise for very long renders, so anything I can cut helps, even in a GPU context.
Did you ever get this working with MPS @nicolabortignon ? I’m just about to look at it myself.
Regarding using MPS: there's a problem in that the transformers library internally calls torch.topk(), which is not supported on MPS for top_k > 16. When I tried sending this to the CPU, Python complained that tensors were found on two different devices. Does anyone know of a workaround?
```
site-packages/transformers/generation_logits_process.py", line 236, in __call__
    indices_to_remove = scores < torch.topk(scores.to("cpu"), top_k)[0][..., -1, None]
RuntimeError: Expected all tensors to be on the same device, but found at least two devices, mps:0 and cpu!
```
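One way to sidestep that device mismatch (a sketch, not a confirmed fix from this thread): compute the top-k threshold on CPU, then move only the small threshold tensor back to the scores' device so the comparison happens on a single device. The function name `topk_filter` is hypothetical.

```python
import torch

def topk_filter(scores: torch.Tensor, top_k: int) -> torch.Tensor:
    # torch.topk is unsupported on MPS for top_k > 16, so run it on CPU,
    # then move the k-th-largest threshold back to scores' device before
    # comparing. Returns a boolean mask of entries below the threshold.
    threshold = torch.topk(scores.cpu(), top_k)[0][..., -1, None].to(scores.device)
    return scores < threshold

# Works the same on CPU or MPS tensors.
scores = torch.randn(1, 100)
mask = topk_filter(scores, top_k=50)
```

Whether the round-trip is fast enough for generation loops is another question, but it avoids the two-device RuntimeError.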
I changed "cuda" -> "cpu".
I recommend using this fork: https://github.com/seohyunjun/tortoise-tts
Will give it a try on my M1 Max. Could the same changes be easily applied to the tortoise-tts-fast version? That one has the advantage of a GUI.
I don't recommend using the GPU with torch, because MPS doesn't support fft_r2c, so you'll hit an FFT (fast Fourier transform) error; MPS is weak at complex-type computation.
[current MPS issue] https://github.com/pytorch/pytorch/issues/77764
Someday it will be fixed, I hope.
I hope this helps. 😢
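For ops MPS doesn't implement (like fft_r2c), PyTorch offers a CPU fallback via the `PYTORCH_ENABLE_MPS_FALLBACK` environment variable. A minimal sketch, assuming it is set before torch is imported (setting it afterward may not take effect):

```python
import os

# Must be set before importing torch for the fallback to take effect:
# unsupported MPS ops are then silently run on CPU instead of erroring.
os.environ.setdefault("PYTORCH_ENABLE_MPS_FALLBACK", "1")

import torch

# Pick MPS when available (Apple Silicon), otherwise fall back to CPU.
device = "mps" if torch.backends.mps.is_available() else "cpu"
x = torch.randn(8, device=device)
print(f"running on {x.device}")
```

The fallback trades correctness for speed: each fallback op round-trips tensors through the CPU, so it keeps things running rather than making them fast.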
I've just started looking into Tortoise. Impressive body of work.
Just by reading the WIP paper, it's clear to me there is so much under the hood to tweak and play with.
As I'd prefer to keep exploring it locally, I want to find a way to reduce inference time. Has anyone here had thoughts on how to cut the computational cost of the autoregression and candidate-selection steps? For instance:
Any other thoughts?