j-f1 closed this issue 1 year ago
Potentially you could try not tracing the UNet, if that is causing the issue: https://github.com/riffusion/riffusion-inference/blob/main/riffusion/server.py#L99
I do expect it to be too slow for real time on MPS, and not tracing will slow it down further.
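A minimal sketch of what gating the trace on device type could look like. This assumes a standard `torch.nn.Module` UNet and `torch.jit.trace`; the `maybe_trace_unet` helper is hypothetical, not code from the repo:

```python
import torch

def maybe_trace_unet(unet: torch.nn.Module, example_inputs, device: torch.device):
    """Trace the UNet for faster inference on CUDA; skip tracing on MPS/CPU,
    where the trace can fail or hit unsupported ops."""
    if device.type == "cuda":
        # torch.jit.trace records the forward pass into a static graph,
        # which typically speeds up repeated inference calls
        return torch.jit.trace(unet, example_inputs)
    # Eager execution: somewhat slower, but avoids trace-time failures on MPS
    return unet
```

On the non-CUDA path the module is returned unchanged, so callers can use the result identically whether or not tracing happened.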
Can confirm removing the trace works. I also did my own trace, but see a roughly negligible difference, perhaps because aten::repeat_interleave.self_int is not supported on MPS.
I get ~2 it/s on an Apple M1 Max (~18 s per image generation).
The real bottleneck appears to be wav_bytes_from_spectrogram_image at ~100 s. It doesn't look like MPS supports GriffinLim; it segfaults for me, but I only gave it a brief look.
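For context on why this step is expensive and FFT-dependent: Griffin-Lim reconstructs phase from a magnitude spectrogram by repeated STFT round trips. A minimal NumPy/SciPy sketch of the algorithm, purely illustrative and not the project's actual implementation:

```python
import numpy as np
from scipy.signal import stft, istft

def griffin_lim(mag, n_iter=32, nperseg=256, noverlap=192, seed=0):
    """Estimate a waveform from an STFT magnitude by alternately
    resynthesizing audio and re-estimating phase (Griffin-Lim)."""
    rng = np.random.default_rng(seed)
    # Start from random phase; each iteration keeps the target magnitude
    # and takes the phase of the resynthesized signal's STFT.
    phase = rng.uniform(-np.pi, np.pi, size=mag.shape)
    for _ in range(n_iter):
        _, x = istft(mag * np.exp(1j * phase), nperseg=nperseg, noverlap=noverlap)
        _, _, spec = stft(x, nperseg=nperseg, noverlap=noverlap)
        phase = np.angle(spec)
    return x
```

Every iteration is an inverse and forward STFT over complex values, which is exactly the kind of ComplexFloat work that an accelerator backend has to support for the vocoding step to run on-device.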
Interesting about GriffinLim. There's been some talk of finding a neural vocoder that has better quality, could be something to track.
Follow-up: a lot of Fourier operations are not supported on MPS yet, in particular the ComplexFloat data type: https://github.com/pytorch/pytorch/issues/78044
Until that is resolved, generation could work but the entire stack will not.
MPS is now supported as a device with CPU fallback for some operations. See the README for a description!
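One practical note on the CPU fallback, assuming a recent PyTorch build: PyTorch can route individual ops that are missing on the MPS backend to the CPU via an environment variable. The server invocation shown in the comment is illustrative, not the project's documented command:

```shell
# Ask PyTorch to run ops missing on the MPS backend on the CPU instead
export PYTORCH_ENABLE_MPS_FALLBACK=1
# then launch the server as usual, e.g.:
#   python -m riffusion.server   (illustrative invocation, check the README)
```

This keeps the bulk of the model on the GPU while unsupported ops (such as the complex-valued FFT work mentioned above) transparently fall back, at some cost in transfer overhead.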
I'm not sure if current-gen Apple Silicon GPUs are capable of doing the computation fast enough (probably not, to be honest), but it would be great to get it working so folks can at least try it out. I tried changing all the mentions of `cuda` in the project to `mps`, but I'm getting an error in TorchScript which suggests some changes need to be made to the model so it doesn't assume CUDA. Is there a way to fix/patch this?
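One way to avoid hard-coding the device name throughout the code is a small selection helper. This is a sketch; `pick_device` is a hypothetical function, not part of the repo:

```python
import torch

def pick_device() -> torch.device:
    """Pick the best available backend: CUDA first, then MPS, then CPU."""
    if torch.cuda.is_available():
        return torch.device("cuda")
    # torch.backends.mps is present in PyTorch >= 1.12
    mps = getattr(torch.backends, "mps", None)
    if mps is not None and mps.is_available():
        return torch.device("mps")
    return torch.device("cpu")
```

Calling `model.to(pick_device())` at startup then replaces the scattered `cuda` literals with a single decision point, which is also where a project could warn about known-unsupported ops on MPS.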