LaurentMazare / diffusers-rs

An implementation of the diffusers api in Rust
Apache License 2.0
535 stars 55 forks

Same device on DDPMScheduler #48

Closed mspronesti closed 1 year ago

mspronesti commented 1 year ago

Now that I have the chance to use a GPU to run diffusion experiments, I noticed that one of the schedulers I implemented (DDPMScheduler) performs operations on tensors that live on different devices. This PR fixes it. I double-checked that all the other schedulers are sound in this respect.
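For readers unfamiliar with this class of bug, here is a minimal sketch of the failure mode and the usual remedy. It is a toy model only: the `Device` and `Tensor` types and the `add_noise` helper below are hypothetical stand-ins written with the standard library, not the tch types and not the code changed in this PR. The idea is simply that a binary operation should fail (or be avoided) when its operands live on different devices, and that moving one operand to the other's device first resolves it.

```rust
// Toy model only: `Device` and `Tensor` are hypothetical stand-ins,
// not the tch types, so the example runs without libtorch.
#[derive(Clone, Copy, Debug, PartialEq)]
enum Device {
    Cpu,
    Cuda(usize),
}

#[derive(Clone, Debug, PartialEq)]
struct Tensor {
    data: Vec<f64>,
    device: Device,
}

impl Tensor {
    fn new(data: Vec<f64>, device: Device) -> Self {
        Tensor { data, device }
    }

    /// Returns a copy of this tensor that lives on `device`.
    fn to_device(&self, device: Device) -> Tensor {
        Tensor { data: self.data.clone(), device }
    }

    /// Element-wise addition that refuses to mix devices or lengths.
    fn add(&self, other: &Tensor) -> Result<Tensor, String> {
        if self.device != other.device {
            return Err(format!(
                "device mismatch: {:?} vs {:?}",
                self.device, other.device
            ));
        }
        if self.data.len() != other.data.len() {
            return Err("length mismatch".to_string());
        }
        let data = self
            .data
            .iter()
            .zip(&other.data)
            .map(|(a, b)| a + b)
            .collect();
        Ok(Tensor { data, device: self.device })
    }
}

/// Hypothetical scheduler helper: moves `noise` to the device of
/// `sample` before combining the two.
fn add_noise(sample: &Tensor, noise: &Tensor) -> Result<Tensor, String> {
    sample.add(&noise.to_device(sample.device))
}
```

With a sample on `Device::Cuda(0)` and noise created on `Device::Cpu`, calling `sample.add(&noise)` directly returns an error in this model, while `add_noise(&sample, &noise)` succeeds and yields a result on the sample's device.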

LaurentMazare commented 1 year ago

Thanks! Out of curiosity, what GPU did you end up using? How fast does the process run there compared to your CPU?

mspronesti commented 1 year ago

Thanks for merging! I used the environment I set up in #46. Compared to my CPU, it is around 45x faster, ignoring the time of the first compilation. I repeated the experiment 10 times using the DDIM scheduler with 50 inference steps:

| Hardware | Avg. inference time |
| --- | --- |
| Intel Xeon CPU @ 2.20 GHz | 24 min |
| NVIDIA T4 Tensor Core GPU | 32.4 s |
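As a quick sanity check on the "around 45x" figure, the two averages can be put in the same unit and divided. The sketch below uses hypothetical helper names (`minutes_to_secs`, `speedup`) and only the two numbers reported above.

```rust
/// Converts a duration in minutes to seconds.
fn minutes_to_secs(minutes: f64) -> f64 {
    minutes * 60.0
}

/// Ratio of the baseline time to the accelerated time, both in seconds.
fn speedup(baseline_secs: f64, accelerated_secs: f64) -> f64 {
    baseline_secs / accelerated_secs
}
```

Calling `speedup(minutes_to_secs(24.0), 32.4)` gives 1440 / 32.4, roughly 44.4, which is consistent with the reported figure.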

As for the Python version, performance seems comparable. A couple of days ago I was wondering whether working with Vec<_>s instead of tch::Tensors slows the inference process down a little (even though we only read from them: we don't perform operations on timesteps or sigmas, and when we do, we access values that we then combine with Tensors).

mspronesti commented 1 year ago

Updated the above reply with some numbers and some more details on the experimental setup.

I also noticed that all the checks failed when you merged this pull request. I'm a little surprised, because they all succeeded when I opened it. I mirrored the repo and re-ran all the jobs. After the second attempt, 6 more passed. On the third re-run, they all passed. Perhaps a recent release of the actions broke something?