sayakpaul / diffusers-torchao

End-to-end recipes for optimizing diffusion models with torchao and diffusers (inference and FP8 training).
Apache License 2.0

run_inference function taking too long #28

Closed. vatsal2473 closed this issue 2 months ago.

vatsal2473 commented 2 months ago

I am running the benchmark_image.py script with the following command:

```sh
python benchmark_image.py --quantization fp8dqrow --compile
```

I have noticed that the run_inference function takes a very long time (more than 5 minutes) to generate an image. Shouldn't it complete in 3-5 seconds? Please let me know if I am doing something wrong. Thanks in advance!

This is my server configuration:

(Screenshot of server configuration, captured 2024-09-12.)

sayakpaul commented 2 months ago

No, that is expected: since you enabled --compile, the benchmark_image.py run includes the torch.compile compilation step, and the first call to run_inference pays that one-time cost. Subsequent calls with the same compiled model run at steady-state speed.
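To see the effect in isolation, here is a minimal sketch (a toy nn.Linear standing in for the diffusion model, assuming a CUDA GPU is available) showing that the first call through torch.compile absorbs the compilation time while later calls are fast:

```python
import time
import torch

# Toy module standing in for the pipeline's transformer; in the real
# benchmark this would be the diffusion model being compiled.
model = torch.nn.Linear(1024, 1024).cuda()
compiled = torch.compile(model, mode="max-autotune")

x = torch.randn(8, 1024, device="cuda")

# First call: triggers graph capture and Inductor codegen, so it is slow.
# With max-autotune on a full diffusion model, minutes are plausible.
t0 = time.perf_counter()
compiled(x)
torch.cuda.synchronize()
print(f"first (compile) call: {time.perf_counter() - t0:.2f}s")

# Subsequent calls reuse the compiled kernels and are fast.
t0 = time.perf_counter()
compiled(x)
torch.cuda.synchronize()
print(f"steady-state call:    {time.perf_counter() - t0:.4f}s")
```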

To reuse the compiled model across runs and avoid recompiling every time, incorporate compile caching: https://pytorch.org/tutorials/recipes/torch_compile_caching_tutorial.html
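For example, following that tutorial, you can point Inductor's on-disk cache at a persistent directory via environment variables. A minimal sketch (the cache directory path is a placeholder; set the variables before importing torch):

```python
import os

# Point Inductor's on-disk cache at a persistent directory so compiled
# artifacts survive across processes. Placeholder path; pick any location.
os.environ["TORCHINDUCTOR_CACHE_DIR"] = "./torchinductor_cache"
# Enable the on-disk FX graph cache (already on by default in recent
# PyTorch releases; setting it explicitly does no harm).
os.environ["TORCHINDUCTOR_FX_GRAPH_CACHE"] = "1"

import torch

model = torch.nn.Linear(1024, 1024).cuda()
compiled = torch.compile(model, mode="max-autotune")
compiled(torch.randn(8, 1024, device="cuda"))
# On the next process start with the same cache dir, much of the
# compilation work is served from cache, so warm-up is far shorter
# (though Dynamo tracing still runs, so it is not instantaneous).
```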