Closed vatsal2473 closed 2 months ago
No, it is expected to take that long: `benchmark_image.py` will include compilation time, since you have enabled `--compile`.
To reuse the compiled model across runs, you should incorporate caching: https://pytorch.org/tutorials/recipes/torch_compile_caching_tutorial.html
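As a minimal sketch of what the linked caching recipe suggests: persisting Inductor's on-disk caches lets subsequent runs reuse compiled artifacts instead of recompiling. The environment variable names below come from the PyTorch compile caching tutorial; the cache directory path is an arbitrary example.

```shell
# Enable torch.compile's FX graph cache and point it at a persistent directory,
# so the second and later runs skip most of the compilation cost.
export TORCHINDUCTOR_FX_GRAPH_CACHE=1
export TORCHINDUCTOR_CACHE_DIR=/tmp/inductor_cache

# First run: pays the full compile cost and populates the cache.
# Later runs: reuse the cached artifacts and should be much faster.
python benchmark_image.py --quantization fp8dqrow --compile
```

Note this speeds up cold starts across processes; within a single process, the compiled model is already reused after the first call.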
I am running the `benchmark_image.py` script with the following command:
python benchmark_image.py --quantization fp8dqrow --compile
I have noticed that the `run_inference` function takes a very long time (more than 5 minutes) to generate the image. Shouldn't the function complete in 3-5 seconds? Please let me know if I am doing something wrong. Thanks in advance! This is my server configuration