facebookresearch / xformers

Hackable and optimized Transformers building blocks, supporting a composable construction.
https://facebookresearch.github.io/xformers/

Why is there a performance regression with xformers 0.0.27.post2 + PyTorch 2.4.0+cu121 on an old NVIDIA GPU? #1075

Closed: motolo closed this issue 3 months ago

motolo commented 3 months ago

❓ Questions and Help

On my GTX 1060 card with 6 GB of VRAM, I noticed a significant drop in performance during the generation phase in ComfyUI, even though everything is configured correctly: same operating temperatures, same VRAM usage, same GPU frequencies.

I understand that my video card is old, but going from 17 seconds per generation with the default workflow to 1:21 suggests that something is wrong in both xformers and PyTorch.

Do you have any idea why this could be happening?

[Screenshot: Schermata_20240730_114113]

lw commented 3 months ago

Could you elaborate on why you believe this issue is caused by xFormers?

motolo commented 3 months ago

Because even with PyTorch 2.3.1 and xformers 0.0.27 I have exactly the same problem.

motolo commented 3 months ago

It could also be a combination of both factors. I am asking for help, or to find out whether anyone else has had the same problem.

motolo commented 3 months ago

Found the solution: update CUDA to version 12.4, then remove force fp16 and replace it with force fp32 :)
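For anyone comparing setups before and after a fix like this, here is a minimal sketch of a script that records the installed PyTorch and xformers versions, the CUDA version PyTorch was built against, and the GPU's compute capability. The helper names `installed_version` and `collect_environment` are made up for illustration and are not part of either library.

```python
from importlib import metadata


def installed_version(package: str) -> str:
    """Return the installed version of `package`, or 'not installed'."""
    try:
        return metadata.version(package)
    except metadata.PackageNotFoundError:
        return "not installed"


def collect_environment() -> dict:
    """Gather version details that are useful to include in a bug report."""
    info = {name: installed_version(name) for name in ("torch", "xformers")}
    try:
        import torch  # optional: only needed for the CUDA/GPU details
    except (ImportError, OSError):
        return info
    # CUDA version PyTorch was compiled against (None for CPU-only builds).
    info["torch_cuda_build"] = str(torch.version.cuda)
    if torch.cuda.is_available():
        info["gpu"] = torch.cuda.get_device_name(0)
        major, minor = torch.cuda.get_device_capability(0)
        info["compute_capability"] = f"{major}.{minor}"
    return info


if __name__ == "__main__":
    for key, value in collect_environment().items():
        print(f"{key}: {value}")
```

Running it once in the slow environment and once in the fast one makes it easier to see which component actually changed. xformers also ships its own report via `python -m xformers.info`, which lists the available attention kernels.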