huggingface / candle

Minimalist ML framework for Rust
Apache License 2.0

Slow Stable Diffusion #735

Open notdanilo opened 1 year ago

notdanilo commented 1 year ago

Stable Diffusion is super slow. It takes more than 30 seconds to generate an example image with the default configuration on an RTX 4090 with CUDA enabled, while it would take less than 5 seconds with diffusers.

LaurentMazare commented 1 year ago

Are you using cudnn and flash-attention? If not, these are likely to speed up the generation massively. You can turn them on via --features cudnn,flash-attn and use the --use-flash-attn argument. Note that this is pretty slow to compile, and you may want to set the CANDLE_FLASH_ATTN_BUILD_DIR environment variable to ensure that it's not recompiled too often.

notdanilo commented 1 year ago

I need some help here. I am failing to build 'flash-attn'. I just installed cudnn-windows-x86_64-8.9.4.25_cuda12-archive into the NVIDIA GPU Computing Toolkit\CUDA\v12.2 folder (i.e. copied bin, include and lib), but I am getting lots of errors like these:

  [... many other rerun-if-changed defs above. Just showing relevant info below]
  cargo:rerun-if-changed=kernels/static_switch.h
  cargo:rustc-env=CUDA_INCLUDE_DIR=C:\Program Files\NVIDIA GPU Computing Toolkit\CUDA\v12.2\include
  cargo:rerun-if-env-changed=CUDA_COMPUTE_CAP
  cargo:rustc-env=CUDA_COMPUTE_CAP=sm_86

  cutlass/include\cute/numeric/math.hpp(299): error: identifier "not" is undefined
              typename std::enable_if<(not std::is_unsigned<T>::value)>::type* = nullptr>
                                       ^

  cutlass/include\cute/numeric/math.hpp(299): error: expected a ")"
              typename std::enable_if<(not std::is_unsigned<T>::value)>::type* = nullptr>
                                           ^

  cutlass/include\cute/numeric/math.hpp(299): error: expected a "," or ">"
              typename std::enable_if<(not std::is_unsigned<T>::value)>::type* = nullptr>
                                                                     ^

  cutlass/include\cute/numeric/math.hpp(299): error: the global scope has no "type"
              typename std::enable_if<(not std::is_unsigned<T>::value)>::type* = nullptr>
                                                                         ^

LaurentMazare commented 1 year ago

Googling for this error, I came across this issue. In a nutshell, flash-attn-v2 doesn't seem to support building on Windows at the moment because of cutlass.
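
For context on the log above: `not` is a standard C++ alternative token for `!`, but MSVC only accepts it in conformance mode (for example with /permissive-) or after including <iso646.h>. That is likely why the host compiler rejects the cutlass header on Windows. Below is a minimal sketch of the same enable_if pattern written with `!` so it does not depend on alternative tokens. The helper name `is_negative` is hypothetical and this is not the actual cutlass code. Whether the flash-attn build can be patched or configured this way is not something this thread confirms.

```cpp
#include <type_traits>

// Hypothetical helper, loosely modelled on the pattern shown in the error log.
// Signed (non-unsigned) overload: uses `!` instead of the alternative token `not`.
template <class T,
          typename std::enable_if<(!std::is_unsigned<T>::value)>::type* = nullptr>
constexpr bool is_negative(T x) {
    return x < T(0);
}

// Unsigned overload: an unsigned value can never be negative.
template <class T,
          typename std::enable_if<std::is_unsigned<T>::value>::type* = nullptr>
constexpr bool is_negative(T) {
    return false;
}

// Compile-time checks that each overload is selected as intended.
static_assert(is_negative(-1), "signed overload handles negative values");
static_assert(!is_negative(1u), "unsigned overload always returns false");
```

Both spellings mean the same thing to a conforming compiler, so the `!` form is simply the more portable one.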