bbecausereasonss opened 2 weeks ago
Flash attention isn't really useful anymore; sdpa works fine, and sage_attn, if your hardware supports it and you can get it installed, is a lot faster.
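For anyone wondering what that looks like in practice, here's a minimal sketch of using PyTorch's built-in SDPA instead of the external flash-attn wheel (assumes PyTorch >= 2.3 and a CUDA GPU; sage_attn exposes a similar drop-in call, but check its README for the exact signature since it varies by version):

```python
import torch
import torch.nn.functional as F
from torch.nn.attention import sdpa_kernel, SDPBackend

# Toy tensors: (batch, heads, seq_len, head_dim), fp16 on GPU.
q = torch.randn(1, 8, 1024, 64, device="cuda", dtype=torch.float16)
k, v = torch.randn_like(q), torch.randn_like(q)

# By default PyTorch picks the fastest available backend automatically
# (often its internal flash-attention kernel) -- no flash-attn build needed.
out = F.scaled_dot_product_attention(q, k, v, is_causal=True)

# Optionally pin a specific backend to compare; this raises a RuntimeError
# if that backend can't run on your hardware/dtype.
with sdpa_kernel(SDPBackend.EFFICIENT_ATTENTION):
    out = F.scaled_dot_product_attention(q, k, v, is_causal=True)
```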
Looks like Win11 is shit out of luck. Maybe I should re-install WSL.
You can use flash attention on WSL, but building the whl is hit or miss; I had to reinstall torch and CUDA multiple times before it worked. Also check that your NVIDIA driver supports the same CUDA version (or newer) as the one you install inside WSL2.
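A quick way to check that from inside WSL2 (a minimal sketch; assumes torch is installed and that `nvidia-smi` is on PATH, which the WSL2 driver integration normally provides):

```python
import subprocess
import torch

# The CUDA version this torch build was compiled against, e.g. "12.1".
print("torch:", torch.__version__)
print("built with CUDA:", torch.version.cuda)
print("CUDA available:", torch.cuda.is_available())

# WSL2 shares the Windows NVIDIA driver; the header of nvidia-smi's output
# shows the highest CUDA version that driver supports, which must be >=
# the version printed above before a flash-attn build will run.
print(subprocess.run(["nvidia-smi"], capture_output=True, text=True).stdout)
```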
See above.