ImportError: DLL load failed while importing flash_attn_2_cuda
Closed · ParisNeo closed this 2 weeks ago
ExLlama doesn't import the flash_attn_2_cuda module directly, so this error seems to be generated by the flash-attn library itself. I'd need more of the stack trace to say for sure. Does your flash-attn build match your PyTorch version?
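For reference, a quick way to check whether the installed flash-attn wheel lines up with the running PyTorch is to compare versions and try the failing import directly. This is a minimal diagnostic sketch; the version strings in the comments are only illustrative:

```python
# Minimal diagnostic sketch: compare the torch / CUDA versions the
# environment is running against the installed flash-attn wheel, then
# attempt the import that is failing in the report above.
import torch

print("torch:", torch.__version__)        # e.g. 2.2.0+cu121
print("torch CUDA:", torch.version.cuda)  # e.g. 12.1

try:
    import flash_attn
    print("flash-attn:", flash_attn.__version__)
    import flash_attn_2_cuda  # the compiled extension that fails to load
    print("flash_attn_2_cuda imported OK")
except ImportError as e:
    # A wheel built against a different torch/CUDA than the one installed
    # typically surfaces here as a DLL load failure on Windows.
    print("import failed:", e)
```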
Had the same error when I was using a pre-built flash-attn wheel on Windows; I had to rebuild my own wheel after updating torch from 2.1 to 2.2.
Hi. It worked once I recompiled everything on my PC. But the problem is that since I am integrating this into my lollms app, users complain about failed compilations. I don't want to force them to install Visual Studio (or bundle its installer with my tool) on Windows, or build-essential on Linux, which would make the install procedure more complex.
I really wish there were precompiled wheels they could just use. Since my project is 100% free and unsponsored, and I'm not someone with big resources, I can't afford to run an automated build system for every possible platform :(
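One way to avoid forcing end users to compile anything is to treat flash-attn as an optional accelerator and fall back gracefully when the extension can't be loaded. A minimal sketch of that idea; the `HAS_FLASH_ATTN` flag, the warning text, and the `attention` wrapper are hypothetical names for illustration, not lollms or ExLlama API:

```python
# Sketch: load flash-attn opportunistically so users without a working
# wheel (or without a compiler) can still run, just without flash attention.
import warnings

try:
    from flash_attn import flash_attn_func  # compiled extension loads here
    HAS_FLASH_ATTN = True
except ImportError as e:
    HAS_FLASH_ATTN = False
    warnings.warn(
        f"flash-attn unavailable ({e}); falling back to standard attention. "
        "Install a wheel matching your torch/CUDA build to enable it."
    )

def attention(q, k, v):
    if HAS_FLASH_ATTN:
        # flash_attn_func expects (batch, seqlen, nheads, headdim) layout.
        return flash_attn_func(q, k, v)
    # Fallback: PyTorch's built-in scaled dot-product attention (torch >= 2.0),
    # which expects (batch, nheads, seqlen, headdim); transpose as needed.
    import torch
    return torch.nn.functional.scaled_dot_product_attention(q, k, v)
```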