Dao-AILab / flash-attention

Fast and memory-efficient exact attention
BSD 3-Clause "New" or "Revised" License

Can we use this with projects like XInference? #1018

Open Greatz08 opened 4 months ago

Greatz08 commented 4 months ago

If yes, then how? :-))

tridao commented 4 months ago

I'm not familiar with XInference.
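
For context, flash-attention itself is just a PyTorch library exposing attention kernels; whether an inference framework such as XInference benefits from it depends on whether its model backend routes attention through this package. Below is a minimal sketch of calling the library directly via its documented `flash_attn_func` interface (the tensor shapes and flags follow the repo's README); the XInference-side integration itself is not shown and would depend on that project's configuration.

```python
# Minimal sketch: calling flash-attention directly from PyTorch.
import torch
from flash_attn import flash_attn_func

batch, seqlen, nheads, headdim = 2, 1024, 8, 64

# flash-attn expects fp16/bf16 tensors on a CUDA device,
# laid out as (batch, seqlen, nheads, headdim).
q = torch.randn(batch, seqlen, nheads, headdim, device="cuda", dtype=torch.float16)
k = torch.randn(batch, seqlen, nheads, headdim, device="cuda", dtype=torch.float16)
v = torch.randn(batch, seqlen, nheads, headdim, device="cuda", dtype=torch.float16)

# causal=True applies the usual autoregressive (decoder) mask.
out = flash_attn_func(q, k, v, causal=True)
print(out.shape)  # (batch, seqlen, nheads, headdim)
```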