modelscope / dash-infer

DashInfer is a native LLM inference engine aiming to deliver industry-leading performance atop various hardware architectures, including x86 and ARMv9.
Apache License 2.0
135 stars 15 forks source link

Add flash attention on intel-avx512 platform #18

Closed yejunjin closed 4 months ago