flashinfer-ai / flashinfer

FlashInfer: Kernel Library for LLM Serving
https://flashinfer.ai
Apache License 2.0
1.22k stars, 115 forks

3rdparty: add dependency to cutlass and composable kernels #269

Closed: yzh119 closed this 4 months ago

yzh119 commented 4 months ago

For group GEMM in MoE and LoRA models, we use a combination of custom CUDA kernels together with CUTLASS (for NVIDIA GPUs) and Composable Kernel (for AMD GPUs).

This PR adds both repositories as dependencies.
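For readers unfamiliar with the term, here is a minimal pure-Python sketch of what a group GEMM computes: a batch of independent matrix multiplications whose shapes may differ per group (for example, one group per MoE expert or per LoRA adapter). The names `matmul` and `group_gemm_reference` are hypothetical and chosen for illustration only. This is not FlashInfer's API, and it says nothing about how the CUTLASS or Composable Kernel implementations are structured; it only shows the expected result.

```python
from typing import List, Sequence

Matrix = List[List[float]]


def matmul(a: Matrix, b: Matrix) -> Matrix:
    """Plain dense matrix multiply: (m x k) @ (k x n) -> (m x n)."""
    inner, cols = len(b), len(b[0])
    if any(len(row) != inner for row in a):
        raise ValueError("inner dimensions do not match")
    return [
        [sum(row[k] * b[k][j] for k in range(inner)) for j in range(cols)]
        for row in a
    ]


def group_gemm_reference(xs: Sequence[Matrix], ws: Sequence[Matrix]) -> List[Matrix]:
    """Hypothetical reference: multiply each group's input by that group's weight.

    Each (x, w) pair is an independent problem, and the shapes may differ
    between groups. A GPU kernel would typically handle all groups in a
    single launch; this loop only defines the result.
    """
    if len(xs) != len(ws):
        raise ValueError("need exactly one weight matrix per group")
    return [matmul(x, w) for x, w in zip(xs, ws)]


# Two groups with different shapes: (1 x 2) @ (2 x 1) and (3 x 2) @ (2 x 2).
xs = [
    [[1.0, 2.0]],
    [[1.0, 0.0], [0.0, 1.0], [2.0, 2.0]],
]
ws = [
    [[1.0], [1.0]],
    [[3.0, 4.0], [5.0, 6.0]],
]
outputs = group_gemm_reference(xs, ws)
```

In an MoE layer, each group would hold the tokens routed to one expert, so the row count varies from group to group. That variation is why a plain batched GEMM with a single fixed shape does not fit directly.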