flashinfer-ai / flashinfer

FlashInfer: Kernel Library for LLM Serving
https://flashinfer.ai
Apache License 2.0

bugfix: use `FlagHeads` instead of `SubtractLeft` for cuda 118 #265

Closed yzh119 closed 1 month ago

yzh119 commented 1 month ago

The CUB version shipped with CUDA 11.8 is too old to use `SubtractLeft`, so this change uses the older `FlagHeads` API instead.
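To illustrate why one call can stand in for the other, here is a minimal host-side sketch, not the code from this PR and not a call into CUB. It models the two operations as I understand them from the CUB documentation: a "subtract left" pass copies the first item and applies `diff_op(current, previous)` to the rest, while a "flag heads" pass with an explicit predecessor applies `flag_op(previous, current)` to every item. All function names below are hypothetical, and the choice of `0` as predecessor is an assumption that fits a 0/1 mask.

```cpp
#include <cstddef>
#include <vector>

// Host-side model (assumption) of a "subtract left" adjacent difference:
// out[0] is copied from in[0]; out[i] = diff_op(in[i], in[i - 1]) for i > 0.
template <typename T, typename DiffOp>
std::vector<T> subtract_left_model(const std::vector<T>& in, DiffOp diff_op) {
  std::vector<T> out(in.size());
  for (std::size_t i = 0; i < in.size(); ++i) {
    out[i] = (i == 0) ? in[i] : diff_op(in[i], in[i - 1]);
  }
  return out;
}

// Host-side model (assumption) of "flag heads" with an explicit predecessor:
// flags[i] = flag_op(previous, in[i]), where previous is `predecessor` for i == 0.
// Note the argument order is (previous, current), the reverse of diff_op above.
template <typename T, typename FlagOp>
std::vector<int> flag_heads_model(const std::vector<T>& in, FlagOp flag_op,
                                  T predecessor) {
  std::vector<int> flags(in.size());
  for (std::size_t i = 0; i < in.size(); ++i) {
    const T& prev = (i == 0) ? predecessor : in[i - 1];
    flags[i] = flag_op(prev, in[i]) ? 1 : 0;
  }
  return flags;
}

// Hypothetical use case: mark the position where a 0/1 mask changes value.
std::vector<int> first_true_via_subtract_left(const std::vector<int>& mask) {
  return subtract_left_model(
      mask, [](int cur, int prev) { return (cur != prev) ? 1 : 0; });
}

// Same result through the flag-heads model, using 0 as the predecessor so that
// a leading 1 in the mask is flagged, matching the copied first item above.
std::vector<int> first_true_via_flag_heads(const std::vector<int>& mask) {
  return flag_heads_model(
      mask, [](int prev, int cur) { return prev != cur; }, 0);
}
```

For a mask whose entries are only 0 or 1, both helpers return the same vector. The two points to watch when swapping the real calls are the reversed operator argument order and the handling of the first item.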

Related issue: #261