flashinfer-ai / flashinfer

FlashInfer: Kernel Library for LLM Serving
https://flashinfer.ai
Apache License 2.0
1.21k stars, 111 forks

refactor: remove `page_size` from template parameters for prefill kernels #306

Closed. yzh119 closed this 3 months ago

yzh119 commented 3 months ago

Similar to #301, this PR removes `page_size` from the template parameters of the prefill kernels, so they can support any `page_size` (previously only a fixed set of values such as 1, 4, 8, 16, 32 was supported). It also reduces binary size and compilation time, since the kernels no longer need a separate instantiation per page size.
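To illustrate the general idea, here is a minimal host-side C++ sketch of the difference between a compile-time and a runtime `page_size`. It is not FlashInfer's actual kernel code: the function names `token_offset_static` and `token_offset_dynamic`, the `page_table` layout, and the offset formula are all assumptions made up for this example.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical "before" form: page_size is a template parameter, so every
// supported value (1, 4, 8, 16, 32, ...) needs its own instantiation, and
// unsupported values cannot be used at all.
template <uint32_t PAGE_SIZE>
size_t token_offset_static(const std::vector<int32_t>& page_table,
                           uint32_t token_idx) {
  static_assert(PAGE_SIZE > 0, "page size must be positive");
  const uint32_t page = token_idx / PAGE_SIZE;       // logical page of the token
  const uint32_t slot = token_idx % PAGE_SIZE;       // position inside that page
  return static_cast<size_t>(page_table[page]) * PAGE_SIZE + slot;
}

// Hypothetical "after" form: page_size is a runtime argument, so a single
// compiled function handles any positive page size.
inline size_t token_offset_dynamic(const std::vector<int32_t>& page_table,
                                   uint32_t page_size, uint32_t token_idx) {
  assert(page_size > 0);
  const uint32_t page = token_idx / page_size;
  const uint32_t slot = token_idx % page_size;
  return static_cast<size_t>(page_table[page]) * page_size + slot;
}
```

With the runtime form, a call such as `token_offset_dynamic(page_table, 5, token_idx)` works for a page size of 5 without any new instantiation. The trade-off in a sketch like this is that the division and modulo can no longer be folded into shifts and masks at compile time.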