flashinfer-ai / flashinfer

FlashInfer: Kernel Library for LLM Serving
https://flashinfer.ai
Apache License 2.0
1.22k stars 115 forks

Add dtype checks for q-kv tensors #280

Closed (Yard1 closed 4 months ago)

Yard1 commented 4 months ago

Right now, we require the q and kv tensors to have the same dtype, but this is not enforced, so a misconfiguration can surface as cryptic memory errors. This PR adds a check that rejects mismatched dtypes.
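For illustration only, here is a minimal sketch of the kind of check described above. It is not the code from this PR: `check_qkv_dtype` and `FakeTensor` are hypothetical names, and the stand-in class only mimics the `dtype` attribute that a real tensor (for example a `torch.Tensor`) would expose. The idea is to fail early with a readable message instead of letting a kernel read memory with the wrong element size.

```python
from dataclasses import dataclass


@dataclass
class FakeTensor:
    """Hypothetical stand-in for a tensor; only the `dtype` attribute matters here."""
    dtype: str


def check_qkv_dtype(q, kv):
    """Raise ValueError if q and kv do not share a dtype (hypothetical helper)."""
    if q.dtype != kv.dtype:
        raise ValueError(
            f"q and kv must have the same dtype, got q.dtype={q.dtype} "
            f"and kv.dtype={kv.dtype}"
        )


q = FakeTensor(dtype="float16")
kv_ok = FakeTensor(dtype="float16")
kv_bad = FakeTensor(dtype="float32")

# Matching dtypes: the check passes silently.
check_qkv_dtype(q, kv_ok)

# Mismatched dtypes: the check raises before any kernel would be launched.
try:
    check_qkv_dtype(q, kv_bad)
except ValueError as err:
    print(err)
```

Because the helper only compares `q.dtype` and `kv.dtype`, the same function should work unchanged on any tensor-like objects whose dtypes support equality comparison.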