flashinfer-ai / flashinfer

FlashInfer: Kernel Library for LLM Serving
https://flashinfer.ai
Apache License 2.0

ValueError: The dtype of q torch.bfloat16 does not match the q_data_type torch.float16 specified in plan function. #638

Open Godlovecui opened 4 days ago

Godlovecui commented 4 days ago

Hello, I installed flashinfer via AOT. Where do I set q_data_type to torch.bfloat16 in the plan function? [screenshot]

Thank you~

yzh119 commented 4 days ago

I think vllm currently uses the v0.1.5-style API, so you can specify the q_data_type in the begin_forward function.
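For context, the error arises because the wrapper records a q_data_type when you call plan(), and later checks that the query tensor you actually pass has that same dtype. The sketch below is a hypothetical stand-in (the class and method bodies are illustrative, not flashinfer internals) showing that plan/run consistency check; per the error message itself, the v0.2-style fix is to pass the matching q_data_type (e.g. torch.bfloat16) to plan, or to begin_forward on the v0.1.5-style API as noted above.

```python
# Hypothetical sketch (NOT flashinfer internals) of the dtype-consistency
# check behind this error: the wrapper stores q_data_type at plan() time
# and validates the query's dtype at run() time.
class AttentionWrapperSketch:
    """Illustrative stand-in for flashinfer's prefill/decode wrappers."""

    def __init__(self):
        self._q_data_type = None

    def plan(self, q_data_type="torch.float16"):
        # The fix for the reported error is to make this match the
        # dtype of the q tensor passed later (e.g. "torch.bfloat16").
        self._q_data_type = q_data_type

    def run(self, q_dtype):
        # Reproduces the ValueError from the issue title on mismatch.
        if q_dtype != self._q_data_type:
            raise ValueError(
                f"The dtype of q {q_dtype} does not match the "
                f"q_data_type {self._q_data_type} specified in plan function."
            )
        return "attention output"


# Matching dtypes succeed; a mismatch raises the reported ValueError.
wrapper = AttentionWrapperSketch()
wrapper.plan(q_data_type="torch.bfloat16")
print(wrapper.run("torch.bfloat16"))
```

With the real library, the same principle applies: whatever dtype your query tensor has at run time must be the one declared at plan (or begin_forward) time.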