Open Godlovecui opened 4 days ago
Hello, I installed flashinfer via AOT. Where do I set `q_data_type` to `torch.bfloat16` in the `plan` function?
Thank you~
I think vLLM currently uses the v0.1.5-style API, so you can specify `q_data_type` in the `begin_forward` function instead of `plan`.
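As a rough sketch of the suggestion above: with the older v0.1.5-style API, the dtype would be passed when the plan/begin step is called. The helper name, the argument names, and the exact `begin_forward` signature here are assumptions for illustration only; check the signature in your installed flashinfer version, since newer releases renamed `begin_forward` to `plan`.

```python
# Hedged sketch (not a verified flashinfer call): passing q_data_type at
# plan time with a v0.1.5-style decode wrapper. All parameter names below
# are illustrative assumptions -- consult your flashinfer version's docs.

def plan_decode_bf16(wrapper, kv_indptr, kv_indices, kv_last_page_len,
                     num_qo_heads, num_kv_heads, head_dim, page_size):
    import torch  # imported locally so this sketch stays self-contained

    # In the v0.1.5-style API the plan step is begin_forward; the query
    # dtype asked about in the issue would be forwarded here.
    wrapper.begin_forward(
        kv_indptr,
        kv_indices,
        kv_last_page_len,
        num_qo_heads,
        num_kv_heads,
        head_dim,
        page_size,
        q_data_type=torch.bfloat16,  # query dtype requested in the issue
    )
```

In newer flashinfer releases the same idea applies to `plan`, which also accepts a query dtype argument, so upgrading and calling `plan(..., q_data_type=torch.bfloat16)` may be the forward-looking fix.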