Closed yzh119 closed 3 weeks ago
In #294, we set `padded_batch_size_` to `num_kv_heads * batch_size` when the KV is not split; it should be `batch_size`.
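
To illustrate the difference, here is a minimal Python sketch of the no-split path only. The function names are hypothetical and not part of the codebase; the split-KV path is not described in this PR and is deliberately left out.

```python
def padded_batch_size_before_fix(batch_size: int, num_kv_heads: int) -> int:
    """Hypothetical stand-in for the no-split behaviour introduced in #294."""
    # The value was scaled by the number of KV heads, which over-counts.
    return num_kv_heads * batch_size


def padded_batch_size_after_fix(batch_size: int, num_kv_heads: int) -> int:
    """Hypothetical stand-in for the corrected no-split behaviour."""
    # num_kv_heads is accepted only to keep the two signatures comparable.
    return batch_size


if __name__ == "__main__":
    print(padded_batch_size_before_fix(batch_size=4, num_kv_heads=8))
    print(padded_batch_size_after_fix(batch_size=4, num_kv_heads=8))
```

With `batch_size = 4` and `num_kv_heads = 8`, the first function returns 32 and the second returns 4.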