punica-ai / punica

Serving multiple LoRA finetuned LLM as one
https://arxiv.org/abs/2310.18547
Apache License 2.0
883 stars 40 forks

fix(bgmv): write shared_memory y_warpsize only when threadIdx.x == 0 #51

Open menggeliu1205 opened 1 month ago

menggeliu1205 commented 1 month ago

The write to the shared-memory array y_warpsize should be guarded by threadIdx.x == 0. Otherwise the kernel produces a wrong answer.
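
To illustrate the guard described above, here is a minimal sketch and not Punica's actual bgmv code. It assumes a block of shape (32, num_warps), so that each value of threadIdx.y is one warp, and it assumes the per-warp value comes from a shuffle-down reduction. The names `block_sum_kernel` and `block_sum` are hypothetical; `y_warpsize` follows the name used in this PR's title. After a shuffle-down reduction only lane 0 holds the full warp sum, so if all 32 lanes wrote to the same shared-memory slot, the slot could end up holding a partial sum from some other lane.

```cuda
#include <cuda_runtime.h>
#include <vector>

// Hypothetical kernel for illustration: one block of (32, num_warps)
// threads sums the n floats of `in` into out[0].
__global__ void block_sum_kernel(const float* in, float* out, int n) {
  extern __shared__ float y_warpsize[];  // one partial sum per warp

  // Each thread accumulates a strided slice of the input.
  int tid = threadIdx.y * blockDim.x + threadIdx.x;
  int num_threads = blockDim.x * blockDim.y;
  float y = 0.f;
  for (int i = tid; i < n; i += num_threads) y += in[i];

  // Warp-level tree reduction. Afterwards only lane 0 (threadIdx.x == 0)
  // holds the sum of the whole warp; other lanes hold partial sums.
  for (int offset = 16; offset > 0; offset /= 2) {
    y += __shfl_down_sync(0xffffffffu, y, offset);
  }

  // The guard under discussion: only lane 0 publishes the warp's result.
  // Without it, all 32 lanes would race to write different values here.
  if (threadIdx.x == 0) {
    y_warpsize[threadIdx.y] = y;
  }
  __syncthreads();

  // A single thread combines the per-warp partial sums.
  if (tid == 0) {
    float total = 0.f;
    for (unsigned w = 0; w < blockDim.y; ++w) total += y_warpsize[w];
    out[0] = total;
  }
}

// Hypothetical host helper: launches one block and returns the sum.
// CUDA error checking is omitted to keep the sketch short.
float block_sum(const std::vector<float>& host_in, int num_warps) {
  int n = static_cast<int>(host_in.size());
  float* d_in = nullptr;
  float* d_out = nullptr;
  cudaMalloc(&d_in, n * sizeof(float));
  cudaMalloc(&d_out, sizeof(float));
  cudaMemcpy(d_in, host_in.data(), n * sizeof(float), cudaMemcpyHostToDevice);

  dim3 block(32, num_warps);  // blockDim.x == 32, so one row is one warp
  block_sum_kernel<<<1, block, num_warps * sizeof(float)>>>(d_in, d_out, n);

  float result = 0.f;
  cudaMemcpy(&result, d_out, sizeof(float), cudaMemcpyDeviceToHost);
  cudaFree(d_in);
  cudaFree(d_out);
  return result;
}
```

Calling `block_sum(std::vector<float>(1024, 1.0f), 4)` with the guard in place should return 1024. Removing the `if (threadIdx.x == 0)` check makes the result depend on which lane's write lands last in each slot.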

menggeliu1205 commented 1 month ago

Thanks for replying and offering the other fix! I get it.