flashinfer-ai / flashinfer

FlashInfer: Kernel Library for LLM Serving
https://flashinfer.ai
Apache License 2.0

bugfix: fix the synchronization issue in distributed operators #290

Closed yzh119 closed 4 weeks ago

yzh119 commented 4 weeks ago

We missed some synchronizations in the all-reduce operators, which sometimes causes deadlocks and incorrect results.
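
For readers unfamiliar with this failure mode, here is a minimal sketch of why a shared-buffer all-reduce needs two synchronization points per round. It is not FlashInfer's code: Python threads stand in for GPU ranks, and the names `all_reduce_sum`, `buffers`, and `rounds` are hypothetical, chosen only for illustration.

```python
import threading


def all_reduce_sum(values, rounds=3):
    """Simulate a shared-buffer all-reduce; each thread plays one rank."""
    world_size = len(values)
    buffers = [0] * world_size                 # one shared slot per rank
    results = [[] for _ in range(world_size)]  # per-rank reduced values
    barrier = threading.Barrier(world_size)

    def worker(rank):
        for r in range(rounds):
            # Publish this rank's contribution for round r.
            buffers[rank] = values[rank] + r
            # Sync 1: every rank has written before any rank reads.
            barrier.wait()
            total = sum(buffers)
            # Sync 2: every rank has read before any rank overwrites
            # its slot in the next round.
            barrier.wait()
            results[rank].append(total)

    threads = [
        threading.Thread(target=worker, args=(rank,))
        for rank in range(world_size)
    ]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return results
```

In this sketch, dropping the first barrier lets a rank sum slots that peers have not yet written, and dropping the second lets a fast rank overwrite its slot while a slow peer is still reading; both can produce incorrect results. If the ranks disagree on how many times they wait on the barrier, some of them block forever, which is the deadlock case.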