vectorch-ai/ScaleLLM
A high-performance inference system for large language models, designed for production environments.
https://docs.vectorch.com/
Apache License 2.0
377 stars
28 forks
refactor: only do sampling in driver worker (rank=0)
#247
Closed
guocuimi closed 3 months ago
guocuimi commented 3 months ago
This PR includes:
- a process group test with a timeout
- sampling only in the driver worker (rank 0)
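
For readers unfamiliar with the pattern, here is a minimal sketch of driver-only sampling. It is not ScaleLLM's actual code: the `Worker` class, `run_step`, and `sample_token` are hypothetical names, and the forward pass is a placeholder that assumes every rank already holds the same full logits. The only idea taken from this PR is that non-driver ranks skip sampling while rank 0 picks the next token.

```python
import math
import random
from typing import List, Optional

DRIVER_RANK = 0  # assumption: the driver worker is rank 0, as in the PR title


def softmax(logits: List[float]) -> List[float]:
    """Numerically stable softmax over a list of logits."""
    peak = max(logits)
    exps = [math.exp(x - peak) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def sample_token(logits: List[float], rng: random.Random) -> int:
    """Draw one token id from the distribution defined by the logits."""
    probs = softmax(logits)
    return rng.choices(range(len(probs)), weights=probs, k=1)[0]


class Worker:
    """Hypothetical stand-in for one tensor-parallel worker."""

    def __init__(self, rank: int, seed: int = 0) -> None:
        self.rank = rank
        self.rng = random.Random(seed)

    def forward(self, logits: List[float]) -> List[float]:
        # Placeholder for the model forward pass; assumes each rank
        # ends up with identical full logits.
        return logits

    def step(self, logits: List[float]) -> Optional[int]:
        out = self.forward(logits)
        if self.rank != DRIVER_RANK:
            # Non-driver ranks do the forward pass but skip sampling.
            return None
        return sample_token(out, self.rng)


def run_step(workers: List[Worker], logits: List[float]) -> List[Optional[int]]:
    """Run one decoding step on every worker and collect the results."""
    return [w.step(logits) for w in workers]
```

With four such workers, `run_step` returns a token id in position 0 and `None` for ranks 1 to 3, so sampling work and any sampling randomness live in one place.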