A high-performance inference system for large language models, designed for production environments.
317 stars · 24 forks
fix: use a proper epsilon to avoid division by zero error for rejection sampler #202
Closed
guocuimi closed 1 month ago