turboderp/exllama
A more memory-efficient rewrite of the HF transformers implementation of Llama for use with quantized weights.
MIT License
2.77k stars, 220 forks
multi stoptoken #283
Closed
Kerushii closed 1 year ago