turboderp/exllama
A more memory-efficient rewrite of the HF transformers implementation of Llama for use with quantized weights.
MIT License
custom stop tokens in generator.py #223
Closed: Kerushii closed this issue 10 months ago
jeffrey-lunaon commented 10 months ago
It helps me 👍