tloen/llama-int8
Quantized inference code for LLaMA models
GNU General Public License v3.0
1.05k stars, 105 forks
Can 65B run on 4*32G GPU? #11
Open. zhongtao93 opened this issue 1 year ago.