tloen / llama-int8

Quantized inference code for LLaMA models
GNU General Public License v3.0
1.05k stars, 105 forks