mit-han-lab / smoothquant

[ICML 2023] SmoothQuant: Accurate and Efficient Post-Training Quantization for Large Language Models
https://arxiv.org/abs/2211.10438
MIT License

How to quantize Llama 3? #92

Open jpyo0803 opened 4 months ago

jpyo0803 commented 4 months ago

Hi,

I am wondering how to quantize Llama-3-8B with SmoothQuant.

Which dataset did you use to generate the activation scales?

Or do you plan to release the act_scales and model weights (to Hugging Face) and a quantized version of the source code (to GitHub) for Llama 3?
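
For context, here is a minimal sketch of what I assume the calibration-plus-smoothing flow would look like, adapted from this repo's `examples/generate_act_scales.py`. The Hugging Face checkpoint name, the Pile validation file path, and Llama-3 support inside `get_act_scales`/`smooth_lm` are all assumptions on my part, not something I have confirmed:

```python
import os

import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

from smoothquant.calibration import get_act_scales
from smoothquant.smooth import smooth_lm

# Assumed HF checkpoint name for Llama-3-8B.
model_name = "meta-llama/Meta-Llama-3-8B"
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForCausalLM.from_pretrained(
    model_name, torch_dtype=torch.float16, device_map="auto"
)

# Collect per-channel activation maxima on a small calibration set.
# The repo's examples use 512 random sentences from the Pile validation
# split (dataset/val.jsonl.zst); I am assuming the same works here.
act_scales = get_act_scales(
    model, tokenizer, "dataset/val.jsonl.zst", num_samples=512, seq_len=512
)
os.makedirs("act_scales", exist_ok=True)
torch.save(act_scales, "act_scales/llama-3-8b.pt")

# Migrate activation outliers into the weights before quantization,
# assuming smooth_lm recognizes the Llama-3 decoder layers.
smooth_lm(model, act_scales, alpha=0.5)
```

Does this match the intended workflow, or does Llama 3 need changes to the calibration/smoothing code?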

Thanks in advance!