IST-DASLab / gptq

Code for the ICLR 2023 paper "GPTQ: Accurate Post-training Quantization of Generative Pretrained Transformers".
https://arxiv.org/abs/2210.17323
Apache License 2.0

Reconstruct Quantized Model Layer in torch. #55

Open puja93 opened 2 months ago

puja93 commented 2 months ago

Hi, after quantizing LLaMA 3, the layers in this checkpoint have expanded:

[screenshot: keys of the quantized checkpoint state_dict]

I can load the original LLaMA 3 just fine using the model definition provided here: https://github.com/meta-llama/llama3/blob/main/llama/model.py, because that checkpoint has the same corresponding layers.

I wonder if you have written a model.py for the quantized LLaMA model.
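For reference, here is a rough sketch of what I imagine such a module would look like. The names (`QuantLinear`, `qweight`, `scales`, `zeros`, `replace_linears`) and the int8 per-channel scheme are my assumptions, not this repo's actual API; the real checkpoints may pack 3/4-bit values into int32 instead.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F


class QuantLinear(nn.Module):
    """Hypothetical drop-in replacement for nn.Linear that holds quantized tensors."""

    def __init__(self, in_features, out_features, bias=True):
        super().__init__()
        self.in_features = in_features
        self.out_features = out_features
        # int8 weights plus per-output-channel scale and zero point
        # (a simplification of whatever packing the checkpoint actually uses).
        self.register_buffer("qweight", torch.zeros(out_features, in_features, dtype=torch.int8))
        self.register_buffer("scales", torch.ones(out_features, 1))
        self.register_buffer("zeros", torch.zeros(out_features, 1))
        if bias:
            self.register_buffer("bias", torch.zeros(out_features))
        else:
            self.bias = None

    def forward(self, x):
        # Dequantize on the fly: w = (q - zero) * scale, then a regular matmul.
        w = ((self.qweight.float() - self.zeros) * self.scales).to(x.dtype)
        b = self.bias.to(x.dtype) if self.bias is not None else None
        return F.linear(x, w, b)


def replace_linears(module):
    """Recursively swap nn.Linear for QuantLinear so the expanded state_dict keys line up."""
    for name, child in module.named_children():
        if isinstance(child, nn.Linear):
            setattr(module, name, QuantLinear(child.in_features, child.out_features,
                                              child.bias is not None))
        else:
            replace_linears(child)
```

Is something along these lines what the quantization script expects when loading the checkpoint back, or is there an official model definition I should use instead?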

Thanks