mit-han-lab / deepcompressor

Model Compression Toolbox for Large Language Models and Diffusion Models
Apache License 2.0

GPTQ LLAMA 2 7B Question #16

Open XiaohanFei opened 3 months ago

XiaohanFei commented 3 months ago

When I run GPTQ W4A16 on LLAMA 2 7B on an A100, I hit this issue. I don't get any error report; the process just dies. Is this a memory issue?

quantizing weights: 78%|█████████████████████████████████████████████████████████████████████████████████▎ | 25/32 [18:16<12:14, 104.91s/it]
collecting calibration activations in model.layers.25: 77%|█████████████████████████████████Killed
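For context: a bare `Killed` with no Python traceback usually means the Linux out-of-memory (OOM) killer terminated the process, i.e. host RAM (not GPU memory) was exhausted while calibration activations were being cached. Below is a minimal sketch for confirming this, using the standard `psutil` package; the `log_rss` helper and where to call it are hypothetical, not part of deepcompressor:

```python
import os

import psutil


def log_rss(tag: str) -> None:
    """Print this process's resident memory and the system's free RAM.

    Hypothetical helper: call it around the calibration loop to see
    whether host RAM climbs toward exhaustion before the OS OOM killer
    sends SIGKILL, which surfaces only as a bare "Killed".
    """
    rss_gib = psutil.Process(os.getpid()).memory_info().rss / 2**30
    avail_gib = psutil.virtual_memory().available / 2**30
    print(f"[{tag}] RSS: {rss_gib:.1f} GiB | available RAM: {avail_gib:.1f} GiB")


log_rss("before collecting calibration activations")
```

After the crash, the host kernel log (`dmesg`) usually records an `Out of memory: Killed process ...` entry that confirms the diagnosis; if host RAM is the bottleneck, reducing the number or sequence length of calibration samples is the usual workaround.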

synxlin commented 1 week ago

Hi, may I know your configuration for running GPTQ quantization?
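For anyone reporting the same failure, here is a small sketch of the environment details worth attaching to such a report (plain PyTorch and `psutil` calls only; nothing here is deepcompressor-specific):

```python
import platform

import psutil
import torch

# Collect the details typically needed to reproduce an OOM during
# quantization: Python/PyTorch versions, GPU model and VRAM, host RAM.
print("python:", platform.python_version())
print("torch:", torch.__version__, "| cuda:", torch.version.cuda)
if torch.cuda.is_available():
    props = torch.cuda.get_device_properties(0)
    print("gpu:", props.name, f"({props.total_memory / 2**30:.0f} GiB VRAM)")
print(f"host RAM: {psutil.virtual_memory().total / 2**30:.0f} GiB")
```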