Hi, thanks for building this wonderful open-source project!
I am using GPTQ to first quantize a llama2-7b-chat-hf model, and it works perfectly. However, when I then use the quantized model for generation:
python generate.py --compile --checkpoint_path checkpoints/meta-llama/Llama-2-7b-chat-hf/model_int4-gptq.g32.pth --prompt "Hello, my name is"
, a size mismatch error occurs:
RuntimeError: Error(s) in loading state_dict for Transformer:
size mismatch for layers.0.feed_forward.w2.weight: copying a param with shape torch.Size([512, 88, 32, 4]) from checkpoint, the shape in current model is torch.Size([512, 86, 32, 4]).
size mismatch for layers.0.feed_forward.w2.scales_and_zeros: copying a param with shape torch.Size([352, 4096, 2]) from checkpoint, the shape in current model is torch.Size([344, 4096, 2]).
size mismatch for layers.1.feed_forward.w2.weight: copying a param with shape torch.Size([512, 88, 32, 4]) from checkpoint, the shape in current model is torch.Size([512, 86, 32, 4]).
size mismatch for layers.1.feed_forward.w2.scales_and_zeros: copying a param with shape torch.Size([352, 4096, 2]) from checkpoint, the shape in current model is torch.Size([344, 4096, 2]).
size mismatch for layers.2.feed_forward.w2.weight: copying a param with shape torch.Size([512, 88, 32, 4]) from checkpoint, the shape in current model is torch.Size([512, 86, 32, 4]).
size mismatch for layers.2.feed_forward.w2.scales_and_zeros: copying a param with shape torch.Size([352, 4096, 2]) from checkpoint, the shape in current model is torch.Size([344, 4096, 2]).
size mismatch for layers.3.feed_forward.w2.weight: copying a param with shape torch.Size([512, 88, 32, 4]) from checkpoint, the shape in current model is torch.Size([512, 86, 32, 4]).
size mismatch for layers.3.feed_forward.w2.scales_and_zeros: copying a param with shape torch.Size([352, 4096, 2]) from checkpoint, the shape in current model is torch.Size([344, 4096, 2]).
size mismatch for layers.4.feed_forward.w2.weight: copying a param with shape torch.Size([512, 88, 32, 4]) from checkpoint, the shape in current model is torch.Size([512, 86, 32, 4]).
size mismatch for layers.4.feed_forward.w2.scales_and_zeros: copying a param with shape torch.Size([352, 4096, 2]) from checkpoint, the shape in current model is torch.Size([344, 4096, 2]).
size mismatch for layers.5.feed_forward.w2.weight: copying a param with shape torch.Size([512, 88, 32, 4]) from checkpoint, the shape in current model is torch.Size([512, 86, 32, 4]).
size mismatch for layers.5.feed_forward.w2.scales_and_zeros: copying a param with shape torch.Size([352, 4096, 2]) from checkpoint, the shape in current model is torch.Size([344, 4096, 2]).
size mismatch for layers.6.feed_forward.w2.weight: copying a param with shape torch.Size([512, 88, 32, 4]) from checkpoint, the shape in current model is torch.Size([512, 86, 32, 4]).
size mismatch for layers.6.feed_forward.w2.scales_and_zeros: copying a param with shape torch.Size([352, 4096, 2]) from checkpoint, the shape in current model is torch.Size([344, 4096, 2]).
size mismatch for layers.7.feed_forward.w2.weight: copying a param with shape torch.Size([512, 88, 32, 4]) from checkpoint, the shape in current model is torch.Size([512, 86, 32, 4]).
size mismatch for layers.7.feed_forward.w2.scales_and_zeros: copying a param with shape torch.Size([352, 4096, 2]) from checkpoint, the shape in current model is torch.Size([344, 4096, 2]).
size mismatch for layers.8.feed_forward.w2.weight: copying a param with shape torch.Size([512, 88, 32, 4]) from checkpoint, the shape in current model is torch.Size([512, 86, 32, 4]).
size mismatch for layers.8.feed_forward.w2.scales_and_zeros: copying a param with shape torch.Size([352, 4096, 2]) from checkpoint, the shape in current model is torch.Size([344, 4096, 2]).
size mismatch for layers.9.feed_forward.w2.weight: copying a param with shape torch.Size([512, 88, 32, 4]) from checkpoint, the shape in current model is torch.Size([512, 86, 32, 4]).
size mismatch for layers.9.feed_forward.w2.scales_and_zeros: copying a param with shape torch.Size([352, 4096, 2]) from checkpoint, the shape in current model is torch.Size([344, 4096, 2]).
size mismatch for layers.10.feed_forward.w2.weight: copying a param with shape torch.Size([512, 88, 32, 4]) from checkpoint, the shape in current model is torch.Size([512, 86, 32, 4]).
size mismatch for layers.10.feed_forward.w2.scales_and_zeros: copying a param with shape torch.Size([352, 4096, 2]) from checkpoint, the shape in current model is torch.Size([344, 4096, 2]).
size mismatch for layers.11.feed_forward.w2.weight: copying a param with shape torch.Size([512, 88, 32, 4]) from checkpoint, the shape in current model is torch.Size([512, 86, 32, 4]).
size mismatch for layers.11.feed_forward.w2.scales_and_zeros: copying a param with shape torch.Size([352, 4096, 2]) from checkpoint, the shape in current model is torch.Size([344, 4096, 2]).
size mismatch for layers.12.feed_forward.w2.weight: copying a param with shape torch.Size([512, 88, 32, 4]) from checkpoint, the shape in current model is torch.Size([512, 86, 32, 4]).
size mismatch for layers.12.feed_forward.w2.scales_and_zeros: copying a param with shape torch.Size([352, 4096, 2]) from checkpoint, the shape in current model is torch.Size([344, 4096, 2]).
size mismatch for layers.13.feed_forward.w2.weight: copying a param with shape torch.Size([512, 88, 32, 4]) from checkpoint, the shape in current model is torch.Size([512, 86, 32, 4]).
size mismatch for layers.13.feed_forward.w2.scales_and_zeros: copying a param with shape torch.Size([352, 4096, 2]) from checkpoint, the shape in current model is torch.Size([344, 4096, 2]).
size mismatch for layers.14.feed_forward.w2.weight: copying a param with shape torch.Size([512, 88, 32, 4]) from checkpoint, the shape in current model is torch.Size([512, 86, 32, 4]).
size mismatch for layers.14.feed_forward.w2.scales_and_zeros: copying a param with shape torch.Size([352, 4096, 2]) from checkpoint, the shape in current model is torch.Size([344, 4096, 2]).
size mismatch for layers.15.feed_forward.w2.weight: copying a param with shape torch.Size([512, 88, 32, 4]) from checkpoint, the shape in current model is torch.Size([512, 86, 32, 4]).
size mismatch for layers.15.feed_forward.w2.scales_and_zeros: copying a param with shape torch.Size([352, 4096, 2]) from checkpoint, the shape in current model is torch.Size([344, 4096, 2]).
size mismatch for layers.16.feed_forward.w2.weight: copying a param with shape torch.Size([512, 88, 32, 4]) from checkpoint, the shape in current model is torch.Size([512, 86, 32, 4]).
size mismatch for layers.16.feed_forward.w2.scales_and_zeros: copying a param with shape torch.Size([352, 4096, 2]) from checkpoint, the shape in current model is torch.Size([344, 4096, 2]).
size mismatch for layers.17.feed_forward.w2.weight: copying a param with shape torch.Size([512, 88, 32, 4]) from checkpoint, the shape in current model is torch.Size([512, 86, 32, 4]).
size mismatch for layers.17.feed_forward.w2.scales_and_zeros: copying a param with shape torch.Size([352, 4096, 2]) from checkpoint, the shape in current model is torch.Size([344, 4096, 2]).
size mismatch for layers.18.feed_forward.w2.weight: copying a param with shape torch.Size([512, 88, 32, 4]) from checkpoint, the shape in current model is torch.Size([512, 86, 32, 4]).
size mismatch for layers.18.feed_forward.w2.scales_and_zeros: copying a param with shape torch.Size([352, 4096, 2]) from checkpoint, the shape in current model is torch.Size([344, 4096, 2]).
size mismatch for layers.19.feed_forward.w2.weight: copying a param with shape torch.Size([512, 88, 32, 4]) from checkpoint, the shape in current model is torch.Size([512, 86, 32, 4]).
size mismatch for layers.19.feed_forward.w2.scales_and_zeros: copying a param with shape torch.Size([352, 4096, 2]) from checkpoint, the shape in current model is torch.Size([344, 4096, 2]).
size mismatch for layers.20.feed_forward.w2.weight: copying a param with shape torch.Size([512, 88, 32, 4]) from checkpoint, the shape in current model is torch.Size([512, 86, 32, 4]).
size mismatch for layers.20.feed_forward.w2.scales_and_zeros: copying a param with shape torch.Size([352, 4096, 2]) from checkpoint, the shape in current model is torch.Size([344, 4096, 2]).
size mismatch for layers.21.feed_forward.w2.weight: copying a param with shape torch.Size([512, 88, 32, 4]) from checkpoint, the shape in current model is torch.Size([512, 86, 32, 4]).
size mismatch for layers.21.feed_forward.w2.scales_and_zeros: copying a param with shape torch.Size([352, 4096, 2]) from checkpoint, the shape in current model is torch.Size([344, 4096, 2]).
size mismatch for layers.22.feed_forward.w2.weight: copying a param with shape torch.Size([512, 88, 32, 4]) from checkpoint, the shape in current model is torch.Size([512, 86, 32, 4]).
size mismatch for layers.22.feed_forward.w2.scales_and_zeros: copying a param with shape torch.Size([352, 4096, 2]) from checkpoint, the shape in current model is torch.Size([344, 4096, 2]).
size mismatch for layers.23.feed_forward.w2.weight: copying a param with shape torch.Size([512, 88, 32, 4]) from checkpoint, the shape in current model is torch.Size([512, 86, 32, 4]).
size mismatch for layers.23.feed_forward.w2.scales_and_zeros: copying a param with shape torch.Size([352, 4096, 2]) from checkpoint, the shape in current model is torch.Size([344, 4096, 2]).
size mismatch for layers.24.feed_forward.w2.weight: copying a param with shape torch.Size([512, 88, 32, 4]) from checkpoint, the shape in current model is torch.Size([512, 86, 32, 4]).
size mismatch for layers.24.feed_forward.w2.scales_and_zeros: copying a param with shape torch.Size([352, 4096, 2]) from checkpoint, the shape in current model is torch.Size([344, 4096, 2]).
size mismatch for layers.25.feed_forward.w2.weight: copying a param with shape torch.Size([512, 88, 32, 4]) from checkpoint, the shape in current model is torch.Size([512, 86, 32, 4]).
size mismatch for layers.25.feed_forward.w2.scales_and_zeros: copying a param with shape torch.Size([352, 4096, 2]) from checkpoint, the shape in current model is torch.Size([344, 4096, 2]).
size mismatch for layers.26.feed_forward.w2.weight: copying a param with shape torch.Size([512, 88, 32, 4]) from checkpoint, the shape in current model is torch.Size([512, 86, 32, 4]).
size mismatch for layers.26.feed_forward.w2.scales_and_zeros: copying a param with shape torch.Size([352, 4096, 2]) from checkpoint, the shape in current model is torch.Size([344, 4096, 2]).
size mismatch for layers.27.feed_forward.w2.weight: copying a param with shape torch.Size([512, 88, 32, 4]) from checkpoint, the shape in current model is torch.Size([512, 86, 32, 4]).
size mismatch for layers.27.feed_forward.w2.scales_and_zeros: copying a param with shape torch.Size([352, 4096, 2]) from checkpoint, the shape in current model is torch.Size([344, 4096, 2]).
size mismatch for layers.28.feed_forward.w2.weight: copying a param with shape torch.Size([512, 88, 32, 4]) from checkpoint, the shape in current model is torch.Size([512, 86, 32, 4]).
size mismatch for layers.28.feed_forward.w2.scales_and_zeros: copying a param with shape torch.Size([352, 4096, 2]) from checkpoint, the shape in current model is torch.Size([344, 4096, 2]).
size mismatch for layers.29.feed_forward.w2.weight: copying a param with shape torch.Size([512, 88, 32, 4]) from checkpoint, the shape in current model is torch.Size([512, 86, 32, 4]).
size mismatch for layers.29.feed_forward.w2.scales_and_zeros: copying a param with shape torch.Size([352, 4096, 2]) from checkpoint, the shape in current model is torch.Size([344, 4096, 2]).
size mismatch for layers.30.feed_forward.w2.weight: copying a param with shape torch.Size([512, 88, 32, 4]) from checkpoint, the shape in current model is torch.Size([512, 86, 32, 4]).
size mismatch for layers.30.feed_forward.w2.scales_and_zeros: copying a param with shape torch.Size([352, 4096, 2]) from checkpoint, the shape in current model is torch.Size([344, 4096, 2]).
size mismatch for layers.31.feed_forward.w2.weight: copying a param with shape torch.Size([512, 88, 32, 4]) from checkpoint, the shape in current model is torch.Size([512, 86, 32, 4]).
size mismatch for layers.31.feed_forward.w2.scales_and_zeros: copying a param with shape torch.Size([352, 4096, 2]) from checkpoint, the shape in current model is torch.Size([344, 4096, 2]).
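In case it helps with triage: the mismatched numbers are consistent with the checkpoint's `in_features` having been padded up to a multiple of 1024 during quantization (11008 → 11264 for `feed_forward.w2`), while the model constructed at generation time keeps the unpadded 11008. A minimal sketch of that arithmetic, assuming a packed int4 layout of `[n/8, k/(inner_k_tiles*16), 32, inner_k_tiles/2]` with `groupsize=32` and `inner_k_tiles=8` (these names and values are my guesses at the convention, not taken from the repo):

```python
# Sketch: reproduce the two shapes in the error from Llama-2-7B's
# feed_forward.w2 dimensions (in_features k = 11008, out_features n = 4096).

def find_multiple(n: int, k: int) -> int:
    """Round n up to the nearest multiple of k."""
    return n if n % k == 0 else n + k - (n % k)

def int4_shapes(k: int, n: int, groupsize: int = 32, inner_k_tiles: int = 8):
    """Packed int4 weight and scales_and_zeros shapes for a k -> n linear
    (assumed layout; parameter names are hypothetical)."""
    weight = (n // 8, k // (inner_k_tiles * 16), 32, inner_k_tiles // 2)
    scales_and_zeros = (k // groupsize, n, 2)
    return weight, scales_and_zeros

k, n = 11008, 4096                 # feed_forward.w2 in Llama-2-7B
padded_k = find_multiple(k, 1024)  # 11264 if the quantizer pads

print(int4_shapes(padded_k, n))  # checkpoint: ((512, 88, 32, 4), (352, 4096, 2))
print(int4_shapes(k, n))         # model:      ((512, 86, 32, 4), (344, 4096, 2))
```

If this is the cause, the quantization and generation paths disagree about whether `in_features` gets padded before packing, so the fix would be to make both sides apply (or skip) the same padding.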
Any solutions to fix or possible clues would be appreciated!