microsoft / TransformerCompression

For releasing code related to compression methods for transformers, accompanying our publications
MIT License

Error when fine-tuning a sliced Llama 3 model #155

Closed: ChrisXULC closed this issue 3 months ago

ChrisXULC commented 3 months ago

RuntimeError: Error(s) in loading state_dict for UninitializedLlamaForCausalLM:
size mismatch for model.embed_tokens.weight: copying a param with shape torch.Size([128256, 2864]) from checkpoint, the shape in current model is torch.Size([128256, 3072]).
size mismatch for model.layers.0.mlp_shortcut_Q: copying a param with shape torch.Size([2864, 2864]) from checkpoint, the shape in current model is torch.Size([3072, 3072]).
size mismatch for model.layers.0.attn_shortcut_Q: copying a param with shape torch.Size([2864, 2864]) from checkpoint, the shape in current model is torch.Size([3072, 3072]).
size mismatch for model.layers.0.self_attn.q_proj.weight: copying a param with shape torch.Size([4096, 2864]) from checkpoint, the shape in current model is torch.Size([4096, 3072]).
size mismatch for model.layers.0.self_attn.k_proj.weight: copying a param with shape torch.Size([1024, 2864]) from checkpoint, the shape in current model is torch.Size([1024, 3072]).
size mismatch for model.layers.0.self_attn.v_proj.weight: copying a param with shape torc
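The shapes in the traceback suggest the checkpoint was produced from a model whose hidden dimension was sliced to 2864, while the model being loaded into was constructed with the original hidden size of 3072. The following is a minimal sketch in plain PyTorch (not this repository's API; `TinyModel` is a hypothetical stand-in) reproducing that class of size-mismatch error:

```python
import torch
import torch.nn as nn


class TinyModel(nn.Module):
    """Hypothetical stand-in with a single embedding, just to show the mismatch."""

    def __init__(self, hidden_size: int, vocab_size: int = 128256):
        super().__init__()
        self.embed_tokens = nn.Embedding(vocab_size, hidden_size)


# Save a checkpoint from a "sliced" model (hidden size 2864) ...
sliced = TinyModel(hidden_size=2864)
torch.save(sliced.state_dict(), "sliced.pt")

# ... then try to load it into a model built with the unsliced hidden size (3072).
unsliced = TinyModel(hidden_size=3072)
try:
    unsliced.load_state_dict(torch.load("sliced.pt"))
except RuntimeError as e:
    # "size mismatch for embed_tokens.weight: copying a param with shape
    #  torch.Size([128256, 2864]) ... current model is torch.Size([128256, 3072])"
    print(e)
```

Under that assumption, the fix would be to construct the model with the same sliced dimensions (i.e. the same sparsity/slicing configuration used when the checkpoint was saved) before calling load_state_dict, rather than with the original Llama 3 hidden size.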