Please check that this issue hasn't been reported before.
[X] I searched previous Bug Reports and didn't find any similar reports.
Expected Behavior
When my model finishes training and I run inference with it, it should load without error.
Current behaviour
My saved model is missing parameters and therefore errors out when loading:
[2024-10-06 21:07:57,939] [ERROR] [axolotl.load_model:808] [PID:45370] [RANK:0] Error(s) in loading state_dict for LlamaForCausalLM:
size mismatch for model.embed_tokens.weight: copying a param with shape torch.Size([131344896]) from checkpoint, the shape in current model is torch.Size([128266, 4096]).
size mismatch for model.norm.weight: copying a param with shape torch.Size([0]) from checkpoint, the shape in current model is torch.Size([4096]).
size mismatch for lm_head.weight: copying a param with shape torch.Size([0]) from checkpoint, the shape in current model is torch.Size([128266, 4096]).
You may consider adding `ignore_mismatched_sizes=True` in the model `from_pretrained` method.
Traceback (most recent call last):
File "/root/axolotl/src/axolotl/utils/models.py", line 710, in load_model
model = AutoModelLoader.from_pretrained(
File "/root/.venv/lib/python3.10/site-packages/transformers/models/auto/auto_factory.py", line 564, in from_pretrained
return model_class.from_pretrained(
File "/root/.venv/lib/python3.10/site-packages/transformers/modeling_utils.py", line 4014, in from_pretrained
) = cls._load_pretrained_model(
File "/root/.venv/lib/python3.10/site-packages/transformers/modeling_utils.py", line 4559, in _load_pretrained_model
raise RuntimeError(f"Error(s) in loading state_dict for {model.__class__.__name__}:\n\t{error_msg}")
RuntimeError: Error(s) in loading state_dict for LlamaForCausalLM:
size mismatch for model.embed_tokens.weight: copying a param with shape torch.Size([131344896]) from checkpoint, the shape in current model is torch.Size([128266, 4096]).
size mismatch for model.norm.weight: copying a param with shape torch.Size([0]) from checkpoint, the shape in current model is torch.Size([4096]).
size mismatch for lm_head.weight: copying a param with shape torch.Size([0]) from checkpoint, the shape in current model is torch.Size([128266, 4096]).
You may consider adding `ignore_mismatched_sizes=True` in the model `from_pretrained` method.
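The failure class is easy to reproduce in isolation: `load_state_dict` raises this exact kind of `RuntimeError` whenever a checkpoint tensor's shape disagrees with the module's parameter shape (here, a flattened 1-D tensor where a `[vocab, hidden]` matrix is expected). A minimal toy sketch, with made-up small dimensions rather than the real model sizes:

```python
# Minimal sketch (not the axolotl code path): shows the class of error in the
# log above. A 1-D checkpoint tensor cannot be copied into a 2-D parameter.
import torch
import torch.nn as nn

# Toy embedding standing in for model.embed_tokens (real shape: [128266, 4096]).
model = nn.Embedding(10, 4)

# Simulate a checkpoint whose weight was saved flattened instead of [10, 4].
bad_state = {"weight": torch.empty(37)}

try:
    model.load_state_dict(bad_state)
except RuntimeError as e:
    # Message reads "size mismatch for weight: copying a param with shape
    # torch.Size([37]) from checkpoint, the shape in current model is
    # torch.Size([10, 4])."
    print("size mismatch" in str(e))
```

`ignore_mismatched_sizes=True`, as the hint suggests, would skip these tensors rather than repair them, so the loaded model would have freshly initialized (useless) weights for those parameters.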
Steps to reproduce
Train a model with my config and any pre-tokenized dataset, then try to run inference on the result.
Config yaml
Possible solution
No response
Which Operating Systems are you using?
Python Version
3.10
axolotl branch-commit
main
Acknowledgements