pytorch-labs / gpt-fast

Simple and efficient pytorch-native transformer text generation in <1000 LOC of python.
BSD 3-Clause "New" or "Revised" License

Error(s) in loading state_dict for Transformer #32

Closed: Nikita-Sherstnev closed this issue 10 months ago

Nikita-Sherstnev commented 10 months ago

I am running the preparation script for CodeLlama: ./scripts/prepare.sh codellama/CodeLlama-13b-Instruct-hf and I get the following error:

RuntimeError: Error(s) in loading state_dict for Transformer:
    size mismatch for tok_embeddings.weight: copying a param with shape torch.Size([32016, 5120]) from checkpoint, the shape in current model is torch.Size([32000, 5120]).
    size mismatch for output.weight: copying a param with shape torch.Size([32016, 5120]) from checkpoint, the shape in current model is torch.Size([32000, 5120]).
Chillee commented 10 months ago

I think model.py just doesn't have a config for CodeLlama-13B. You probably just need to add a config here: https://github.com/pytorch-labs/gpt-fast/blob/main/model.py#L53
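
For illustration, a config entry along these lines could resolve the shape mismatch. This is a hedged sketch, not a verified patch: the key names follow the style of gpt-fast's `transformer_configs` dict in model.py, the vocab size and hidden dim come from the checkpoint shapes in the error above, and the remaining values (layer count, head count, RoPE base) are assumptions about CodeLlama-13B's geometry.

```python
# Hypothetical transformer_configs entry for CodeLlama-13b-Instruct-hf.
# vocab_size and dim are taken from the checkpoint shapes in the error
# ([32016, 5120]); the other values are assumed, not verified.
transformer_configs = {
    "CodeLlama-13b-Instruct-hf": dict(
        vocab_size=32016,   # CodeLlama adds special tokens beyond Llama's 32000
        dim=5120,           # matches tok_embeddings.weight shape [32016, 5120]
        n_layer=40,         # assumed 13B geometry
        n_head=40,          # assumed 13B geometry
        rope_base=1000000,  # CodeLlama uses a larger RoPE base than Llama 2
    ),
}

# With this entry, the embedding and output projection are built as
# [vocab_size, dim] = [32016, 5120], matching the checkpoint instead of
# the default [32000, 5120] that triggered the size-mismatch error.
cfg = transformer_configs["CodeLlama-13b-Instruct-hf"]
expected_shape = (cfg["vocab_size"], cfg["dim"])
print(expected_shape)
```

The model name is matched against the checkpoint path, so the dict key should contain the same name used in the prepare script invocation.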

Nikita-Sherstnev commented 10 months ago

Thank you, this helped.