pytorch-labs / gpt-fast

Simple and efficient pytorch-native transformer text generation in <1000 LOC of python.
BSD 3-Clause "New" or "Revised" License

Error(s) in loading state_dict for Transformer #32

Closed: Nikita-Sherstnev closed this issue 10 months ago

Nikita-Sherstnev commented 10 months ago

I am running the preparation script for CodeLlama: ./scripts/prepare.sh codellama/CodeLlama-13b-Instruct-hf and I get the following error:

RuntimeError: Error(s) in loading state_dict for Transformer:
    size mismatch for tok_embeddings.weight: copying a param with shape torch.Size([32016, 5120]) from checkpoint, the shape in current model is torch.Size([32000, 5120]).
    size mismatch for output.weight: copying a param with shape torch.Size([32016, 5120]) from checkpoint, the shape in current model is torch.Size([32000, 5120]).
Chillee commented 10 months ago

I think model.py just doesn't have a config for CodeLlama-13B. You probably just need to add a config here: https://github.com/pytorch-labs/gpt-fast/blob/main/model.py#L53
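
For illustration, a config entry along these lines could resolve the shape mismatch. This is a hedged sketch, not a verified patch: the key names follow the style of gpt-fast's `transformer_configs` dict in model.py, the vocab size and hidden dim come from the checkpoint shapes in the error above, and the remaining values (layer count, head count, RoPE base) are assumptions about CodeLlama-13B's geometry.

```python
# Hypothetical transformer_configs entry for CodeLlama-13b-Instruct-hf.
# vocab_size and dim are taken from the checkpoint shapes in the error
# ([32016, 5120]); the other values are assumed, not verified.
transformer_configs = {
    "CodeLlama-13b-Instruct-hf": dict(
        vocab_size=32016,   # CodeLlama adds special tokens beyond Llama's 32000
        dim=5120,           # matches tok_embeddings.weight shape [32016, 5120]
        n_layer=40,         # assumed 13B geometry
        n_head=40,          # assumed 13B geometry
        rope_base=1000000,  # CodeLlama uses a larger RoPE base than Llama 2
    ),
}

# With this entry, the embedding and output projection are built as
# [vocab_size, dim] = [32016, 5120], matching the checkpoint instead of
# the default [32000, 5120] that triggered the size-mismatch error.
cfg = transformer_configs["CodeLlama-13b-Instruct-hf"]
expected_shape = (cfg["vocab_size"], cfg["dim"])
print(expected_shape)
```

The model name is matched against the checkpoint path, so the dict key should contain the same name used in the prepare script invocation.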

Nikita-Sherstnev commented 10 months ago

Thank you, this helped.