epfLLM / meditron

Meditron is a suite of open-source medical Large Language Models (LLMs).
https://huggingface.co/epfl-llm
Apache License 2.0
1.85k stars · 169 forks

load model size mismatch error #19

Closed aaronlyt closed 9 months ago

aaronlyt commented 9 months ago

Operations

I downloaded the model files from https://huggingface.co/epfl-llm/meditron-7b/tree/main and then loaded the model with:

model = transformers.AutoModelForCausalLM.from_pretrained(
    './meditron-7b/', trust_remote_code=True, use_cache=True)

This raises the error:

size mismatch for model.embed_tokens.weight: copying a param with shape torch.Size([32000, 4096]) from checkpoint, the shape in current model is torch.Size([32017, 4096]).

Package

transformers version is 4.25.2
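For context, the mismatch in the error is easy to reproduce outside of transformers: meditron-7b's config.json declares vocab_size = 32017 (the base Llama-2 vocabulary of 32000 plus, presumably, added special tokens), so a checkpoint whose embedding matrix has only 32000 rows cannot be loaded into the instantiated model. A minimal sketch in plain PyTorch (the hidden size of 8 is arbitrary, shrunk just to keep the tensors small):

```python
import torch
import torch.nn as nn

# The model is built from config.json, which declares vocab_size = 32017.
vocab_from_config, hidden = 32017, 8  # hidden size shrunk for illustration
model = nn.Embedding(vocab_from_config, hidden)

# A stale or partial checkpoint with only 32000 rows (the base Llama-2 vocab).
checkpoint = {"weight": torch.zeros(32000, hidden)}

try:
    model.load_state_dict(checkpoint)
except RuntimeError as err:
    # load_state_dict raises the same "size mismatch ... copying a param
    # with shape ..." message seen in the issue report.
    print("size mismatch" in str(err))  # prints: True
```

In other words, the error means the weights on disk do not match the config they ship with, which points at a stale or incomplete download rather than a bug in the loading code.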

eric11eca commented 9 months ago

Hi there! Thank you for sharing your error and info with us. I tried to replicate your error, but the shape looks consistent on my end:

model.get_input_embeddings().weight.shape: torch.Size([32017, 4096])
model.get_input_embeddings().embedding_dim: 4096
model.get_input_embeddings().num_embeddings: 32017

Using the following code:

from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "epfl-llm/meditron-7b"

tokenizer = AutoTokenizer.from_pretrained(model_id, use_cache=True)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    use_cache=True,
    trust_remote_code=True,
    device_map="auto",
)

Did you try deleting the HF cache and re-downloading the model weights from HF?
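For reference, the Hub cache stores each repo under a models--&lt;org&gt;--&lt;name&gt; directory, so the stale copy can be removed like this before re-downloading (a sketch assuming the default cache location, i.e. no HF_HOME or HF_HUB_CACHE override):

```shell
# Default HF cache layout (assuming HF_HOME is not overridden):
#   ~/.cache/huggingface/hub/models--<org>--<name>
CACHE_DIR="${HF_HOME:-$HOME/.cache/huggingface}/hub/models--epfl-llm--meditron-7b"
rm -rf "$CACHE_DIR"
```

The next from_pretrained("epfl-llm/meditron-7b") call will then fetch a fresh copy of the weights.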

aaronlyt commented 9 months ago

Thanks for your reply. I re-downloaded the weights using AutoModelForCausalLM.from_pretrained and it works now. Supplementary explanation: previously I had directly downloaded the PyTorch .bin weights from HF and loaded them from the local directory, which did not work.
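One way to catch this kind of stale or partial manual download up front is to inspect the tensor shapes stored in the local .bin files before calling from_pretrained, and compare the embedding row count against vocab_size in config.json (32017 for meditron-7b). A sketch, using a tiny synthetic checkpoint in place of the real 7B shards:

```python
import os
import tempfile

import torch

# Stand-in for a downloaded shard; a real check would iterate over
# every pytorch_model-*.bin file in the snapshot directory.
with tempfile.TemporaryDirectory() as snapshot_dir:
    path = os.path.join(snapshot_dir, "pytorch_model.bin")
    torch.save({"model.embed_tokens.weight": torch.zeros(32017, 8)}, path)

    state = torch.load(path, map_location="cpu")
    rows = state["model.embed_tokens.weight"].shape[0]
    # Should equal vocab_size from config.json (32017 for meditron-7b);
    # 32000 would indicate a stale base-Llama checkpoint.
    print(rows)  # prints: 32017
```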