Closed — aaronlyt closed this issue 9 months ago
Hi there! Thank you for sharing your error and info with us. I tried to replicate your error, but the shape looks consistent on my end:
model.get_input_embeddings().weight.shape: torch.Size([32017, 4096])
model.get_input_embeddings().embedding_dim: 4096
model.get_input_embeddings().num_embeddings: 32017
Using the following code:
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "epfl-llm/meditron-7b"
tokenizer = AutoTokenizer.from_pretrained(model_id, use_cache=True)
model = AutoModelForCausalLM.from_pretrained(
    model_id, use_cache=True,
    trust_remote_code=True,
    device_map="auto")
Did you try deleting the HF cache and re-downloading the model weights from HF?
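In case it is useful, deleting just this model's cached copy can be done like this (a sketch assuming the default cache location, i.e. no HF_HOME or HF_HUB_CACHE override):

```shell
# Default Hugging Face cache layout (assumption: HF_HOME / HF_HUB_CACHE not set).
# Removing the repo folder forces from_pretrained to re-download the weights.
rm -rf ~/.cache/huggingface/hub/models--epfl-llm--meditron-7b
```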
Thanks for your reply. I re-downloaded the weights using AutoModelForCausalLM.from_pretrained, and it works now. Supplementary explanation: my previous operation was to download the PyTorch .bin weights directly from HF and then load them, which did not work.
I downloaded the model files from https://huggingface.co/epfl-llm/meditron-7b/tree/main, then loaded the model using: model = transformers.AutoModelForCausalLM.from_pretrained('./meditron-7b/', trust_remote_code=True, use_cache=True)
and got the error:
size mismatch for model.embed_tokens.weight: copying a param with shape torch.Size([32000, 4096]) from checkpoint, the shape in current model is torch.Size([32017, 4096]).
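For reference, this error can be reproduced in miniature: a checkpoint that stores fewer embedding rows than the model's config declares will fail a strict state_dict load. The sizes below (5 vs. 8 rows) are scaled-down stand-ins for 32000 vs. 32017:

```python
import torch
import torch.nn as nn

# Checkpoint saved with a smaller embedding matrix than the config builds
# (5 and 8 rows stand in for the 32000 vs. 32017 vocab sizes above).
ckpt = {"weight": torch.randn(5, 4)}                    # what the old .bin holds
emb = nn.Embedding(num_embeddings=8, embedding_dim=4)   # what the config expects

try:
    emb.load_state_dict(ckpt)  # strict load -> "size mismatch for weight: ..."
except RuntimeError as e:
    print("size mismatch" in str(e))  # True

# Sketch of one workaround: copy the overlapping rows and leave the extra rows
# at their random init, similar in spirit to resize_token_embeddings.
with torch.no_grad():
    emb.weight[:5] = ckpt["weight"]
print(emb.weight.shape)
```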
package
transformers version is 4.25.2