avishaiElmakies opened this issue 1 day ago
Hi 👋 I am able to reproduce this, checking it now!
Can you please provide the code snippet where you are not seeing any error (without using safetensors)? @avishaiElmakies
There should be a single "error" about lm_head.weight, since the model uses weight tying for the embedding and output layer. Both safetensors and normal loading do this.
The problem is that when using safetensors, the embedding weights seem to be missing, which causes problems for both the embedding layer and the output layer (since they are tied).
Maybe I should have been clearer about that in the bug report (sorry about that).
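For reference, a small sketch (assuming the repo ships a single, unsharded model.safetensors file) that lists which tensors the safetensors checkpoint actually contains:

from huggingface_hub import hf_hub_download
from safetensors import safe_open

# Download the safetensors checkpoint and inspect the stored tensor names.
path = hf_hub_download("facebook/MobileLLM-125M", "model.safetensors")
with safe_open(path, framework="pt") as f:
    keys = set(f.keys())

print("model.embed_tokens.weight" in keys)  # is the embedding actually stored?
print("lm_head.weight" in keys)             # normally absent, since it is tied to the embedding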
System Info

transformers version: 4.46.2

Who can help?

@ArthurZucker

Reproduction
from transformers import AutoModelForCausalLM

mobilellm = AutoModelForCausalLM.from_pretrained("facebook/MobileLLM-125M", trust_remote_code=True)
will output
Some weights of MobileLLMForCausalLM were not initialized from the model checkpoint at facebook/MobileLLM-125M and are newly initialized: ['model.embed_tokens.weight']
and the weights will be random. When passing use_safetensors=False, everything seems to work as expected.
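A minimal comparison sketch (not part of the original report; it assumes the standard get_input_embeddings() accessor works on the remote-code model class) that makes the discrepancy visible:

import torch
from transformers import AutoModelForCausalLM

# Load the same checkpoint once from safetensors and once from the PyTorch .bin weights.
model_st = AutoModelForCausalLM.from_pretrained(
    "facebook/MobileLLM-125M", trust_remote_code=True
)
model_bin = AutoModelForCausalLM.from_pretrained(
    "facebook/MobileLLM-125M", trust_remote_code=True, use_safetensors=False
)

# With the bug, the safetensors embedding is freshly initialized, so this prints False.
print(
    torch.allclose(
        model_st.get_input_embeddings().weight,
        model_bin.get_input_embeddings().weight,
    )
)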
Expected behavior

Using safetensors should work the same as not using them.