Open · kylesayrs opened this issue 1 month ago
Hmm indeed, this issue shouldn't pop up. @ydshieh, if you have the bandwidth, do you mind helping @kylesayrs out?
No clear idea so far, but when I tried with `meta-llama/Llama-2-7b-hf`, all cases pass. Could you share how the original `Xenova/llama2.c-stories15M` was created?
"torch_dtype,tie_word_embeddings,device_map",
[
(torch.float16, True, "cpu" ),
(torch.float16, False, "cpu" ),
(torch.float16, True, "cuda:0" ),
(torch.float16, False, "cuda:0" ),
(torch.float32, True, "cpu" ),
(torch.float32, False, "cpu" ),
(torch.float32, True, "cuda:0"),
(torch.float32, False, "cuda:0"),
],
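Presumably this parametrization decorates a save test along the lines of the `test_model_save` snippet posted later in the thread; a minimal sketch of such a test (the pytest wiring and exact body are my reconstruction, not the original test):

```python
import pytest
import torch
from transformers import AutoModelForCausalLM

# Sketch only: load the checkpoint under each (dtype, tying, device)
# combination and save it back with safetensors. The failing combinations
# presumably error inside save_pretrained.
def test_model_save(torch_dtype, tie_word_embeddings, device_map, tmp_path):
    model = AutoModelForCausalLM.from_pretrained(
        "Xenova/llama2.c-stories15M",
        torch_dtype=torch_dtype,
        tie_word_embeddings=tie_word_embeddings,
        device_map=device_map,
    )
    model.save_pretrained(tmp_path, safe_serialization=True)
```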
@ydshieh I do not know the details of how the model was created, but from the config.json it seems it was saved with `tie_word_embeddings=True`. Other models such as `TinyLlama/TinyLlama-1.1B-intermediate-step-1431k-3T` do not have tied word embeddings and pass the test cases.

Some other models that have the same `tie_word_embeddings=True` in their config, such as `unsloth/Llama-3.2-3B-Instruct`, pass, so it may be an issue with this stories model in particular. This model is not atypical in any way that I know of, but I'll do some more investigation to see if I notice anything different.
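For what it's worth, one quick way to check whether the embeddings are actually tied in memory, as opposed to what the config claims (a sketch; `get_input_embeddings`/`get_output_embeddings` are the standard `PreTrainedModel` accessors):

```python
from transformers import AutoConfig, AutoModelForCausalLM

name = "Xenova/llama2.c-stories15M"
config = AutoConfig.from_pretrained(name)
model = AutoModelForCausalLM.from_pretrained(name)

# Tied weights share a single tensor, so identity (not just equality) should hold.
actually_tied = model.get_output_embeddings().weight is model.get_input_embeddings().weight
print(f"config.tie_word_embeddings={config.tie_word_embeddings}, actually tied={actually_tied}")
```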
Oh, OK. I thought you were the author of `llama2.c-stories15M` 😅 sorry. I'll try to see if I can figure it out next week.

(just notes for myself)
```python
import torch
from transformers import AutoModelForCausalLM

configs = [
    (torch.float32, False, "cpu"),      # fails
    # (torch.float16, True, "cpu"),     # passes
    # (torch.float16, False, "cpu"),    # passes
    # (torch.float32, True, "cpu"),     # passes
    # (torch.float32, False, "cpu"),    # fails
    # (torch.float32, False, "cuda:0"), # passes
]

def test_model_save(torch_dtype, tie_word_embeddings, device_map, tmp_path="./"):
    # Load the checkpoint under the given dtype/tying/device configuration,
    # then save it back with safetensors serialization.
    model = AutoModelForCausalLM.from_pretrained(
        "Xenova/llama2.c-stories15M",
        torch_dtype=torch_dtype,
        tie_word_embeddings=tie_word_embeddings,
        device_map=device_map,
    )
    model.save_pretrained(tmp_path, safe_serialization=True)

for config in configs:
    test_model_save(*config)
```
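To make the comparison explicit, the repro could be extended into a round trip that reloads the saved checkpoint and diffs the state dicts (my addition, not part of the original snippet):

```python
import tempfile

import torch
from transformers import AutoModelForCausalLM

def roundtrip_check(torch_dtype, tie_word_embeddings, device_map):
    model = AutoModelForCausalLM.from_pretrained(
        "Xenova/llama2.c-stories15M",
        torch_dtype=torch_dtype,
        tie_word_embeddings=tie_word_embeddings,
        device_map=device_map,
    )
    with tempfile.TemporaryDirectory() as tmp:
        # For the failing configurations this save call is presumably where
        # things go wrong; for the passing ones, the reload should be lossless.
        model.save_pretrained(tmp, safe_serialization=True)
        reloaded = AutoModelForCausalLM.from_pretrained(tmp, torch_dtype=torch_dtype)
        reloaded_sd = reloaded.state_dict()
        for name, tensor in model.state_dict().items():
            assert torch.equal(tensor.cpu(), reloaded_sd[name].cpu()), name

roundtrip_check(torch.float32, False, "cpu")  # the failing case above
```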
System Info

- platform: Linux (Ubuntu 22.04)
- Python version: 3.10.12
- transformers version: 4.44.2
Who can help?
No response
Expected behavior

I expect `save_pretrained` to have the same behavior regardless of model data type and regardless of device.