katopz opened 3 weeks ago
@katopz Sorry for the delay - will investigate and get back to you!
@danielhanchen I am also encountering a similar problem (with an even simpler way to reproduce it):
from unsloth import FastLanguageModel, is_bfloat16_supported

model, tokenizer = FastLanguageModel.from_pretrained(
    model_name="unsloth/Llama-3.2-1B-Instruct",
    max_seq_length=100,
    dtype=None,
    load_in_4bit=True,
    token=HF_TOKEN,  # HF_TOKEN is my Hugging Face access token
)

model.save_pretrained("./out_trained_models/unsloth_1B_llama", save_safetensors=False)
tokenizer.save_pretrained("./out_trained_models/unsloth_1B_llama", save_safetensors=False)
I get the following error:
RuntimeError:
Some tensors share memory, this will lead to duplicate memory on disk and potential differences when loading them again: [{'model.embed_tokens.weight', 'lm_head.weight'}].
A potential way to correctly save your model is to use `save_model`.
More information at https://huggingface.co/docs/safetensors/torch_shared_tensors
System Settings:
Driver Version: 535.183.01 CUDA Version: 12.2
unsloth @ git+https://github.com/unslothai/unsloth.git@1f52468fa31bf0b641ec96217ef0f5916a07fce5
safetensors==0.4.5
transformers==4.45.2
torch==2.4.1
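
The RuntimeError above is raised by safetensors because Llama 3.2 ties model.embed_tokens.weight and lm_head.weight, and safetensors refuses to write tensors that share memory. Also note that save_safetensors is a Trainer/TrainingArguments flag; the keyword for PreTrainedModel.save_pretrained is safe_serialization, so the call above still goes down the safetensors path. A minimal workaround sketch, assuming the stock transformers / safetensors APIs (Unsloth patches save_pretrained, so the exact behavior there may differ):

# Workaround sketch -- assumes stock transformers 4.45 / safetensors 0.4 APIs;
# Unsloth's patched save_pretrained may accept different keywords.

# Option 1: bypass safetensors and write pytorch_model.bin, which allows tied weights.
model.save_pretrained(
    "./out_trained_models/unsloth_1B_llama",
    safe_serialization=False,  # keyword is safe_serialization, not save_safetensors
)

# Option 2: the helper the error message points to; it drops duplicate views of
# shared tensors before writing the .safetensors file.
from safetensors.torch import save_model
save_model(model, "./out_trained_models/unsloth_1B_llama/model.safetensors")

Either path sidesteps the shared-tensor check; when tie_word_embeddings is set in the config, from_pretrained should re-tie the embeddings on load.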
I tried to upload a safetensors model to HF without success for unsloth/Llama-3.2-3B-Instruct via an example and got an error.
Not sure if there is any workaround for this? Thanks!
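
For the upload case, the same flag applies when pushing to the Hub. A hedged sketch, assuming the stock transformers push_to_hub mixin (the repo id below is a placeholder; Unsloth's own push helpers such as push_to_hub_merged may be a better fit and can take different arguments):

# Upload sketch -- assumes the stock transformers PushToHubMixin API.
model.push_to_hub(
    "your-username/Llama-3.2-3B-Instruct-finetune",  # placeholder repo id
    token=HF_TOKEN,
    safe_serialization=False,  # avoid the shared-tensor safetensors check
)
tokenizer.push_to_hub(
    "your-username/Llama-3.2-3B-Instruct-finetune",
    token=HF_TOKEN,
)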