Open SolshineCode opened 3 hours ago
This also happens with Qwen/Qwen2.5-0.5B
Loading base model: Qwen/Qwen2.5-0.5B
config.json: 100% 681/681 [00:00<00:00, 4.94MB/s]
model.safetensors: 100% 988M/988M [00:06<00:00, 146MB/s]
generation_config.json: 100% 138/138 [00:00<00:00, 1.02MB/s]
Loading models to merge:
Loading models: 50% 1/2 [00:01<00:01, 1.27s/it]
config.json: 100% 729/729 [00:00<00:00, 4.58MB/s]
model.safetensors: 100% 988M/988M [00:04<00:00, 226MB/s]
generation_config.json: 100% 117/117 [00:00<00:00, 710kB/s]
Loading models: 100% 2/2 [00:07<00:00, 3.81s/it]
tokenizer_config.json: 100% 7.23k/7.23k [00:00<00:00, 38.7MB/s]
vocab.json: 100% 2.78M/2.78M [00:00<00:00, 10.6MB/s]
merges.txt: 100% 1.67M/1.67M [00:00<00:00, 23.2MB/s]
tokenizer.json: 100% 7.03M/7.03M [00:00<00:00, 19.3MB/s]
Processing layer norms: 0it [00:00, ?it/s]
Processing embedding layers: 100% 2/2 [00:00<00:00, 19784.45it/s]
Processing linear layers: 100% 169/169 [00:01<00:00, 121.07it/s]
Total number of parameters: 1260786192
Total number of trainable parameters: 537360
Saving merged model to /content/merged_model
Traceback (most recent call last):
  File "/content/DAM/dam/merge.py", line 267, in <module>
    main()
  File "/content/DAM/dam/merge.py", line 252, in main
    merge_models(args.base_model_id,
  File "/content/DAM/dam/merge.py", line 215, in merge_models
    merged_model.save_pretrained(output_path)
  File "/usr/local/lib/python3.10/dist-packages/transformers/modeling_utils.py", line 2793, in save_pretrained
    safe_save_file(shard, os.path.join(save_directory, shard_file), metadata={"format": "pt"})
  File "/usr/local/lib/python3.10/dist-packages/safetensors/torch.py", line 286, in save_file
    serialize_file(_flatten(tensors), filename, metadata=metadata)
  File "/usr/local/lib/python3.10/dist-packages/safetensors/torch.py", line 488, in _flatten
    raise RuntimeError(
RuntimeError:
Some tensors share memory, this will lead to duplicate memory on disk and potential differences when loading them again: [{'lm_head.weights.0', 'model.embed_tokens.embeddings.0'}, {'model.embed_tokens.embeddings.1', 'lm_head.weights.1'}].
A potential way to correctly save your model is to use `save_model`.
More information at https://huggingface.co/docs/safetensors/torch_shared_tensors
Description: I'm encountering an error while trying to merge models using the `merge.py` script; the issue occurs when running the merge command in a notebook. The process loads the models and processes the layers correctly, but when it attempts to save the merged model, a `RuntimeError` is raised because some tensors share memory.

Log Output: see the detailed log above.
Reproduction Steps:
1. Run the `merge.py` script with the parameters shown in the log above.
2. The error is raised at the `save_pretrained()` call while the merged model is being saved.

Expected Behavior: The merged model should save correctly without errors.
Actual Behavior: The process fails during the save step because the merged model contains tensors that share memory. The error message suggests using `save_model` to handle shared tensors more appropriately.

Troubleshooting Attempts:
- Tried `torch.save()` instead of `safetensors`, which worked for saving but doesn't resolve the root issue in `merge.py`.
- Identified that `lm_head.weights` and `model.embed_tokens.embeddings` (the pairs named in the error message) appear to be the shared tensors causing the problem.

Request for Help:
Any guidance or suggestions to resolve this issue would be greatly appreciated!
Thank you for your time and help!