meta-llama / llama-stack

Composable building blocks to build Llama Apps

Are there any available tools that can convert the original .pth to safetensors? #191

Open · Itime-ren opened 1 month ago

Itime-ren commented 1 month ago

Are there any tools available that can convert the original .pth model files downloaded from Meta into a format usable by Llama Stack, or convert them to the .safetensors format? I tried the conversion script at https://github.com/huggingface/transformers/blob/main/src/transformers/models/llama/convert_llama_weights_to_hf.py, but it threw an error during execution.

raghotham commented 1 month ago

Can you share the errors you are seeing? An alternative is to download the corresponding safetensors versions that we upload to Hugging Face.
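
For reference, those weights can be fetched with `huggingface_hub` while skipping the original .pth files. A minimal sketch; the repo id and file patterns below are assumptions, so check the actual model card (gated Meta repos also require `huggingface-cli login` first):

```python
from huggingface_hub import snapshot_download

# Pull only the safetensors weights plus configs/tokenizer files,
# skipping the large original .pth checkpoints.
snapshot_download(
    repo_id="meta-llama/Llama-3.1-8B",  # assumed repo id; verify on the Hub
    allow_patterns=["*.safetensors", "*.json", "tokenizer.*"],
    local_dir="/Data_disk/meta_llama/safetensors/meta_llama3.1/llama3.1-8B",
)
```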

Itime-ren commented 1 month ago

> Can you share the errors you are seeing? An alternative is to download the corresponding safetensors versions that we upload to Hugging Face.

https://github.com/huggingface/transformers/issues/33791

```
python3 /home/transformers/src/transformers/models/llama/convert_llama_weights_to_hf.py \
    --input_dir /Data_disk/meta_llama/meta_llama3.1/Meta-Llama-3.1-8B \
    --model_size 8B \
    --output_dir /Data_disk/meta_llama/safetensors/meta_llama3.1/llama3.1-8B
```

```
You are using the default legacy behaviour of the <class 'transformers.models.llama.tokenization_llama_fast.LlamaTokenizerFast'>. This is expected, and simply means that the legacy (previous) behavior will be used so nothing changes for you. If you want to use the new behaviour, set legacy=False. This should only be set if you understand what it means, and thoroughly read the reason why this was added as explained in https://github.com/huggingface/transformers/pull/24565 - if you loaded a llama tokenizer from a GGUF file you can ignore this message.
Saving a LlamaTokenizerFast to /Data_disk/meta_llama/safetensors/meta_llama3.1/llama3.1-8B.
Fetching all parameters from the checkpoint at /Data_disk/meta_llama/meta_llama3.1/Meta-Llama-3.1-8B.
/home/transformers/src/transformers/models/llama/convert_llama_weights_to_hf.py:157: FutureWarning: You are using torch.load with weights_only=False (the current default value), which uses the default pickle module implicitly. It is possible to construct malicious pickle data which will execute arbitrary code during unpickling (See https://github.com/pytorch/pytorch/blob/main/SECURITY.md#untrusted-models for more details). In a future release, the default value for weights_only will be flipped to True. This limits the functions that could be executed during unpickling. Arbitrary objects will no longer be allowed to be loaded via this mode unless they are explicitly allowlisted by the user via torch.serialization.add_safe_globals. We recommend you start setting weights_only=True for any use case where you don't have full control of the loaded file. Please open an issue on GitHub for any issues related to this experimental feature.
  loaded = torch.load(os.path.join(input_base_path, "consolidated.00.pth"), map_location="cpu")
Loading the checkpoint in a Llama model.
Loading checkpoint shards:  79%|██████████████████████▎      | 26/33 [00:04<00:01, 6.36it/s]
Traceback (most recent call last):
  File "/home/transformers/src/transformers/models/llama/convert_llama_weights_to_hf.py", line 478, in <module>
    main()
  File "/home/transformers/src/transformers/models/llama/convert_llama_weights_to_hf.py", line 465, in main
    write_model(
  File "/home/transformers/src/transformers/models/llama/convert_llama_weights_to_hf.py", line 324, in write_model
    model = LlamaForCausalLM.from_pretrained(tmp_model_path, torch_dtype=torch.bfloat16, low_cpu_mem_usage=True)
  File "/usr/local/lib/python3.10/dist-packages/transformers/modeling_utils.py", line 4014, in from_pretrained
    ) = cls._load_pretrained_model(
  File "/usr/local/lib/python3.10/dist-packages/transformers/modeling_utils.py", line 4502, in _load_pretrained_model
    new_error_msgs, offload_index, state_dict_index = _load_state_dict_into_meta_model(
  File "/usr/local/lib/python3.10/dist-packages/transformers/modeling_utils.py", line 973, in _load_state_dict_into_meta_model
    set_module_tensor_to_device(model, param_name, param_device, **set_module_kwargs)
  File "/usr/local/lib/python3.10/dist-packages/accelerate/utils/modeling.py", line 373, in set_module_tensor_to_device
    raise ValueError(
ValueError: Trying to set a tensor of shape torch.Size([128256, 4096]) in "weight" (which has shape torch.Size([128003, 4096])), this looks incorrect.
```
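
The mismatch (checkpoint embedding of shape [128256, 4096] vs. a model built with [128003, 4096]) suggests the script instantiated the model with the wrong vocabulary size for a Llama 3.1 checkpoint. Recent revisions of the script accept a `--llama_version` argument, so passing `--llama_version 3.1` may resolve this if your transformers checkout supports it; that is an assumption to verify against the script's argument parser.

Separately, if the immediate goal is just a .safetensors file rather than a Hugging Face-layout checkpoint, Meta's consolidated .pth can be dumped directly. A minimal sketch, assuming the checkpoint is a plain tensor dict (paths are placeholders):

```python
import torch
from safetensors.torch import save_file

# Assumed path to Meta's original consolidated checkpoint.
ckpt = "/Data_disk/meta_llama/meta_llama3.1/Meta-Llama-3.1-8B/consolidated.00.pth"

# weights_only=True avoids the pickle warning seen in the log above;
# Meta's checkpoints are plain tensor dicts, so this should load cleanly.
state_dict = torch.load(ckpt, map_location="cpu", weights_only=True)

# safetensors rejects shared or non-contiguous storage, so give every
# tensor its own contiguous buffer before saving.
state_dict = {k: v.contiguous().clone() for k, v in state_dict.items()}

save_file(state_dict, "consolidated.00.safetensors")
```

Note that this keeps Meta's original parameter layout, so transformers cannot load the result directly; producing a Hugging Face-compatible checkpoint still requires the conversion script.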