horseee / LLM-Pruner

[NeurIPS 2023] LLM-Pruner: On the Structural Pruning of Large Language Models. Support Llama-3/3.1, Llama-2, LLaMA, BLOOM, Vicuna, Baichuan, TinyLlama, etc.
https://arxiv.org/abs/2305.11627
Apache License 2.0

I tried the Mistral 7B model, but I got this issue #56

Open TejasLidhure opened 7 months ago

TejasLidhure commented 7 months ago

LOGS:

```
You are using a model of type mistral to instantiate a model of type llama. This is not supported for all configurations of models and can yield errors.
Loading checkpoint shards:   0%|          | 0/3 [00:00<?, ?it/s]
Traceback (most recent call last):
  File "/root/whiz/llmpruner/LLM-Pruner/hf_prune.py", line 314, in <module>
    main(args)
  File "/root/whiz/llmpruner/LLM-Pruner/hf_prune.py", line 40, in main
    model = LlamaForCausalLM.from_pretrained(
  File "/root/miniconda3/envs/llmpruner/lib/python3.9/site-packages/transformers/modeling_utils.py", line 3531, in from_pretrained
    ) = cls._load_pretrained_model(
  File "/root/miniconda3/envs/llmpruner/lib/python3.9/site-packages/transformers/modeling_utils.py", line 3958, in _load_pretrained_model
    new_error_msgs, offload_index, state_dict_index = _load_state_dict_into_meta_model(
  File "/root/miniconda3/envs/llmpruner/lib/python3.9/site-packages/transformers/modeling_utils.py", line 812, in _load_state_dict_into_meta_model
    set_module_tensor_to_device(model, param_name, param_device, **set_module_kwargs)
  File "/root/miniconda3/envs/llmpruner/lib/python3.9/site-packages/accelerate/utils/modeling.py", line 348, in set_module_tensor_to_device
    raise ValueError(
ValueError: Trying to set a tensor of shape torch.Size([1024, 4096]) in "weight" (which has shape torch.Size([4096, 4096])), this look incorrect.
```
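A likely explanation for the shape mismatch (not confirmed by the maintainers in this thread): `hf_prune.py` loads the checkpoint with `LlamaForCausalLM`, but Mistral-7B uses grouped-query attention (GQA) with 8 key/value heads, while Llama-2-7B uses full multi-head attention with 32. The k/v projection weights are therefore smaller in Mistral, which matches the `[1024, 4096]` vs `[4096, 4096]` tensors in the error. A minimal sketch of the shape arithmetic, assuming the published Mistral-7B and Llama-2-7B configs:

```python
# Sketch: why Mistral-7B's k_proj weight does not fit the Llama layout.
# Config values below are the published ones for Mistral-7B / Llama-2-7B.

hidden_size = 4096                              # same in both models
num_attention_heads = 32                        # query heads, same in both
head_dim = hidden_size // num_attention_heads   # 128

# Llama-2-7B: one key/value head per query head (standard MHA)
llama_kv_heads = 32
llama_k_proj_shape = (llama_kv_heads * head_dim, hidden_size)

# Mistral-7B: 8 key/value heads shared across the 32 query heads (GQA)
mistral_kv_heads = 8
mistral_k_proj_shape = (mistral_kv_heads * head_dim, hidden_size)

print(llama_k_proj_shape)    # (4096, 4096) -- what LlamaForCausalLM allocates
print(mistral_k_proj_shape)  # (1024, 4096) -- what the Mistral checkpoint contains
```

So the failure happens before any pruning: the checkpoint tensor `[1024, 4096]` cannot be copied into the `[4096, 4096]` parameter that the Llama model class allocated. Supporting Mistral would require the pruner to build the model from its own config (e.g. via `AutoModelForCausalLM`) and handle GQA head groups, rather than hard-coding the Llama architecture.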