horseee / LLM-Pruner

[NeurIPS 2023] LLM-Pruner: On the Structural Pruning of Large Language Models. Supports Llama-3/3.1, Llama-2, LLaMA, BLOOM, Vicuna, Baichuan, TinyLlama, etc.
https://arxiv.org/abs/2305.11627
Apache License 2.0

Llama3 reports shape error after pruning #69

Open WentaoTan opened 3 months ago

WentaoTan commented 3 months ago

The command I run:

```bash
python llama3.py --pruning_ratio 0.25 \
    --device cuda --eval_device cuda \
    --base_model home/Meta-Llama-3-8B \
    --block_wise --block_mlp_layer_start 4 --block_mlp_layer_end 30 \
    --block_attention_layer_start 4 --block_attention_layer_end 30 \
    --save_ckpt_log_name llama3_prune \
    --pruner_type taylor --taylor param_first \
    --max_seq_len 2048 \
    --test_after_train --test_before_train --save_model
```

When execution reaches line 259 of the code, an error occurs ([screenshot of the shape-mismatch traceback]). How can I solve this problem? Looking forward to your reply!

nagbhat25 commented 3 months ago

+1, I see the same issue too. I tried multiple settings but had no luck. Any help would be appreciated.

xwang365 commented 3 months ago

+1, got the same error.

SunnyGJing commented 3 months ago

+1, got the same error.

WentaoTan commented 3 months ago

Has anyone solved this problem?

VincentZ-2020 commented 2 months ago

The line `attn_output = attn_output.reshape(bsz, q_len, self.hidden_size)` can be modified to `attn_output = attn_output.reshape(bsz, q_len, -1)` to resolve the issue.
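For context, here is a standalone toy reproduction of why the hard-coded reshape breaks after structural pruning (the head count and sizes below are illustrative assumptions, not the actual pruned dimensions):

```python
import torch

bsz, q_len = 2, 8
hidden_size = 4096                # original Llama-3-8B width
pruned_heads, head_dim = 24, 128  # e.g. after some heads are pruned away

# attn_output as it looks inside LlamaAttention.forward after pruning:
attn_output = torch.randn(bsz, pruned_heads, q_len, head_dim)
attn_output = attn_output.transpose(1, 2).contiguous()

# The hard-coded reshape fails: 24 * 128 = 3072 != 4096
# attn_output.reshape(bsz, q_len, hidden_size)  # RuntimeError: invalid shape

# Inferring the last dimension matches whatever width survived pruning:
out = attn_output.reshape(bsz, q_len, -1)
print(out.shape)  # torch.Size([2, 8, 3072])
```

Since LLM-Pruner prunes dependent layers together, `o_proj` should accept the inferred width.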

PureEidolon commented 1 month ago

Got the same error.

PureEidolon commented 1 month ago

> The line `attn_output = attn_output.reshape(bsz, q_len, self.hidden_size)` can be modified to `attn_output = attn_output.reshape(bsz, q_len, -1)` to resolve the issue.

I tried this method, but it doesn't work.

In the `modeling_llama.py` file I downloaded, the source code is already:

```python
attn_output = attn_output.reshape(bsz, q_len, -1)
```
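This suggests the two reports may simply reflect different installed transformers versions: some copies already use the `-1` reshape, while others still hard-code `self.hidden_size`. A quick way to check which `modeling_llama.py` Python actually imports and which variant it contains (a small sketch; the output depends on your installed transformers version):

```python
import inspect
import transformers.models.llama.modeling_llama as modeling_llama

# Path of the modeling_llama.py that is actually being imported
print(modeling_llama.__file__)

# Print every reshape line in the installed LlamaAttention.forward
src = inspect.getsource(modeling_llama.LlamaAttention.forward)
for line in src.splitlines():
    if "reshape" in line:
        print(line.strip())
```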