[NeurIPS 2023] LLM-Pruner: On the Structural Pruning of Large Language Models. Support Llama-3/3.1, Llama-2, LLaMA, BLOOM, Vicuna, Baichuan, TinyLlama, etc.
The command I run:
```shell
python llama3.py --pruning_ratio 0.25 \
--device cuda --eval_device cuda \
--base_model home/Meta-Llama-3-8B \
--block_wise --block_mlp_layer_start 4 --block_mlp_layer_end 30 \
--block_attention_layer_start 4 --block_attention_layer_end 30 \
--save_ckpt_log_name llama3_prune \
--pruner_type taylor --taylor param_first \
--max_seq_len 2048 \
--test_after_train --test_before_train --save_model
```
When execution reaches line 259 of the code, an error occurs.
How can I solve this problem? Looking forward to your reply!