I get the same error with multi-GPU inference under both ChatGLM and LLaMA efficient_tuning, whether I launch with deepspeed or with accelerate:

```bash
accelerate launch \
    ./LLaMA-Efficient-Tuning/src/train_bash.py \
    --max_samples 50 \
    --model_name_or_path "Llama-2-13B-fp16/" \
    --do_predict \
    --dataset alpaca_zh \
    --dataset_dir "LLaMA-Efficient-Tuning/data" \
    --finetuning_type lora \
    --output_dir Efficient_Tuning/llama2-13b \
    --per_device_eval_batch_size 4 \
    --predict_with_generate \
    --fp16
```

Waiting for a fix.
At the moment, `do_predict` only supports a single GPU.
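If single-GPU is the only supported mode for now, a possible workaround (untested, simply adapting the command from the report above: pin one GPU and launch with plain `python` instead of `accelerate launch`) would look like this:

```bash
# Hypothetical single-GPU predict run, adapted from the reporter's command.
# Depending on the repo version, an additional --stage sft flag may be required.
CUDA_VISIBLE_DEVICES=0 python ./LLaMA-Efficient-Tuning/src/train_bash.py \
    --max_samples 50 \
    --model_name_or_path "Llama-2-13B-fp16/" \
    --do_predict \
    --dataset alpaca_zh \
    --dataset_dir "LLaMA-Efficient-Tuning/data" \
    --finetuning_type lora \
    --output_dir Efficient_Tuning/llama2-13b \
    --per_device_eval_batch_size 4 \
    --predict_with_generate \
    --fp16
```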
LLaMA-Efficient-Tuning used to support this; the code seems to pick up more bugs with every update.
That's right, the June version could run multi-GPU with `accelerate launch`.
@kuailehaha please pull the latest code and try again.
`RuntimeError: Tensors must be contiguous` occurs when `per_device_eval_batch_size > 1`. cmd:

```bash
deepspeed --include localhost:0,1,2,3,4,5,6,7 --master_port $MASTER_PORT src/train_bash.py \
    --stage sft \
    --model_name_or_path THUDM/chatglm2-6b \
    --checkpoint_dir ${CHECKPOINT} \
    --do_predict \
    --dataset dev_data \
    --overwrite_cache \
    --finetuning_type lora \
    --output_dir ${CHECKPOINT}/predict \
    --per_device_eval_batch_size 4 \
    --max_source_length 1024 \
    --max_target_length 128 \
    --max_samples 1000 \
    --predict_with_generate \
    --plot_loss \
    --fp16
```
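In case it helps while multi-GPU predict is unsupported: since the error is only reported with `per_device_eval_batch_size > 1`, one possible (untested) workaround is to run prediction on a single GPU with batch size 1, keeping the rest of the flags from the command above unchanged:

```bash
# Possible workaround sketch, not a fix for the underlying contiguity error:
# restrict deepspeed to one GPU and force the eval batch size down to 1.
deepspeed --include localhost:0 --master_port $MASTER_PORT src/train_bash.py \
    --stage sft \
    --model_name_or_path THUDM/chatglm2-6b \
    --checkpoint_dir ${CHECKPOINT} \
    --do_predict \
    --dataset dev_data \
    --finetuning_type lora \
    --output_dir ${CHECKPOINT}/predict \
    --per_device_eval_batch_size 1 \
    --max_source_length 1024 \
    --max_target_length 128 \
    --max_samples 1000 \
    --predict_with_generate \
    --fp16
```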