Open · nicole828 opened this issue 1 year ago
Hi, I get an error when running single-node multi-GPU training. The command is:

```sh
CUDA_VISIBLE_DEVICES=0,1,2,3 python src/train_bash.py \
    --stage sft \
    --model_name_or_path path_to_your_model \
    --do_train \
    --dataset alpaca_gpt4_zh \
    --template chatml \
    --finetuning_type lora \
    --output_dir path_to_sft_checkpoint \
    --overwrite_cache \
    --per_device_train_batch_size 4 \
    --gradient_accumulation_steps 4 \
    --lr_scheduler_type cosine \
    --logging_steps 10 \
    --save_steps 1000 \
    --learning_rate 5e-5 \
    --num_train_epochs 3.0 \
    --plot_loss \
    --lora_target c_attn \
    --fp16
```

The error is:

```
RuntimeError: Expected all tensors to be on the same device, but found at least two devices, cuda:0 and cuda:1! (when checking argument for argument index in method wrapper_CUDA__index_select)
```

Is single-node multi-GPU training supported?
Try putting the command into a shell script.
改成shell脚本试了呢,还是同样的错误。
There is probably a bug in the code.
Consider specifying device_map="auto" in AutoModelForCausalLM.from_pretrained.
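A minimal sketch of what that change would look like (this is not the repository's actual loading code; `path_to_your_model` is just the placeholder from the command above, and `device_map="auto"` requires the accelerate package to be installed):

```python
# Hedged sketch: passing device_map="auto" asks transformers/accelerate to
# shard the model's layers across the visible GPUs (cuda:0 ... cuda:3)
# instead of loading everything onto a single device.
from transformers import AutoModelForCausalLM, AutoTokenizer

model_path = "path_to_your_model"  # placeholder path from the command above
tokenizer = AutoTokenizer.from_pretrained(model_path, trust_remote_code=True)
model = AutoModelForCausalLM.from_pretrained(
    model_path,
    device_map="auto",        # requires `pip install accelerate`
    torch_dtype="auto",
    trust_remote_code=True,
)
```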