/lib/python3.10/site-packages/trl/core.py", line 139, in logprobs_from_logits
logpy = torch.gather(logp, 2, labels.unsqueeze(2)).squeeze(-1)
RuntimeError: Expected all tensors to be on the same device, but found at least two devices, cuda:7 and cuda:0! (when checking argument for argument index in method wrapper_CUDA_gather)
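For context, torch.gather requires the input tensor and the index tensor to live on the same device. A minimal sketch of the failure mode (the device indices are illustrative, and reproducing it needs at least two GPUs):

import torch

logp = torch.randn(1, 4, 32000, device="cuda:1")               # log-probs on one GPU
labels = torch.zeros(1, 4, dtype=torch.long, device="cuda:0")  # labels on another
torch.gather(logp, 2, labels.unsqueeze(2)).squeeze(-1)
# RuntimeError: Expected all tensors to be on the same device ...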
The policy model's device_map is set to "auto", while the reward model sits on cuda:0 by default:
model = AutoModelForCausalLMWithValueHead.from_pretrained(
args.model_name_or_path,
config=config,
torch_dtype=torch_dtype,
load_in_4bit=args.load_in_4bit,
load_in_8bit=args.load_in_8bit,
device_map=args.device_map,
trust_remote_code=args.trust_remote_code,
peft_config=peft_config if args.use_peft else None,
)
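With device_map="auto", accelerate shards the layers across all visible GPUs, so the lm_head (and therefore the logits) can end up on the last device. One way to inspect the placement (a sketch; models dispatched by accelerate expose an hf_device_map dict, and TRL stores the wrapped base model as pretrained_model):

print(model.pretrained_model.hf_device_map)
# e.g. {'model.embed_tokens': 0, ..., 'lm_head': 7} -> logits come out on cuda:7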
The reward model is placed on cuda:0 by default:

reward_model = AutoModelForSequenceClassification.from_pretrained(
    args.reward_model_name_or_path,
    config=reward_config,
    load_in_8bit=args.load_in_8bit,
    trust_remote_code=args.trust_remote_code,
)
reward_model.to(device)
With the policy model's device_map set to "auto" and the reward model on cuda:0, the tensors end up on different GPUs at compute time. Has anyone else run into this? It seems bound to happen on any multi-GPU setup.
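A minimal sketch of one possible workaround, reusing the arguments from the snippet above: pin the whole policy model to a single GPU instead of sharding it with "auto", so its logits land on the same device as the reward model. (Moving the labels/reward tensors to the logits' device before the PPO step should also work.)

# Sketch, not a confirmed fix: keep every layer of the policy model on cuda:0
# so logits, labels, and reward scores all share one device.
model = AutoModelForCausalLMWithValueHead.from_pretrained(
    args.model_name_or_path,
    config=config,
    torch_dtype=torch_dtype,
    load_in_4bit=args.load_in_4bit,
    load_in_8bit=args.load_in_8bit,
    device_map={"": 0},  # instead of "auto": no cross-GPU sharding
    trust_remote_code=args.trust_remote_code,
    peft_config=peft_config if args.use_peft else None,
)
reward_model.to("cuda:0")  # matches the reward model's default placement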