shibing624 / MedicalGPT

MedicalGPT: Training Your Own Medical GPT Model with ChatGPT Training Pipeline. Trains medical large language models, implementing incremental pretraining (PT), supervised fine-tuning (SFT), RLHF, DPO, and ORPO.
Apache License 2.0
3.21k stars, 488 forks

RL stage fails at runtime with ValueError: Got unexpected arguments: {'token_type_ids': tensor #166

Closed: izhaomeng closed this issue 1 year ago

izhaomeng commented 1 year ago

Describe the bug


2023-08-16 15:17:05.558 | DEBUG | __main__:main:388 - Num train_samples: 2000
2023-08-16 15:17:06.959 | INFO | __main__:main:438 - Train
0it [00:34, ?it/s]
Traceback (most recent call last):
  File "/home/ma-user/work/MedicalGPT/rl_training.py", line 481, in <module>
    main()
  File "/home/ma-user/work/MedicalGPT/rl_training.py", line 459, in main
    score_outputs = [
  File "/home/ma-user/work/MedicalGPT/rl_training.py", line 460, in <listcomp>
    get_reward_model_output(reward_model, reward_tokenizer, q, r, device) for q, r in
  File "/home/ma-user/work/MedicalGPT/rl_training.py", line 196, in get_reward_model_output
    score = reward_model(**inputs).logits[0].cpu().detach()
  File "/home/ma-user/anaconda3/envs/torch2.0-cu117/lib/python3.9/site-packages/torch/nn/modules/module.py", line 1501, in _call_impl
    return forward_call(*args, **kwargs)
  File "/home/ma-user/anaconda3/envs/torch2.0-cu117/lib/python3.9/site-packages/transformers/models/bloom/modeling_bloom.py", line 1041, in forward
    raise ValueError(f"Got unexpected arguments: {deprecated_arguments}")
ValueError: Got unexpected arguments: {'token_type_ids': tensor([[0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1, 1, 1, 1, 1, 1]], device='cuda:0')}
ERROR:torch.distributed.elastic.multiprocessing.api:failed (exitcode: 1) local_rank: 0 (pid: 2187053) of binary: /home/ma-user/anaconda3/envs/torch2.0-cu117/bin/python3.9

Pzeyang commented 1 year ago

[screenshot] An additional reward_config is required.
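For readers who hit the same traceback: the error is raised because the tokenizer output contains `token_type_ids`, which the Bloom `forward` in the traceback rejects as an unexpected argument. The snippet below is a minimal sketch of one possible workaround, not the fix confirmed in this thread. The helper name `drop_unsupported_keys` and the sample `encoded` dict are hypothetical, and plain lists stand in for tensors so the example runs without any dependencies.

```python
def drop_unsupported_keys(inputs, unsupported=("token_type_ids",)):
    """Return a copy of the tokenizer output without keys the model rejects.

    `inputs` is any mapping with .items(), such as a dict of tensors.
    """
    return {k: v for k, v in inputs.items() if k not in unsupported}


# Hypothetical tokenizer output; lists stand in for torch tensors.
encoded = {
    "input_ids": [[1, 2, 3]],
    "attention_mask": [[1, 1, 1]],
    "token_type_ids": [[0, 0, 1]],
}

model_inputs = drop_unsupported_keys(encoded)
print(sorted(model_inputs))

# Assumption: in rl_training.py the filtered mapping would then be passed on,
# e.g. reward_model(**model_inputs), in place of reward_model(**inputs).
```

An alternative that may be simpler is to pass `return_token_type_ids=False` when calling the reward tokenizer, so the key is never produced. Whether either change fits the repository's intended configuration is not stated in this thread.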