shibing624 / MedicalGPT

MedicalGPT: Training Your Own Medical GPT Model with ChatGPT Training Pipeline. Trains medical large language models, implementing continued pretraining (PT), supervised fine-tuning (SFT), RLHF, DPO, and ORPO.
Apache License 2.0
2.94k stars 451 forks

DPO training error #342

Closed cove1011 closed 3 months ago

cove1011 commented 3 months ago

  File "F:\xiazai\MedicalGPT-main\dpo_training.py", line 497, in <module>
    main()
  File "F:\xiazai\MedicalGPT-main\dpo_training.py", line 472, in main
    train_result = trainer.train()
  File "C:\Users\admin\.conda\envs\newrlhf\lib\site-packages\transformers\trainer.py", line 1624, in train
    return inner_training_loop(
  File "C:\Users\admin\.conda\envs\newrlhf\lib\site-packages\transformers\trainer.py", line 1928, in _inner_training_loop
    for step, inputs in enumerate(epoch_iterator):
  File "C:\Users\admin\.conda\envs\newrlhf\lib\site-packages\accelerate\data_loader.py", line 452, in __iter__
    current_batch = next(dataloader_iter)
  File "C:\Users\admin\.conda\envs\newrlhf\lib\site-packages\torch\utils\data\dataloader.py", line 631, in __next__
    data = self._next_data()
  File "C:\Users\admin\.conda\envs\newrlhf\lib\site-packages\torch\utils\data\dataloader.py", line 675, in _next_data
    data = self._dataset_fetcher.fetch(index)  # may raise StopIteration
  File "C:\Users\admin\.conda\envs\newrlhf\lib\site-packages\torch\utils\data\_utils\fetch.py", line 54, in fetch
    return self.collate_fn(data)
  File "C:\Users\admin\.conda\envs\newrlhf\lib\site-packages\trl\trainer\utils.py", line 332, in __call__
    to_pad = [torch.LongTensor(ex[k]) for ex in features]
  File "C:\Users\admin\.conda\envs\newrlhf\lib\site-packages\trl\trainer\utils.py", line 332, in <listcomp>
    to_pad = [torch.LongTensor(ex[k]) for ex in features]
TypeError: 'NoneType' object cannot be interpreted as an integer

cove1011 commented 3 months ago

C:\Users\admin\.conda\envs\newrlhf\lib\site-packages\transformers\optimization.py:429: FutureWarning: This implementation of AdamW is deprecated and will be removed in a future version. Use the PyTorch implementation torch.optim.AdamW instead, or set no_deprecation_warning=True to disable this warning
  warnings.warn(
You are using an old version of the checkpointing format that is deprecated (We will also silently ignore gradient_checkpointing_kwargs in case you passed it).Please update to the new format on your modeling file. To use the new format, you need to completely remove the definition of the method _set_gradient_checkpointing in your model.
trainable params: 14823424 || all params: 6258407424 || trainable%: 0.23685616796302714
  0%|          | 0/100 [00:00<?, ?it/s]
Traceback (most recent call last):
  File "F:\xiazai\MedicalGPT-main\dpo_training.py", line 497, in <module>
    main()
  File "F:\xiazai\MedicalGPT-main\dpo_training.py", line 472, in main
    train_result = trainer.train()
  File "C:\Users\admin\.conda\envs\newrlhf\lib\site-packages\transformers\trainer.py", line 1624, in train
    return inner_training_loop(
  File "C:\Users\admin\.conda\envs\newrlhf\lib\site-packages\transformers\trainer.py", line 1928, in _inner_training_loop
    for step, inputs in enumerate(epoch_iterator):
  File "C:\Users\admin\.conda\envs\newrlhf\lib\site-packages\accelerate\data_loader.py", line 452, in __iter__
    current_batch = next(dataloader_iter)
  File "C:\Users\admin\.conda\envs\newrlhf\lib\site-packages\torch\utils\data\dataloader.py", line 631, in __next__
    data = self._next_data()
  File "C:\Users\admin\.conda\envs\newrlhf\lib\site-packages\torch\utils\data\dataloader.py", line 675, in _next_data
    data = self._dataset_fetcher.fetch(index)  # may raise StopIteration
  File "C:\Users\admin\.conda\envs\newrlhf\lib\site-packages\torch\utils\data\_utils\fetch.py", line 54, in fetch
    return self.collate_fn(data)
  File "C:\Users\admin\.conda\envs\newrlhf\lib\site-packages\trl\trainer\utils.py", line 332, in __call__
    to_pad = [torch.LongTensor(ex[k]) for ex in features]
  File "C:\Users\admin\.conda\envs\newrlhf\lib\site-packages\trl\trainer\utils.py", line 332, in <listcomp>
    to_pad = [torch.LongTensor(ex[k]) for ex in features]
TypeError: 'NoneType' object cannot be interpreted as an integer
  0%|          | 0/100 [00:00<?, ?it/s]
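
The failing frame is `to_pad = [torch.LongTensor(ex[k]) for ex in features]`, and this `TypeError` is consistent with one of the id lists in a batch containing `None` where an integer is expected. The thread does not confirm where the `None` comes from; one possibility (an assumption, not verified here) is a special-token id such as `bos_token_id` that the tokenizer leaves unset. A minimal diagnostic sketch, using a hypothetical helper `find_none_ids` and made-up sample features, that locates such entries before they reach the collator:

```python
from typing import Any, Dict, List, Tuple


def find_none_ids(features: List[Dict[str, Any]]) -> List[Tuple[int, str, int]]:
    """Return (example_index, key, position) for every None inside a list-valued field."""
    problems = []
    for i, example in enumerate(features):
        for key, value in example.items():
            if isinstance(value, (list, tuple)):
                for pos, item in enumerate(value):
                    if item is None:
                        problems.append((i, key, pos))
    return problems


# Made-up features shaped like tokenized preference examples (illustrative only;
# the key names are assumptions, not taken from the issue).
sample_features = [
    {"prompt_input_ids": [101, 2023], "chosen_input_ids": [101, 2023, 7]},
    {"prompt_input_ids": [None, 2023], "chosen_input_ids": [None, 2023, 9]},
]

for index, key, pos in find_none_ids(sample_features):
    print(f"example {index}: {key}[{pos}] is None")
```

Running the same check on a few rows of the real tokenized dataset would show whether a `None` id is present and in which field, which narrows down whether the tokenizer configuration or the data is responsible.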

shibing624 commented 3 months ago

Please test it yourself in Colab, and pay attention to the library versions.
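
A minimal sketch for comparing library versions between the local environment and Colab. The package list is taken from the modules that appear in the traceback; the helper name `collect_versions` is hypothetical. Run it in both environments and compare the output:

```python
from importlib.metadata import PackageNotFoundError, version

# Packages whose modules appear in the traceback above.
PACKAGES = ["torch", "transformers", "accelerate", "trl"]


def collect_versions(packages):
    """Map each package name to its installed version, or None if it is not installed."""
    found = {}
    for name in packages:
        try:
            found[name] = version(name)
        except PackageNotFoundError:
            found[name] = None
    return found


for name, ver in collect_versions(PACKAGES).items():
    print(f"{name}: {ver or 'not installed'}")
```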

cove1011 commented 3 months ago

That didn't help, bro.

badmic commented 3 months ago

@shibing624 @cove1011 I'm hitting the same problem. Has it been resolved?

huyidu commented 2 months ago

I have the same problem too. Has anyone solved it?