shibing624 / MedicalGPT

MedicalGPT: Training Your Own Medical GPT Model with ChatGPT Training Pipeline. Trains medical large language models, implementing incremental pretraining (PT), supervised fine-tuning (SFT), RLHF, DPO, and ORPO.
Apache License 2.0
2.94k stars 452 forks

SFT fine-tuning error #327

Closed ZhuangXialie closed 5 months ago

ZhuangXialie commented 5 months ago

The above exception was the direct cause of the following exception:

```
Traceback (most recent call last):
  File "/home/share/huada/home/biwenshuai/zxl_2024/RELLM/MedicalGPT/supervised_finetuning.py", line 1462, in <module>
    main()
  File "/home/share/huada/home/biwenshuai/zxl_2024/RELLM/MedicalGPT/supervised_finetuning.py", line 1158, in main
    train_dataset = train_dataset.shuffle().map(
  File "/home/share/huada/home/biwenshuai/miniconda3/envs/bw/lib/python3.10/site-packages/datasets/arrow_dataset.py", line 592, in wrapper
    out: Union["Dataset", "DatasetDict"] = func(self, *args, **kwargs)
  File "/home/share/huada/home/biwenshuai/miniconda3/envs/bw/lib/python3.10/site-packages/datasets/arrow_dataset.py", line 557, in wrapper
    out: Union["Dataset", "DatasetDict"] = func(self, *args, **kwargs)
  File "/home/share/huada/home/biwenshuai/miniconda3/envs/bw/lib/python3.10/site-packages/datasets/arrow_dataset.py", line 3189, in map
    for rank, done, content in iflatmap_unordered(
  File "/home/share/huada/home/biwenshuai/miniconda3/envs/bw/lib/python3.10/site-packages/datasets/utils/py_utils.py", line 1394, in iflatmap_unordered
    [async_result.get(timeout=0.05) for async_result in async_results]
  File "/home/share/huada/home/biwenshuai/miniconda3/envs/bw/lib/python3.10/site-packages/datasets/utils/py_utils.py", line 1394, in <listcomp>
    [async_result.get(timeout=0.05) for async_result in async_results]
  File "/home/share/huada/home/biwenshuai/miniconda3/envs/bw/lib/python3.10/site-packages/multiprocess/pool.py", line 774, in get
    raise self._value
KeyError: 'conversations'
```
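The `KeyError: 'conversations'` raised inside `datasets.map()` suggests that some rows of the SFT dataset are missing the `conversations` field the preprocessing function reads. A minimal sketch for locating the offending rows in a JSONL file before launching training (the helper name is hypothetical, not part of the repo):

```python
import json

def check_sft_dataset(path, required_key="conversations"):
    """Return the 0-based line numbers of JSONL rows that are
    missing `required_key` (the field the map() function reads)."""
    bad = []
    with open(path, encoding="utf-8") as f:
        for i, line in enumerate(f):
            line = line.strip()
            if not line:
                continue  # skip blank lines
            row = json.loads(line)
            if required_key not in row:
                bad.append(i)
    return bad
```

If this returns any indices, those rows need to be fixed or removed before `supervised_finetuning.py` will get past the `map()` call.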

shibing624 commented 5 months ago

See the wiki: https://github.com/shibing624/MedicalGPT/wiki
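The wiki documents the expected data format. Judging from the `KeyError`, the SFT script expects ShareGPT-style records, i.e. each row carrying a `conversations` list of `{"from": ..., "value": ...}` turns. A sketch for converting Alpaca-style records into that shape, assuming that interpretation (the converter and field merging policy are illustrative, not from the repo):

```python
import json

def alpaca_to_conversations(rows):
    """Convert Alpaca-style records ({"instruction", "input", "output"})
    into ShareGPT-style {"conversations": [...]} records."""
    out = []
    for r in rows:
        prompt = r["instruction"]
        if r.get("input"):
            # Append the optional input field to the prompt
            prompt += "\n" + r["input"]
        out.append({
            "conversations": [
                {"from": "human", "value": prompt},
                {"from": "gpt", "value": r["output"]},
            ]
        })
    return out

# Write the converted rows back out as JSONL:
# with open("sft_data.jsonl", "w", encoding="utf-8") as f:
#     for row in alpaca_to_conversations(rows):
#         f.write(json.dumps(row, ensure_ascii=False) + "\n")
```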