InternLM / xtuner

An efficient, flexible and full-featured toolkit for fine-tuning LLM (InternLM2, Llama3, Phi3, Qwen, Mistral, ...)
https://xtuner.readthedocs.io/zh-cn/latest/
Apache License 2.0

Official fine-tuning example raises KeyError: 'need_eos_token' #433

Closed whitebitbit closed 6 months ago

whitebitbit commented 6 months ago

Environment: datasets 2.17.1, transformers 4.37.1, xtuner 0.1.13

Model: internlm2-20b

Example link: https://github.com/InternLM/xtuner/tree/main/examples/demo_data/pretrain

Full error:

```
Generating train split: 2 examples [00:00, 17.21 examples/s]
num_proc must be <= 2. Reducing num_proc to 2 for dataset of size 2.
Map (num_proc=2): 100%|██████████| 2/2 [00:00<00:00, 10.66 examples/s]
num_proc must be <= 2. Reducing num_proc to 2 for dataset of size 2.
Filter (num_proc=2): 100%|██████████| 2/2 [00:00<00:00, 11.11 examples/s]
num_proc must be <= 2. Reducing num_proc to 2 for dataset of size 2.
Map (num_proc=2):   0%|          | 0/2 [00:00<?, ? examples/s]
multiprocess.pool.RemoteTraceback:
"""
Traceback (most recent call last):
  File "/public-supool/home/jzhou/miniconda3/envs/llm2/lib/python3.10/site-packages/multiprocess/pool.py", line 125, in worker
    result = (True, func(*args, **kwds))
  File "/public-supool/home/jzhou/miniconda3/envs/llm2/lib/python3.10/site-packages/datasets/utils/py_utils.py", line 625, in _write_generator_to_queue
    for i, result in enumerate(func(**kwargs)):
  File "/public-supool/home/jzhou/miniconda3/envs/llm2/lib/python3.10/site-packages/datasets/arrow_dataset.py", line 3458, in _map_single
    example = apply_function_on_filtered_inputs(example, i, offset=offset)
  File "/public-supool/home/jzhou/miniconda3/envs/llm2/lib/python3.10/site-packages/datasets/arrow_dataset.py", line 3361, in apply_function_on_filtered_inputs
    processed_inputs = function(*fn_args, *additional_args, **fn_kwargs)
  File "/public-supool/home/jzhou/miniconda3/envs/llm2/lib/python3.10/site-packages/xtuner/dataset/utils.py", line 97, in encode_fn
    if single_turn_conversation['need_eos_token']:
KeyError: 'need_eos_token'
"""
```

The above exception was the direct cause of the following exception:

```
Traceback (most recent call last):
  File "/public-supool/home/jzhou/miniconda3/envs/llm2/lib/python3.10/site-packages/xtuner/tools/train.py", line 299, in <module>
    main()
  File "/public-supool/home/jzhou/miniconda3/envs/llm2/lib/python3.10/site-packages/xtuner/tools/train.py", line 295, in main
    runner.train()
  File "/public-supool/home/jzhou/miniconda3/envs/llm2/lib/python3.10/site-packages/mmengine/runner/_flexible_runner.py", line 1160, in train
    self._train_loop = self.build_train_loop(
  File "/public-supool/home/jzhou/miniconda3/envs/llm2/lib/python3.10/site-packages/mmengine/runner/_flexible_runner.py", line 965, in build_train_loop
    loop = EpochBasedTrainLoop(
  File "/public-supool/home/jzhou/miniconda3/envs/llm2/lib/python3.10/site-packages/mmengine/runner/loops.py", line 44, in __init__
    super().__init__(runner, dataloader)
  File "/public-supool/home/jzhou/miniconda3/envs/llm2/lib/python3.10/site-packages/mmengine/runner/base_loop.py", line 26, in __init__
    self.dataloader = runner.build_dataloader(
  File "/public-supool/home/jzhou/miniconda3/envs/llm2/lib/python3.10/site-packages/mmengine/runner/_flexible_runner.py", line 824, in build_dataloader
    dataset = DATASETS.build(dataset_cfg)
  File "/public-supool/home/jzhou/miniconda3/envs/llm2/lib/python3.10/site-packages/mmengine/registry/registry.py", line 570, in build
    return self.build_func(cfg, *args, **kwargs, registry=self)
  File "/public-supool/home/jzhou/miniconda3/envs/llm2/lib/python3.10/site-packages/mmengine/registry/build_functions.py", line 121, in build_from_cfg
    obj = obj_cls(**args)  # type: ignore
  File "/public-supool/home/jzhou/miniconda3/envs/llm2/lib/python3.10/site-packages/xtuner/dataset/huggingface.py", line 156, in process_hf_dataset
    return process(*args, **kwargs)
  File "/public-supool/home/jzhou/miniconda3/envs/llm2/lib/python3.10/site-packages/xtuner/dataset/huggingface.py", line 123, in process
    dataset = dataset.map(
  File "/public-supool/home/jzhou/miniconda3/envs/llm2/lib/python3.10/site-packages/datasets/arrow_dataset.py", line 593, in wrapper
    out: Union["Dataset", "DatasetDict"] = func(self, *args, **kwargs)
  File "/public-supool/home/jzhou/miniconda3/envs/llm2/lib/python3.10/site-packages/datasets/arrow_dataset.py", line 558, in wrapper
    out: Union["Dataset", "DatasetDict"] = func(self, *args, **kwargs)
  File "/public-supool/home/jzhou/miniconda3/envs/llm2/lib/python3.10/site-packages/datasets/arrow_dataset.py", line 3197, in map
    for rank, done, content in iflatmap_unordered(
  File "/public-supool/home/jzhou/miniconda3/envs/llm2/lib/python3.10/site-packages/datasets/utils/py_utils.py", line 665, in iflatmap_unordered
    [async_result.get(timeout=0.05) for async_result in async_results]
  File "/public-supool/home/jzhou/miniconda3/envs/llm2/lib/python3.10/site-packages/datasets/utils/py_utils.py", line 665, in <listcomp>
    [async_result.get(timeout=0.05) for async_result in async_results]
  File "/public-supool/home/jzhou/miniconda3/envs/llm2/lib/python3.10/site-packages/multiprocess/pool.py", line 774, in get
    raise self._value
KeyError: 'need_eos_token'
```
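The failing line indexes the sample dict directly, so any sample from the pretrain demo data that lacks the `need_eos_token` key raises KeyError. A minimal sketch of the failure mode and a defensive lookup with a default (the function name and defaults below are illustrative, not xtuner's actual API):

```python
# Sketch of the pattern in xtuner/dataset/utils.py's encode_fn:
# sample['need_eos_token'] raises KeyError when the key is absent,
# while dict.get with a default avoids the crash.
# (encode_sample and its default are illustrative, not xtuner's API.)

def encode_sample(sample, default_need_eos=True):
    # dict.get falls back to the default instead of raising KeyError
    need_eos = sample.get('need_eos_token', default_need_eos)
    tokens = list(sample['input'])
    if need_eos:
        tokens.append('</s>')  # append the EOS marker when requested
    return tokens

# Pretrain-style sample without the key: falls back to the default
print(encode_sample({'input': ['hello']}))
# Sample that explicitly disables the EOS token
print(encode_sample({'input': ['hello'], 'need_eos_token': False}))
```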

whitebitbit commented 6 months ago

Does the script need to be changed for continued pretraining with internlm2-20b-chat?

LZHgrla commented 6 months ago

@whitebitbit Hi, this issue was fixed in https://github.com/InternLM/xtuner/pull/361. Please try installing the latest version of xtuner:

pip install git+https://github.com/InternLM/xtuner.git
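Until the upgrade is in place, the missing key could also be pre-populated on each sample before mapping. A sketch of the idea only, not the actual change made in PR #361:

```python
# Workaround sketch: give every sample the flag that older encode_fn
# code indexes directly. setdefault is a no-op when the key already
# exists, so explicit per-sample values are preserved.
# (Illustrative only; this is not the change from PR #361.)

def add_default_eos_flag(sample):
    sample.setdefault('need_eos_token', True)
    return sample

samples = [{'input': 'a'}, {'input': 'b', 'need_eos_token': False}]
patched = [add_default_eos_flag(dict(s)) for s in samples]
print(patched)
```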
whitebitbit commented 6 months ago

> @whitebitbit Hi, this issue was fixed in #361. Please try installing the latest version of xtuner:
>
> pip install git+https://github.com/InternLM/xtuner.git

thanks