Closed: whitebitbit closed this issue 6 months ago
Does the incremental pretraining script for internlm2-20b-chat need to be changed?
@whitebitbit Hi, this issue was fixed in https://github.com/InternLM/xtuner/pull/361. Please try installing the latest version of xtuner:
pip install git+https://github.com/InternLM/xtuner.git
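After reinstalling, it can help to confirm which xtuner build the active environment actually picks up. The snippet below is a minimal sketch using only the standard library; the helper names (`numeric_prefix`, `is_newer`, `installed_xtuner_version`) are made up for illustration, and 0.1.13 is simply the version reported in this issue. Note that a version string alone cannot prove that the fix from the PR is included, since a build from the main branch may carry the same number as a release.

```python
import re
from importlib.metadata import PackageNotFoundError, version

# Version listed in the environment section of this issue (where the error occurred).
REPORTED_VERSION = "0.1.13"


def numeric_prefix(v: str) -> tuple:
    """Return the leading dotted integers of a version string, e.g. '0.1.14.dev0' -> (0, 1, 14)."""
    m = re.match(r"\d+(?:\.\d+)*", v)
    return tuple(int(p) for p in m.group(0).split(".")) if m else ()


def is_newer(installed: str, reported: str = REPORTED_VERSION) -> bool:
    """True if the installed version's numeric part is strictly greater than the reported one."""
    return numeric_prefix(installed) > numeric_prefix(reported)


def installed_xtuner_version():
    """Return the installed xtuner version string, or None if xtuner is not installed."""
    try:
        return version("xtuner")
    except PackageNotFoundError:
        return None


if __name__ == "__main__":
    v = installed_xtuner_version()
    if v is None:
        print("xtuner is not installed in this environment")
    else:
        print(f"xtuner {v} (newer than {REPORTED_VERSION}: {is_newer(v)})")
```

If the script still reports 0.1.13 after the `pip install` above, the most likely cause is that pip installed into a different environment than the one used for training.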
thanks
Environment: datasets 2.17.1, transformers 4.37.1, xtuner 0.1.13
Model: internlm2-20b
Example link: https://github.com/InternLM/xtuner/tree/main/examples/demo_data/pretrain
Full error:
Generating train split: 2 examples [00:00, 17.21 examples/s]
num_proc must be <= 2. Reducing num_proc to 2 for dataset of size 2.
Map (num_proc=2): 100%|█████████████████████████████████████████████████████████████████████████████| 2/2 [00:00<00:00, 10.66 examples/s]
num_proc must be <= 2. Reducing num_proc to 2 for dataset of size 2.
Filter (num_proc=2): 100%|██████████████████████████████████████████████████████████████████████████| 2/2 [00:00<00:00, 11.11 examples/s]
num_proc must be <= 2. Reducing num_proc to 2 for dataset of size 2.
Map (num_proc=2): 0%| | 0/2 [00:00<?, ? examples/s]
multiprocess.pool.RemoteTraceback:
"""
Traceback (most recent call last):
File "/public-supool/home/jzhou/miniconda3/envs/llm2/lib/python3.10/site-packages/multiprocess/pool.py", line 125, in worker
result = (True, func(*args, **kwds))
File "/public-supool/home/jzhou/miniconda3/envs/llm2/lib/python3.10/site-packages/datasets/utils/py_utils.py", line 625, in _write_generator_to_queue
for i, result in enumerate(func(**kwargs)):
File "/public-supool/home/jzhou/miniconda3/envs/llm2/lib/python3.10/site-packages/datasets/arrow_dataset.py", line 3458, in _map_single
example = apply_function_on_filtered_inputs(example, i, offset=offset)
File "/public-supool/home/jzhou/miniconda3/envs/llm2/lib/python3.10/site-packages/datasets/arrow_dataset.py", line 3361, in apply_function_on_filtered_inputs
processed_inputs = function(*fn_args, *additional_args, **fn_kwargs)
File "/public-supool/home/jzhou/miniconda3/envs/llm2/lib/python3.10/site-packages/xtuner/dataset/utils.py", line 97, in encode_fn
if single_turn_conversation['need_eos_token']:
KeyError: 'need_eos_token'
"""
The above exception was the direct cause of the following exception:
Traceback (most recent call last):
File "/public-supool/home/jzhou/miniconda3/envs/llm2/lib/python3.10/site-packages/xtuner/tools/train.py", line 299, in <module>
main()
File "/public-supool/home/jzhou/miniconda3/envs/llm2/lib/python3.10/site-packages/xtuner/tools/train.py", line 295, in main
runner.train()
File "/public-supool/home/jzhou/miniconda3/envs/llm2/lib/python3.10/site-packages/mmengine/runner/_flexible_runner.py", line 1160, in train
self._train_loop = self.build_train_loop(
File "/public-supool/home/jzhou/miniconda3/envs/llm2/lib/python3.10/site-packages/mmengine/runner/_flexible_runner.py", line 965, in build_train_loop
loop = EpochBasedTrainLoop(
File "/public-supool/home/jzhou/miniconda3/envs/llm2/lib/python3.10/site-packages/mmengine/runner/loops.py", line 44, in __init__
super().__init__(runner, dataloader)
File "/public-supool/home/jzhou/miniconda3/envs/llm2/lib/python3.10/site-packages/mmengine/runner/base_loop.py", line 26, in __init__
self.dataloader = runner.build_dataloader(
File "/public-supool/home/jzhou/miniconda3/envs/llm2/lib/python3.10/site-packages/mmengine/runner/_flexible_runner.py", line 824, in build_dataloader
dataset = DATASETS.build(dataset_cfg)
File "/public-supool/home/jzhou/miniconda3/envs/llm2/lib/python3.10/site-packages/mmengine/registry/registry.py", line 570, in build
return self.build_func(cfg, *args, **kwargs, registry=self)
File "/public-supool/home/jzhou/miniconda3/envs/llm2/lib/python3.10/site-packages/mmengine/registry/build_functions.py", line 121, in build_from_cfg
obj = obj_cls(**args)  # type: ignore
File "/public-supool/home/jzhou/miniconda3/envs/llm2/lib/python3.10/site-packages/xtuner/dataset/huggingface.py", line 156, in process_hf_dataset
return process(*args, **kwargs)
File "/public-supool/home/jzhou/miniconda3/envs/llm2/lib/python3.10/site-packages/xtuner/dataset/huggingface.py", line 123, in process
dataset = dataset.map(
File "/public-supool/home/jzhou/miniconda3/envs/llm2/lib/python3.10/site-packages/datasets/arrow_dataset.py", line 593, in wrapper
out: Union["Dataset", "DatasetDict"] = func(self, *args, **kwargs)
File "/public-supool/home/jzhou/miniconda3/envs/llm2/lib/python3.10/site-packages/datasets/arrow_dataset.py", line 558, in wrapper
out: Union["Dataset", "DatasetDict"] = func(self, *args, **kwargs)
File "/public-supool/home/jzhou/miniconda3/envs/llm2/lib/python3.10/site-packages/datasets/arrow_dataset.py", line 3197, in map
for rank, done, content in iflatmap_unordered(
File "/public-supool/home/jzhou/miniconda3/envs/llm2/lib/python3.10/site-packages/datasets/utils/py_utils.py", line 665, in iflatmap_unordered
[async_result.get(timeout=0.05) for async_result in async_results]
File "/public-supool/home/jzhou/miniconda3/envs/llm2/lib/python3.10/site-packages/datasets/utils/py_utils.py", line 665, in <listcomp>
[async_result.get(timeout=0.05) for async_result in async_results]
File "/public-supool/home/jzhou/miniconda3/envs/llm2/lib/python3.10/site-packages/multiprocess/pool.py", line 774, in get
raise self._value
KeyError: 'need_eos_token'
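For readers unfamiliar with this failure mode: the traceback shows `encode_fn` indexing `single_turn_conversation['need_eos_token']` directly, which raises `KeyError` whenever a record does not carry that key. The sketch below reproduces the pattern with a plain dict. It is not xtuner's code and not necessarily how the PR fixed it; the record layout, the `needs_eos` helper, and the default of `True` are all assumptions made for illustration.

```python
# Hypothetical single-turn record without a 'need_eos_token' field
# (the 'input'/'output' layout is assumed for this sketch).
pretrain_turn = {"input": "", "output": "some raw pretraining text"}

# Direct indexing fails in the same way as the traceback above.
try:
    pretrain_turn["need_eos_token"]
except KeyError as err:
    print(f"KeyError: {err}")


def needs_eos(turn: dict, default: bool = True) -> bool:
    """Read the flag defensively; the default value here is an assumption."""
    return turn.get("need_eos_token", default)


print(needs_eos(pretrain_turn))              # falls back to the default
print(needs_eos({"need_eos_token": False}))  # an explicit value wins
```

Upgrading xtuner as suggested earlier in the thread remains the recommended route; the sketch only explains why the missing key surfaces as a `KeyError` inside `dataset.map`.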