Facico / Chinese-Vicuna

Chinese-Vicuna: A Chinese Instruction-following LLaMA-based Model. A low-resource Chinese LLaMA + LoRA solution, with a structure modeled on alpaca.
https://github.com/Facico/Chinese-Vicuna
Apache License 2.0
4.14k stars, 422 forks

finetune_chat.py fails at runtime with IndexError: string index out of range #171

Closed by reverse-2020 1 year ago

reverse-2020 commented 1 year ago

The command I ran:

python finetune_chat.py --data_path merge_sample.json --test_size 1

The error it produced:

Traceback (most recent call last):
  File "finetune_chat.py", line 180, in <module>
    train_data = train_val["train"].shuffle().map(PROMPT.preprocess_train, num_proc=num_proc)
  File "/home/vaeput/miniconda3/envs/llama/lib/python3.8/site-packages/datasets/arrow_dataset.py", line 563, in wrapper
    out: Union["Dataset", "DatasetDict"] = func(self, *args, **kwargs)
  File "/home/vaeput/miniconda3/envs/llama/lib/python3.8/site-packages/datasets/arrow_dataset.py", line 528, in wrapper
    out: Union["Dataset", "DatasetDict"] = func(self, *args, **kwargs)
  File "/home/vaeput/miniconda3/envs/llama/lib/python3.8/site-packages/datasets/arrow_dataset.py", line 3046, in map
    for rank, done, content in iflatmap_unordered(
  File "/home/vaeput/miniconda3/envs/llama/lib/python3.8/site-packages/datasets/utils/py_utils.py", line 1373, in iflatmap_unordered
    [async_result.get() for async_result in async_results]
  File "/home/vaeput/miniconda3/envs/llama/lib/python3.8/site-packages/datasets/utils/py_utils.py", line 1373, in <listcomp>
    [async_result.get() for async_result in async_results]
  File "/home/vaeput/miniconda3/envs/llama/lib/python3.8/site-packages/multiprocess/pool.py", line 771, in get
    raise self._value
IndexError: string index out of range

reverse-2020 commented 1 year ago

Solved. Fine-tuning with finetune_chat.py requires chat-format data.
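For anyone hitting the same error, here is a rough sketch of a pre-check to run on a data file before fine-tuning. It rests on an assumption not confirmed in this thread: that chat-format records store "input" and "output" as equal-length lists of turns, while instruction-format records store them as plain strings (often with an empty "input"). Under that assumption, indexing the last turn of an empty string would raise exactly this IndexError. The helper names looks_like_chat_record and find_non_chat_records are hypothetical and are not part of the repository.

```python
def looks_like_chat_record(record):
    """Return True if the record matches the ASSUMED chat layout:
    'input' and 'output' are non-empty, equal-length lists of strings."""
    inp = record.get("input")
    out = record.get("output")
    return (
        isinstance(inp, list)
        and isinstance(out, list)
        and len(inp) > 0
        and len(inp) == len(out)
        and all(isinstance(turn, str) for turn in inp + out)
    )


def find_non_chat_records(records):
    """Return the indices of records that do not match the assumed chat layout."""
    return [i for i, record in enumerate(records) if not looks_like_chat_record(record)]


# Two illustrative records (made up for this sketch, not taken from merge_sample.json).
instruction_style = {"instruction": "Translate to English", "input": "", "output": "Hello"}
chat_style = {"input": ["Hi", "How are you?"], "output": ["Hello!", "I'm fine."]}

# Indexing the last "turn" of an empty string reproduces the error message.
try:
    instruction_style["input"][-1]
except IndexError as exc:
    error_message = str(exc)

bad_indices = find_non_chat_records([instruction_style, chat_style])
print(error_message)
print(bad_indices)
```

To check a real file, load it into a list of dicts first (for example with json.loads per line or json.load, depending on how the file is laid out) and pass that list to find_non_chat_records. Any index it returns points to a record that probably belongs to the instruction-style dataset rather than the chat one.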