baichuan-inc / Baichuan2

A series of large language models developed by Baichuan Intelligent Technology
https://huggingface.co/baichuan-inc
Apache License 2.0

baichuan2-13b-chat fine-tuning error: Expecting property name enclosed in double quotes #353

Open jiaweiLL opened 8 months ago

jiaweiLL commented 8 months ago

Traceback (most recent call last):
  File "/data/ecid/llm/serve/Baichuan2/fine-tune/fine-tune.py", line 153, in <module>
    train()
  File "/data/ecid/llm/serve/Baichuan2/fine-tune/fine-tune.py", line 112, in train
    model_args, data_args, training_args = parser.parse_args_into_dataclasses()
  File "/data/ecid/.conda/envs/baichuan2-13b-chat/lib/python3.10/site-packages/transformers/hf_argparser.py", line 338, in parse_args_into_dataclasses
    obj = dtype(**inputs)
  File "<string>", line 124, in __init__
  File "/data/ecid/.conda/envs/baichuan2-13b-chat/lib/python3.10/site-packages/transformers/training_args.py", line 1730, in __post_init__
    self.hf_deepspeed_config = HfTrainerDeepSpeedConfig(self.deepspeed)
  File "/data/ecid/.conda/envs/baichuan2-13b-chat/lib/python3.10/site-packages/transformers/integrations/deepspeed.py", line 88, in __init__
    super().__init__(config_file_or_dict)
  File "/data/ecid/.conda/envs/baichuan2-13b-chat/lib/python3.10/site-packages/transformers/integrations/deepspeed.py", line 78, in __init__
    super().__init__(config_file_or_dict)
  File "/data/ecid/.conda/envs/baichuan2-13b-chat/lib/python3.10/site-packages/accelerate/utils/deepspeed.py", line 49, in __init__
    config = json.load(f)
  File "/data/ecid/.conda/envs/baichuan2-13b-chat/lib/python3.10/json/__init__.py", line 293, in load
    return loads(fp.read(),
  File "/data/ecid/.conda/envs/baichuan2-13b-chat/lib/python3.10/json/__init__.py", line 346, in loads
    return _default_decoder.decode(s)
  File "/data/ecid/.conda/envs/baichuan2-13b-chat/lib/python3.10/json/decoder.py", line 337, in decode
    obj, end = self.raw_decode(s, idx=_w(s, 0).end())
  File "/data/ecid/.conda/envs/baichuan2-13b-chat/lib/python3.10/json/decoder.py", line 353, in raw_decode
    obj, end = self.scan_once(s, idx)
json.decoder.JSONDecodeError: Expecting property name enclosed in double quotes: line 16 column 3 (char 385)
[2024-01-15 13:26:38,976] [INFO] [launch.py:315:sigkill_handler] Killing subprocess 3143453
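A note on reading the traceback: the failing `json.load(f)` call sits in `accelerate/utils/deepspeed.py` and is reached from `HfTrainerDeepSpeedConfig(self.deepspeed)`, so the file being parsed appears to be the one passed to `--deepspeed`, not the training data. Python's `json` module raises this message when it expects a double-quoted key and finds something else, which on Python 3.10 commonly means a trailing comma, a comment, or a single-quoted key. The sketch below is one way to locate the offending line in any JSON file. The helper names `find_json_error` and `check_json_file` are made up for illustration and are not part of the Baichuan2 repository.

```python
import json
import sys
from typing import Optional, Tuple


def find_json_error(text: str) -> Optional[Tuple[int, int, str]]:
    """Return (line, column, message) for the first JSON syntax error, or None."""
    try:
        json.loads(text)
    except json.JSONDecodeError as err:
        # lineno and colno are 1-based, matching the numbers in the traceback.
        return err.lineno, err.colno, err.msg
    return None


def check_json_file(path: str) -> None:
    """Print whether the file parses and, if not, show the offending line."""
    with open(path, encoding="utf-8") as f:
        text = f.read()
    problem = find_json_error(text)
    if problem is None:
        print(f"{path}: valid JSON")
        return
    lineno, colno, msg = problem
    print(f"{path}: {msg} at line {lineno}, column {colno}")
    lines = text.splitlines()
    if 0 < lineno <= len(lines):
        print(lines[lineno - 1])
        print(" " * (colno - 1) + "^")  # caret under the reported column


if __name__ == "__main__":
    # Example: python check_json.py ds_config.json
    for arg in sys.argv[1:]:
        check_json_file(arg)
```

Running it against the file given to `--deepspeed` should print the line reported as "line 16 column 3" in the error, which is the place to look for a stray comma or comment.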

Here is the data:

[
  {
    "id": "77771",
    "conversations": [
      {
        "from": "human",
        "value": "写一篇关于气候变化对海洋生态的影响的文章。\n"
      },
      {
        "from": "gpt",
        "value": "好的,以下是你要求的文章:\n气候变化对海洋生态的影响\n气候变化对海洋生态产生了广泛的影响。全球变暖导致海洋温度上升,这可能会对许多水生生物的生存、繁殖和迁移造成巨大的影响。另外,海洋酸化也是一个问题,因为大气中二氧化碳的增加导致海水中的pH值下降,使得许多生物的外骨骼变得脆弱,从而影响它们的生存和繁殖。\n此外,气候变化还会引起海平面上升,这不仅对海岸线和沿海居民产生影响,还会对许多海洋生物的栖息地造成影响。一些岛屿上的陆地生物面临几乎完全消失的风险,而另一些生物范围则会扩大到更广泛的领域,这可能会导致新的竞争和生态平衡的改变。\n总之,气候变化并不只是一个地球气温的问题,它对海洋生态系统产生了广泛和深远的影响。\n"
      },
      {
        "from": "human",
        "value": "从刚刚的文章中提取出气候变化对海洋生物繁殖的影响。\n"
      },
      {
        "from": "gpt",
        "value": "文章已经提到过气候变化会对海洋生物的生存、繁殖和迁移造成影响。例如,海洋酸化会使得许多生物的外骨骼变得脆弱,进而影响它们的生存和繁殖。此外,海洋温度上升也可能会对很多生物的生存和繁殖产生负面影响。"
      }
    ]
  },
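The excerpt above shows the shape of each record: an `id` plus a `conversations` list whose turns carry `from` and `value`. A minimal sketch for checking a loaded data file against that shape follows. It assumes that `human` and `gpt` are the only valid `from` values because those are the only ones visible in the excerpt, and `validate_records` is a hypothetical helper, not something the fine-tune script provides.

```python
from typing import Any, List

# Assumption: only the two roles seen in the excerpt are allowed.
ALLOWED_ROLES = ("human", "gpt")


def validate_records(records: Any) -> List[str]:
    """Return human-readable problems found in the parsed data (empty if none)."""
    if not isinstance(records, list):
        return ["top level is not a list"]
    problems: List[str] = []
    for i, rec in enumerate(records):
        if not isinstance(rec, dict):
            problems.append(f"record {i}: not an object")
            continue
        if "id" not in rec:
            problems.append(f"record {i}: missing 'id'")
        turns = rec.get("conversations")
        if not isinstance(turns, list) or not turns:
            problems.append(f"record {i}: 'conversations' missing or empty")
            continue
        for j, turn in enumerate(turns):
            if not isinstance(turn, dict):
                problems.append(f"record {i}, turn {j}: not an object")
            elif turn.get("from") not in ALLOWED_ROLES:
                problems.append(f"record {i}, turn {j}: unexpected 'from'")
            elif not isinstance(turn.get("value"), str):
                problems.append(f"record {i}, turn {j}: 'value' is not a string")
    return problems
```

Typical use would be `validate_records(json.load(open(path, encoding="utf-8")))`. If the data file fails to parse at all, the problem is a syntax error rather than a structural one, and this check never runs.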

Shell script:

hostfile=""
deepspeed --include localhost:0 --hostfile="" fine-tune.py \
    --report_to "none" \
    --data_path "data/belle_chat_ramdon_10k.json" \
    --model_name_or_path "/data/ecid/llm/model/Baichuan2-13B-Chat" \
    --output_dir "/data/ecid/llm/model/Baichuan2-13B-Chat-fine-tune" \
    --model_max_length 512 \
    --num_train_epochs 4 \
    --per_device_train_batch_size 16 \
    --gradient_accumulation_steps 1 \
    --save_strategy epoch \
    --learning_rate 2e-5 \
    --lr_scheduler_type constant \
    --adam_beta1 0.9 \
    --adam_beta2 0.98 \
    --adam_epsilon 1e-8 \
    --max_grad_norm 1.0 \
    --weight_decay 1e-4 \
    --warmup_ratio 0.0 \
    --logging_steps 1 \
    --gradient_checkpointing True \
    --deepspeed ds_config.json \
    --bf16 True \
    --tf32 True

jiaweiLL commented 8 months ago

How can I fix this? I haven't modified the data; I'm using this belle_chat_ramdon_10k.json file as is.