TigerResearch / TigerBot

TigerBot: A multi-language multi-task LLM
https://www.tigerbot.com
Apache License 2.0
2.24k stars 194 forks

Help!!! Fine-tuning tigerbot-7b-base fails with binascii.Error: Incorrect padding #59

Closed Smilefish1 closed 1 year ago

Smilefish1 commented 1 year ago

My fine-tuning data format is shown here: (image). I keep getting the error binascii.Error: Incorrect padding. Could someone please take a look? Thanks a lot.

Smilefish1 commented 1 year ago

Model training parameters:

deepspeed --include="localhost:0,1" train_sft.py \
    --deepspeed /ds_config/ds_config_zero3.json \
    --model_name_or_path /home/sunyard/source/model/TigerBot/tigerbot-7b-base \
    --dataset_name /home/sunyard/source/data/data4/dev_sft/CodeGPT_java_zh_1391.json \
    --do_train \
    --output_dir /ckpt-sft \
    --overwrite_output_dir \
    --preprocess_num_workers 8 \
    --num_train_epochs 5 \
    --learning_rate 1e-5 \
    --evaluation_strategy steps \
    --eval_steps 10 \
    --bf16 False \
    --save_strategy steps \
    --save_steps 10 \
    --save_total_limit 2 \
    --logging_steps 10 \
    --tf32 False \
    --per_device_train_batch_size 2 \
    --per_device_eval_batch_size 2

chentigerye commented 1 year ago

This looks like a data error. The SFT data only recognizes two fields: instruction and output. Try merging input onto the end of instruction: instruction+"\n\n"+input
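
For anyone hitting the same thing, here is a rough sketch of that merge in Python. It is only an illustration, not code from the TigerBot repo: the helper name merge_input_into_instruction and the sample records are made up, and it assumes each record is a dict with instruction, input, and output keys. How the records are read from and written back to the dataset file depends on the file layout, which is not shown in this thread.

```python
def merge_input_into_instruction(record):
    """Return a new record with only 'instruction' and 'output'.

    A non-empty 'input' is appended to 'instruction', separated by a
    blank line, as suggested above. (Hypothetical helper for illustration.)
    """
    instruction = record.get("instruction", "")
    extra = record.get("input", "")
    if extra:
        instruction = instruction + "\n\n" + extra
    return {"instruction": instruction, "output": record.get("output", "")}


# Made-up sample records, only to show the shape of the conversion.
records = [
    {"instruction": "Explain this Java code.", "input": "int x = 1;", "output": "Declares x."},
    {"instruction": "Say hello.", "input": "", "output": "Hello."},
]

converted = [merge_input_into_instruction(r) for r in records]
```

After conversion every record carries exactly the two fields instruction and output, and records with an empty input are left with their instruction unchanged.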

i4never commented 1 year ago

Could you provide the full stack trace of the error?

Smilefish1 commented 1 year ago

"This looks like a data error. The SFT data only recognizes two fields: instruction and output. Try merging input onto the end of instruction: instruction+"\n\n"+input"

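Thanks a lot, it looks like I made a silly mistake. However, my server does not have enough memory, so I still could not get it deployed. Can this service be deployed across multiple GPUs?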