Facico / Chinese-Vicuna

Chinese-Vicuna: A Chinese Instruction-following LLaMA-based Model (a low-resource Chinese llama+lora solution, structured after alpaca)
https://github.com/Facico/Chinese-Vicuna
Apache License 2.0
4.14k stars 425 forks

Instruction fine-tuning llama-33b with finetune.sh fails with ZeroDivisionError: integer division or modulo by zero #252

Closed BIUBIUBIU-JIAZHOU closed 1 year ago

BIUBIUBIU-JIAZHOU commented 1 year ago

I am training in the following environment:

accelerate==0.15.0
appdirs==1.4.4
bitsandbytes==0.37.0
datasets==2.8.0
deepspeed==0.8.3
evaluate==0.4.0
fairscale==0.4.13
torch==1.13.1
torchvision==0.14.1
gradio==3.20.0
huggingface-hub==0.13.3
loralib==0.1.1
nvitop==1.0.0
peft
sentencepiece==0.1.96
tensorboard==2.12.0
texttable==1.6.7
tokenizers==0.13.2
tqdm==4.65.0
transformers
trlx
wandb==0.13.10
triton==2.0.0
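In the list above, peft, transformers, and trlx carry no version pin, so the versions actually installed are not known from this report. A minimal sketch for printing them, using only the standard library; the helper name `installed_versions` is hypothetical and not part of the repository:

```python
from importlib.metadata import PackageNotFoundError, version


def installed_versions(packages):
    """Return {package: version string, or None if not installed}."""
    found = {}
    for name in packages:
        try:
            found[name] = version(name)
        except PackageNotFoundError:
            # The distribution is not installed in this environment.
            found[name] = None
    return found


if __name__ == "__main__":
    # The three packages listed without a pinned version in the report.
    for name, ver in installed_versions(["peft", "transformers", "trlx"]).items():
        print(f"{name}=={ver}" if ver else f"{name}: not installed")
```

Pasting that output into the issue would make the environment reproducible.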

The following call in finetune.py:

trainer.train(resume_from_checkpoint=args.resume_from_checkpoint)

raises ZeroDivisionError: integer division or modulo by zero

BIUBIUBIU-JIAZHOU commented 1 year ago

I am training on 8 A100 GPUs with 40 GB each.

BIUBIUBIU-JIAZHOU commented 1 year ago

The only thing I changed in finetune.py is the batch size.
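A batch-size change is one plausible way to reach a division by zero, although this thread does not confirm the cause. The sketch below assumes (this code is not shown in the issue) that the script derives gradient accumulation steps by integer-dividing a global batch size by a micro batch size and then by the number of processes. Under that assumption, a small global batch size on 8 GPUs rounds down to 0 steps. The helper name `grad_accum_steps` and all numeric values are hypothetical.

```python
def grad_accum_steps(batch_size: int, micro_batch_size: int, world_size: int) -> int:
    """Per-process gradient accumulation steps, with a guard against 0.

    Assumed formula (not confirmed by the issue):
        batch_size // micro_batch_size // world_size
    """
    if micro_batch_size < 1 or world_size < 1:
        raise ValueError("micro_batch_size and world_size must be at least 1")
    steps = batch_size // micro_batch_size // world_size
    if steps < 1:
        # Integer division rounded down to 0; fail early with a clear message
        # instead of letting a later division by this value blow up.
        raise ValueError(
            f"batch_size={batch_size} is too small for "
            f"micro_batch_size={micro_batch_size} on {world_size} processes; "
            f"need batch_size >= {micro_batch_size * world_size}"
        )
    return steps


if __name__ == "__main__":
    # Hypothetical values: 128 // 4 // 8 = 4 accumulation steps per process.
    print(grad_accum_steps(batch_size=128, micro_batch_size=4, world_size=8))
```

With 8 processes and a micro batch size of 4, any global batch size below 32 would be rejected by this check, so comparing the modified values against that threshold is a quick first test.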
