shibing624 / textgen

TextGen: Implementation of text generation models, including LLaMA, ChatGLM, BLOOM, GPT2, Seq2Seq, BART, T5, UDA, and more. Training and prediction work out of the box.
Apache License 2.0

ChatGLM-6B full parameter fine-tuning loss = 0 #44

Closed: xiaojunjun65 closed this issue 1 year ago

xiaojunjun65 commented 1 year ago

Today, while doing full-parameter fine-tuning, my loss dropped to 0.0 after batch 17 and stayed at 0.0 for the rest of training. I changed both the learning rate and the batch size, but neither solved the problem.

[screenshots: training logs showing the loss stuck at 0.0]

shibing624 commented 1 year ago

Try setting float32 instead of fp16. I have run full-parameter fine-tuning successfully with LLaMA, but never with ChatGLM.
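For reference, a minimal sketch of what switching to float32 might look like using Hugging Face `TrainingArguments` (not necessarily textgen's own API; the output path and hyperparameters below are illustrative, not from this issue):

```python
# Sketch: disable fp16 mixed precision when fully fine-tuning ChatGLM-6B.
# A common cause of the loss collapsing to 0.0 is fp16 overflow/underflow,
# so the key setting here is fp16=False (train in float32).
from transformers import TrainingArguments

training_args = TrainingArguments(
    output_dir="outputs-chatglm-full",  # illustrative path
    per_device_train_batch_size=4,      # illustrative values
    learning_rate=2e-5,
    num_train_epochs=1,
    fp16=False,    # full float32 instead of half precision
    # bf16=True,   # alternative on GPUs with bfloat16 support
    logging_steps=10,
)
```

In fp16, the loss or gradients can overflow the half-precision range, which some training loops then report as a loss of 0.0; full float32 (or bf16 on supported hardware) sidesteps that dynamic-range issue.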