TextGen: Implementation of text generation models, including LLaMA, ChatGLM, BLOOM, GPT2, Seq2Seq, BART, T5, SongNet, UDA, and more, with ready-to-use training and prediction.
Apache License 2.0
ChatGLM-6B full parameter fine-tuning loss = 0 #44
Today, while doing full-parameter fine-tuning, my loss dropped to 0.0 after batch 17 and stayed at 0.0 for the rest of training. I changed both the learning rate and the batch size, but neither solved the problem.
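A loss that collapses to exactly 0.0 (or NaN) partway through full-parameter fine-tuning is often a numerical-stability symptom rather than a hyperparameter issue, so it can help to halt training as soon as it appears instead of burning more steps. Below is a minimal sketch of such a guard; the function name `loss_is_degenerate` and the threshold are my own illustration, not part of the textgen API.

```python
import math

def loss_is_degenerate(loss_value: float, eps: float = 1e-8) -> bool:
    """Return True if a training loss looks broken (NaN, inf, or ~0)."""
    # NaN/inf typically indicate fp16 overflow or a bad gradient step.
    if math.isnan(loss_value) or math.isinf(loss_value):
        return True
    # An exact or near-zero loss mid-training, as reported in this issue,
    # usually means the loss computation has silently failed.
    return abs(loss_value) < eps

# Values like the ones reported above would trip the check:
assert loss_is_degenerate(0.0)
assert loss_is_degenerate(float("nan"))
assert not loss_is_degenerate(2.35)
```

In a training loop one would call this on each batch's loss and stop (or dump diagnostics) when it returns True, which makes it much easier to see whether the collapse coincides with, say, a precision change or a particular batch of data.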