TextGen: Implementation of text generation models, including LLaMA, ChatGLM, BLOOM, GPT2, Seq2Seq, BART, T5, SongNet, UDA, and more, with ready-to-use training and prediction.
Apache License 2.0
ChatGLM-6B full parameter fine-tuning loss = 0 #44
Today, while doing full-parameter fine-tuning, my loss dropped to 0.0 after batch 17 and stayed at 0.0 for the rest of training. I changed both the learning rate and the batch size, but neither solved the problem.
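A loss that collapses to exactly 0.0 (or NaN) partway through full-parameter fine-tuning is often a numerical-stability symptom rather than a hyperparameter issue, so it can help to halt training as soon as it appears instead of burning more steps. Below is a minimal sketch of such a guard; the function name `loss_is_degenerate` and the threshold are my own illustration, not part of the textgen API.

```python
import math

def loss_is_degenerate(loss_value: float, eps: float = 1e-8) -> bool:
    """Return True if a training loss looks broken (NaN, inf, or ~0)."""
    # NaN/inf typically indicate fp16 overflow or a bad gradient step.
    if math.isnan(loss_value) or math.isinf(loss_value):
        return True
    # An exact or near-zero loss mid-training, as reported in this issue,
    # usually means the loss computation has silently failed.
    return abs(loss_value) < eps

# Values like the ones reported above would trip the check:
assert loss_is_degenerate(0.0)
assert loss_is_degenerate(float("nan"))
assert not loss_is_degenerate(2.35)
```

In a training loop one would call this on each batch's loss and stop (or dump diagnostics) when it returns True, which makes it much easier to see whether the collapse coincides with, say, a precision change or a particular batch of data.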