THUDM / GLM-4

GLM-4 series: Open Multilingual Multimodal Chat LMs | 开源多语言多模态对话模型
Apache License 2.0

loss is 0 #260

Closed. Richard12868 closed this issue 2 days ago.

Richard12868 commented 2 days ago

System Info / 系統信息

My setup is entirely default. Training on the provided data, the loss stays at 0.

{"messages": [{"role": "user", "content": "类型#裤材质#牛仔布风格#性感"}, {"role": "assistant", "content": "3x1的这款牛仔裤采用浅白的牛仔面料为裤身材质,其柔然的手感和细腻的质地,在穿着舒适的同时,透露着清纯甜美的个性气质。除此之外,流畅的裤身剪裁将性感的腿部曲线彰显的淋漓尽致,不失为一款随性出街的必备单品。"}]}
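For reference, a minimal sketch for sanity-checking data in this JSON Lines format before training (the file name `train.jsonl` and the script itself are assumptions for illustration, not part of the GLM-4 repo): each line should parse as one conversation whose `messages` list contains at least one non-empty `assistant` turn, since a conversation with nothing to supervise contributes nothing to the loss.

```python
# Minimal sketch: sanity-check a JSON Lines chat dataset of the form shown above.
# The file name "train.jsonl" is an assumption for illustration.
import json

def check_jsonl(path: str) -> None:
    with open(path, encoding="utf-8") as f:
        for i, line in enumerate(f, 1):
            record = json.loads(line)
            messages = record["messages"]
            roles = [m["role"] for m in messages]
            # Every conversation needs at least one assistant reply to learn from.
            assert "assistant" in roles, f"line {i}: no assistant turn"
            # Empty content would also leave nothing to supervise.
            assert all(m["content"].strip() for m in messages), f"line {i}: empty content"
    print("format looks OK")

if __name__ == "__main__":
    check_jsonl("train.jsonl")
```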

Who can help? / 谁可以帮助到您?

No response

Information / 问题信息

Reproduction / 复现过程

Running training
  Num examples = 1,010
  Num Epochs = 24
  Instantaneous batch size per device = 1
  Total train batch size (w. parallel, distributed & accumulation) = 8
  Gradient Accumulation steps = 1
  Total optimization steps = 3,000
  Number of trainable parameters = 2,785,280
{'loss': 0.0, 'grad_norm': 0.0, 'learning_rate': 0.0004996666666666667, 'epoch': 0.02}
{'loss': 0.0, 'grad_norm': 0.0, 'learning_rate': 0.0004993333333333334, 'epoch': 0.03}
{'loss': 0.0, 'grad_norm': 0.0, 'learning_rate': 0.000499, 'epoch': 0.05}
{'loss': 0.0, 'grad_norm': 0.0, 'learning_rate': 0.0004986666666666667, 'epoch': 0.06}
{'loss': 0.0, 'grad_norm': 0.0, 'learning_rate': 0.0004983333333333334, 'epoch': 0.08}
{'loss': 0.0, 'grad_norm': 0.0, 'learning_rate': 0.000498, 'epoch': 0.09}
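Not a confirmed diagnosis for this issue, but a loss and grad_norm of exactly 0 from the very first step is usually worth checking against the tokenized labels: if every label token in a batch is the ignore index (-100), for example because truncation cuts off the assistant reply, there is nothing for the model to learn from. The sketch below only assumes a PyTorch DataLoader yielding a `labels` tensor; the names are illustrative and not taken from the GLM-4 finetuning script.

```python
# Hedged diagnostic sketch: count how many label tokens are actually supervised
# (i.e. not masked with -100) in one batch. Zero supervised tokens would explain
# a degenerate loss. The variable names below are assumptions for illustration.
import torch

def count_supervised_tokens(batch: dict) -> int:
    labels = batch["labels"]  # shape: (batch_size, seq_len)
    return int((labels != -100).sum().item())

# Example usage with whatever DataLoader the training script builds:
# batch = next(iter(train_dataloader))
# print("supervised label tokens in this batch:", count_supervised_tokens(batch))
```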

Expected behavior / 期待表现

The loss should be a non-zero value.

hecong97 commented 2 days ago

Hello, what is the cause of this problem? I've run into it as well.