X-PLUG / mPLUG-Owl

mPLUG-Owl: The Powerful Multi-modal Large Language Model Family
https://www.modelscope.cn/studios/damo/mPLUG-Owl
MIT License

Tried several different datasets, loss is always nan #92

Closed. hangzeli08 closed this issue 1 year ago

hangzeli08 commented 1 year ago

Has anyone else run into this? How did you solve it?

MAGAer13 commented 1 year ago

This happens because your training set contains samples that have no valid ground truth. For example, if the maximum length of the model is set to 256 and your instruction is longer than 256 tokens, the response is truncated away entirely. Such a sample cannot produce a valid loss, because the loss is not computed on the instruction tokens. To fix this, either increase the maximum length or shorten your instructions.
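To check a dataset for this case, something like the sketch below may help. It is not the mPLUG-Owl data pipeline. It assumes samples are already tokenized into an instruction part and a response part, and that instruction positions are masked with a label of -100 (a common convention, assumed here). The names `build_labels`, `has_valid_target`, and `find_unsupervised_samples` are hypothetical.

```python
IGNORE_INDEX = -100  # assumed "do not compute loss here" label value


def build_labels(instruction_ids, response_ids, max_length):
    """Concatenate instruction and response, mask the instruction, then truncate."""
    input_ids = (instruction_ids + response_ids)[:max_length]
    labels = ([IGNORE_INDEX] * len(instruction_ids) + response_ids)[:max_length]
    return input_ids, labels


def has_valid_target(labels):
    """True if at least one position would contribute to the loss."""
    return any(label != IGNORE_INDEX for label in labels)


def find_unsupervised_samples(samples, max_length):
    """Return indices of samples whose labels are fully masked after truncation."""
    bad = []
    for index, (instruction_ids, response_ids) in enumerate(samples):
        _, labels = build_labels(instruction_ids, response_ids, max_length)
        if not has_valid_target(labels):
            bad.append(index)
    return bad
```

Running `find_unsupervised_samples` over the tokenized training set with the configured maximum length lists the samples that would have no supervised tokens. Those can then be shortened or dropped, or the maximum length can be raised.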

hangzeli08 commented 1 year ago

It is not related to length; the length is sufficient. What else could be causing this?

MAGAer13 commented 1 year ago

I mean that only some samples produce nan. The printed loss is just an accumulated statistic, so it does not affect model training; the model skips those samples automatically.
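As a rough illustration of why an accumulated statistic can show nan even when only a few samples are affected: a single nan in a running sum makes the whole average nan. The sketch below is plain Python, not the repository's logging code, and `naive_mean` and `finite_mean` are hypothetical helper names.

```python
import math


def naive_mean(losses):
    """Plain average; one nan entry makes the result nan."""
    return sum(losses) / len(losses)


def finite_mean(losses):
    """Average over finite entries only, plus a count of skipped entries."""
    finite = [value for value in losses if math.isfinite(value)]
    skipped = len(losses) - len(finite)
    mean = sum(finite) / len(finite) if finite else float("nan")
    return mean, skipped
```

Logging the skipped count next to the filtered mean shows how many samples are affected, which can help decide whether the dataset needs cleaning.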