mi4da opened this issue 4 weeks ago (status: Open)
{'loss': 0.0, 'learning_rate': 3.3112582781456954e-07, 'epoch': 0.01}
0%| | 3/30200 [01:39<256:54:21, 30.63s/it]
warning: The size of tensor a (15968) must match the size of tensor b (52736) at non-singleton dimension 0, input_embeds[selected].shape=torch.Size([15968, 2048]), vit_embeds.shape=torch.Size([52736, 2048])
Is it because the learning rate is too small?
It feels like vanishing gradients.
I also get loss 0 with the 4B model. Did you solve this?
I also get loss=0 with the 2B model.
Got the same with the 4B model. Does anyone have an idea how to resolve this?
I also had this problem and fixed it by setting the correct --conv_style parameter in the training script. In my case I had to change --conv_style to "internlm2-chat".
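For illustration, a hypothetical launch line (every path and flag here is a placeholder except --conv_style): the conversation template passed to the training script has to match the base LLM's chat format, otherwise the target labels can end up fully masked and the loss collapses to 0.

```shell
# Hypothetical fine-tuning invocation; only --conv_style is the actual fix
# reported above. Adjust script name and model path to your setup.
python finetune.py \
  --model_name_or_path ./InternVL-Chat-V1-5 \
  --conv_style "internlm2-chat"
```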
My fix was to make sure the images don't have an extreme aspect ratio; if they do, zero-pad them to a square, and the loss is no longer 0.
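A minimal sketch of the zero-padding step described above, assuming images are handled as NumPy arrays (the helper name is hypothetical, not from the InternVL codebase):

```python
import numpy as np

def pad_to_square(arr: np.ndarray, fill: int = 0) -> np.ndarray:
    """Zero-pad an H x W x C image array to a square canvas.

    The original content stays in the top-left corner; the bottom and
    right margins are filled with `fill` (0 by default, i.e. black).
    """
    h, w = arr.shape[:2]
    side = max(h, w)
    return np.pad(
        arr,
        ((0, side - h), (0, side - w), (0, 0)),
        mode="constant",
        constant_values=fill,
    )
```

With PIL images you would convert via `np.asarray(img)` first, or paste onto a square `Image.new(...)` canvas instead; the idea is the same either way.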
Are you fine-tuning with LoRA?
No.
Also, when fine-tuning it I find the loss is 0.0???