Open JobNam opened 1 year ago
Hey, thanks for such excellent work. I'm trying to replicate your work; can I ask you some details about the model training?

1. I followed the learning rate of 0.0001 set in the article and used the MSE loss function, but the model's loss diverged from the very beginning. Can the model eventually converge after a period of training? I have trained for 28 epochs and it is still diverging.
2. I tried reducing the learning rate to a suitable value and the model began to converge, but after one or two epochs it began to diverge again. Is this a normal phenomenon?
3. I tried adding a sigmoid activation after the output and, with the learning rate adjusted to a suitable value, trained with binary cross-entropy loss. The model converges at first, but after a few epochs it diverges continuously, as in problem 2. This may be a problem with my parameter settings; I will keep adjusting and try a few more times.
4. When my model does converge at the beginning, my confidence-map output is always a value close to 0, and no value close to 1 ever appears. Did you encounter this during your training? And what do the confidence maps look like during your training process?

Thanks and best regards.
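One likely cause of the instability described in item 3 above is applying a sigmoid and then binary cross-entropy as two separate steps, which can underflow in the log for extreme logits. PyTorch's `BCEWithLogitsLoss` fuses the two with the log-sum-exp trick and is the usual remedy. A minimal sketch (the tensor shapes are placeholders, not taken from this thread):

```python
import torch
import torch.nn as nn

logits = torch.randn(2, 3, 128, 128)   # raw model output, no sigmoid applied
target = torch.rand(2, 3, 128, 128)    # ground-truth confidence maps in [0, 1]

# Variant from item 3: explicit sigmoid followed by BCE (can be unstable).
separate = nn.BCELoss()(torch.sigmoid(logits), target)

# Fused, numerically stable variant: pass raw logits directly.
fused = nn.BCEWithLogitsLoss()(logits, target)

print(separate.item(), fused.item())  # near-identical in normal logit ranges
```

For well-scaled logits the two losses agree closely; the fused version only differs where the separate one would have underflowed.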
I have solved the above problems. I changed the batch_size from 2 to 4, kept the learning rate at 0.0001, and trained the model on a 4090 server. The model converged and showed preliminary prediction ability. As for why the predicted map was always a near-zero black background, I think the model had overfit and reached a local optimum. Thanks again to the author for his contribution!
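For reference, the configuration reported above (batch size 4, learning rate 0.0001, MSE loss on the confidence maps) can be sketched as a minimal training step. The one-layer model here is a hypothetical stand-in for T-RODNet, and the tensor shapes are illustrative assumptions, not taken from the paper:

```python
import torch
import torch.nn as nn

# Placeholder for the real network: any module mapping radar input channels
# to per-class confidence maps fits this sketch.
model = nn.Conv2d(2, 3, kernel_size=3, padding=1)

# Settings reported to converge in this thread: lr = 1e-4, MSE loss.
optimizer = torch.optim.Adam(model.parameters(), lr=1e-4)
criterion = nn.MSELoss()

def train_step(radar_frames, gt_conf_maps):
    """Run one optimization step on a batch and return the scalar loss."""
    optimizer.zero_grad()
    pred = model(radar_frames)            # predicted confidence maps
    loss = criterion(pred, gt_conf_maps)  # MSE against ground-truth maps
    loss.backward()
    optimizer.step()
    return loss.item()

# Dummy batch: batch_size 4, 2 input channels, 128x128 range-azimuth maps.
x = torch.randn(4, 2, 128, 128)
y = torch.rand(4, 3, 128, 128)
print(train_step(x, y))
```

Increasing the batch size mainly smooths the gradient estimate, which is consistent with the divergence at batch size 2 reported here.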
Thanks for your solution! In fact, I did not know how to solve it. All in all, I think that deep learning is metaphysics.

(Sent by email reply on July 4, 2023, to: Re: [Zhuanglong2/T-RODNet] Some training details (Issue #8))
Hi, I saw your explanation of the paper on Bilibili and thought it was very good. I'm a beginner in research; could you share your contact information so I could pay to ask you some questions? You can add me at zyhyydi (my WeChat ID). Thank you!