nndl / solutions

Solutions to the exercises in *Neural Networks and Deep Learning* (《神经网络与深度学习》) — shared for discussion

Exercise 4-9 #15

Open fecet opened 4 years ago

fecet commented 4 years ago

It can mitigate the problem, but not reliably: increasing the learning rate may cause the optimum to be overshot, and it can also trigger gradient explosion.

Why can't we handle vanishing gradient problem in neural nets using large step sizes?
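A minimal sketch (not from the thread; the 20-layer scalar chain is my own toy setup) of why a large step size is a poor fix: in a deep chain of sigmoids, the gradient with respect to early layers is a product of per-layer derivatives, each at most 0.25, so it shrinks geometrically with depth. A learning rate large enough to "undo" that shrinkage would multiply the healthy gradients in later layers by the same huge factor, overshooting the optimum or exploding.

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

# Toy 20-layer chain of scalar sigmoid units: by the chain rule, the
# gradient w.r.t. the input is the product of the per-layer derivatives
# sigmoid'(z) = a * (1 - a) <= 0.25, so it vanishes geometrically.
depth = 20
z = 0.5
grad = 1.0
for _ in range(depth):
    a = sigmoid(z)
    grad *= a * (1 - a)   # local derivative of this layer
    z = a

print(grad)  # vanishingly small, far below 1e-10

# To make this gradient produce a usable update you would need a learning
# rate on the order of 1/grad (~1e13).  Applied to a later layer whose
# local gradient is O(0.1), that same rate gives an update of ~1e12:
# the parameters overshoot or blow up (gradient explosion).
```

The asymmetry is the core issue: a single global learning rate cannot simultaneously compensate the early layers' vanished gradients and keep the later layers' updates stable.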

yangboyd commented 1 year ago

No: the derivative values are 0, and 0 multiplied by any learning rate is still 0.