chenxie95 / deeplearning_course_sjtu


Language model codebase error #31

Open wonderingtom opened 7 months ago

wonderingtom commented 7 months ago

Training the codebase with RNN_RELU makes the loss explode (it becomes NaN).

chenxie95 commented 7 months ago

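You could try switching to sigmoid and see whether that helps. ReLU can cause the gradients to become too large. Alternatively, try adjusting the learning rate.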
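To illustrate why an unbounded activation such as ReLU can blow up in a recurrent network, here is a toy sketch in plain Python. It is not the course codebase: the scalar recurrence `h_t = act(w * h_{t-1} + x)` and the values `w=1.5`, `x=1.0`, `steps=50` are assumptions chosen only for illustration.

```python
import math

# Each activation returns (value, derivative) at the pre-activation z.
def relu(z):
    return max(0.0, z), (1.0 if z > 0 else 0.0)

def sigmoid(z):
    s = 1.0 / (1.0 + math.exp(-z))
    return s, s * (1.0 - s)

def tanh(z):
    t = math.tanh(z)
    return t, 1.0 - t * t

def unroll(act, w=1.5, x=1.0, steps=50):
    """Iterate h_t = act(w * h_{t-1} + x) starting from h_0 = 0.

    Returns the final hidden state and d h_T / d h_0, which is the
    product of w * act'(z_t) over all time steps (chain rule).
    """
    h, grad = 0.0, 1.0
    for _ in range(steps):
        h, d = act(w * h + x)
        grad *= w * d
    return h, grad

relu_h, relu_grad = unroll(relu)
sigmoid_h, sigmoid_grad = unroll(sigmoid)
tanh_h, tanh_grad = unroll(tanh)

for name, h, g in [("relu", relu_h, relu_grad),
                   ("sigmoid", sigmoid_h, sigmoid_grad),
                   ("tanh", tanh_h, tanh_grad)]:
    print(f"{name:8s} h_T = {h:.3e}   dh_T/dh_0 = {g:.3e}")
```

With a recurrent weight above 1, the ReLU state and its gradient grow geometrically with the number of time steps, which in a real model can end in an overflow and a NaN loss. The bounded activations keep the state in a fixed range, but their gradient shrinks toward zero instead, so the switch trades one problem for the other. A smaller learning rate does not change this growth; it only reduces how far each update moves the weights.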