agethen / ConvLSTM-for-Caffe


Incorrect gradient backpropagation? #2

Closed: agethen closed this issue 8 years ago

agethen commented 8 years ago

A user reported that the gradients propagated to underlying layers appear to be all zero. Two possible reasons may be:

[screenshot attached: 2016-10-26 22:28:03]
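For reference, one way to check whether gradients reach the layers below the ConvLSTM is to inspect the diff blobs with pycaffe. This is a minimal sketch: the prototxt path and the blob name "x" are hypothetical, the net is assumed to end in a loss layer, and `force_backward: true` must be set in the prototxt so Caffe propagates diffs all the way down to the input blob.

```python
import numpy as np
import caffe

# Hypothetical net definition containing the ConvLSTM and a loss layer;
# the prototxt must set force_backward: true so diffs reach the input.
net = caffe.Net('convlstm_net.prototxt', caffe.TEST)

net.forward()   # forward pass up to the loss
net.backward()  # backpropagate gradients from the loss

# If the reported bug is present, the diff of the input blob "x"
# (and of every layer below the ConvLSTM) is all zero.
print('max |dL/dx| =', np.abs(net.blobs['x'].diff).max())
```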
agethen commented 8 years ago

Narrowing down the problem: the bug seems to occur because of the implicit Split layer that Caffe spawns to clone the input blob "x" for the four gate convolutions "input"/"forget"/"output"/"gate". Adding an additional 1x1 convolutional layer before the implicit Split layer works around the issue (see the sketch below).
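A minimal sketch of this workaround using Caffe's Python NetSpec; the blob names, input shape, channel counts, and kernel sizes are hypothetical:

```python
import caffe
from caffe import layers as L

n = caffe.NetSpec()
n.x = L.Input(shape=dict(dim=[1, 64, 32, 32]))  # hypothetical input shape

# Extra 1x1 convolution inserted before the point where "x" fans out to
# the four gate convolutions, so the implicit Split layer is spawned on
# n.x_pre rather than directly on the input blob.
n.x_pre = L.Convolution(n.x, num_output=64, kernel_size=1)

# All four gate convolutions read from n.x_pre; because a single top is
# consumed by four layers, Caffe inserts an implicit Split to clone it.
n.input  = L.Convolution(n.x_pre, num_output=64, kernel_size=3, pad=1)
n.forget = L.Convolution(n.x_pre, num_output=64, kernel_size=3, pad=1)
n.output = L.Convolution(n.x_pre, num_output=64, kernel_size=3, pad=1)
n.gate   = L.Convolution(n.x_pre, num_output=64, kernel_size=3, pad=1)

print(n.to_proto())
```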

Choices on how to proceed:

agethen commented 8 years ago

Issue fixed by replacing the separate convolutional layers for the three gates and the gate activation with a single convolution that has four times the number of output channels.
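A minimal NetSpec sketch of this shape of fix; the blob names, the per-gate channel count of 64, the input shape, and the kernel size are hypothetical:

```python
import caffe
from caffe import layers as L

hidden = 64  # hypothetical number of channels per gate

n = caffe.NetSpec()
n.x = L.Input(shape=dict(dim=[1, 64, 32, 32]))  # hypothetical input shape

# One convolution computes the pre-activations of the three gates plus
# the gate activation at once: 4 * hidden output channels.
n.gates = L.Convolution(n.x, num_output=4 * hidden, kernel_size=3, pad=1)

# Slice along the channel axis back into the four per-gate blobs.
n.input, n.forget, n.output, n.gate = L.Slice(
    n.gates, ntop=4,
    slice_param=dict(axis=1,
                     slice_point=[hidden, 2 * hidden, 3 * hidden]))

print(n.to_proto())
```

With the fused convolution, the input blob "x" feeds exactly one layer, so Caffe no longer needs to spawn an implicit Split on it, presumably sidestepping the faulty gradient path described above.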