junhyukoh / caffe-lstm

LSTM implementation on Caffe

Question about 'lr_mult' #11

Open · letitbehj opened this issue 8 years ago

letitbehj commented 8 years ago

After reading the examples in http://caffe.berkeleyvision.org/tutorial/layers.html, I know that 'lr_mult' is the learning rate multiplier for the weights or the biases. But there are 3 'lr_mult' entries in deep_lstm_short.prototxt. What does the third lr_mult mean? I am still a novice in Caffe, sorry to disturb you.
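
For context, a conventional Caffe layer with two parameter blobs (weight and bias) looks roughly like the sketch below; the layer name, shapes, and fillers are illustrative and not taken from this repo, only the two `lr_mult` entries matter:

```
layer {
  name: "conv1"
  type: "Convolution"
  bottom: "data"
  top: "conv1"
  param { lr_mult: 1 }   # learning-rate multiplier for the filter weights
  param { lr_mult: 2 }   # learning-rate multiplier for the biases
  convolution_param {
    num_output: 20
    kernel_size: 5
    stride: 1
    weight_filler { type: "xavier" }
    bias_filler { type: "constant" }
  }
}
```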

junhyukoh commented 8 years ago

There are three weight blobs in the LSTM layer. The first one corresponds to the input-to-hidden weight, the second one to the hidden-to-hidden weight, and the third one to the bias. So, the third lr_mult is the learning rate multiplier for the bias.
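
For reference, a minimal sketch of such an LSTM layer definition, assuming the parameter order described above; the layer name, type string, and multiplier values are illustrative, and the actual definition lives in deep_lstm_short.prototxt:

```
layer {
  name: "lstm1"          # illustrative name
  type: "Lstm"           # assumed LSTM layer type string for this fork
  bottom: "data"
  top: "lstm1"
  param { lr_mult: 1 }   # 1st blob: input-to-hidden weight
  param { lr_mult: 1 }   # 2nd blob: hidden-to-hidden weight
  param { lr_mult: 2 }   # 3rd blob: bias
  # layer-specific settings (num_output, fillers, etc.) omitted; see deep_lstm_short.prototxt
}
```

Each of the three `param` blocks applies to the blob named in its comment; the multiplier values themselves are placeholders.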

letitbehj commented 8 years ago

Thank U~

erinchen824 commented 6 years ago

I found that there are 3 lstm layers, but only the first one has lr_mult set. Does that mean the latter two lstm layers won't get their params updated? BTW, the lr_mult for a conv layer is 1 for the weight and 2 for the bias, respectively. Here, the lr_mult is 1, 1, 2 for the three blobs mentioned above. Does 1, 1, 2 make sense? Thanks! @junhyukoh