vlfeat / matconvnet

MatConvNet: CNNs for MATLAB

Extra comments in vl_nndropout on reweighting outputs #208

Closed · jrruijli closed this issue 9 years ago

jrruijli commented 9 years ago

Hi,

Great code! Very readable and flexible.

I think it would be good to add a sentence to the documentation of vl_nndropout (especially since dropout is also not discussed in the tutorial). It was not immediately obvious to me that the average decrease in output caused by the dropout layer is already compensated for inside the dropout function itself. So you may want to add a line like:

"Note that in the original paper on dropout, at test time the network weights for the dropout layers are scaled down to compensate for having all the neurons active. In this implementation the dropout function itself already does this compensation during training. So at test time no alterations are required."

Jasper

vedaldi commented 9 years ago

Hi, well noted. I added this to the devel branch.


caoba1 commented 8 years ago

Revisiting this issue... It is also worth mentioning that in the original paper the "dropout rate" is defined differently than in the implementation: p in the JMLR paper corresponds to 1 - p in the implementation.

JiaxYau commented 7 years ago

In the comments of vl_nndropout.m, rate is said to be the probability that a variable is retained. However, in the implementation the activation in backward mode is scaled by 1/(1-rate), which suggests that rate is the probability that a variable is zeroed. I am a bit confused; which is right?

caoba1 commented 7 years ago

Exactly: in the JMLR paper the rate is the probability that the variable (feature "pixel") is retained. In the MatConvNet implementation it is the probability that the variable is "dropped", i.e. set to zero.
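
A quick numerical check of the two conventions (a sketch; the variable names are mine, not taken from the code):

```matlab
p_retain = 0.8;              % JMLR paper's p: probability a unit is RETAINED
rate     = 1 - p_retain;     % MatConvNet's rate: probability a unit is DROPPED

% both describe the same mask statistics:
keep = rand(1, 1e6) >= rate; % a unit survives with probability 1 - rate
mean(keep)                   % approximately 0.8, i.e. p_retain
```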