karpathy / neuraltalk

NeuralTalk is a Python+numpy project for learning Multimodal Recurrent Neural Networks that describe images with sentences.

question about dropout implementation #29

Closed: sballas8 closed this issue 9 years ago

sballas8 commented 9 years ago

Hi Andrej,

I have been learning a ton about RNNs and their implementation from looking through your code. I have a (perhaps silly) question about your dropout implementation. You claim that your code creates a mask that drops a fraction, drop_prob, of the units and then scales the remaining units by 1/(1-drop_prob). This doesn't seem correct to me, since you appear to be sampling with np.random.randn, which draws from a normal distribution with mean 0 and variance 1.

For example, if you set drop_prob=1 (and ignore the fact that this makes the scale factor infinite), you should be dropping all the units, but in reality you will be testing the boolean condition np.random.randn(some_shape) < (1-drop_prob). Since np.random.randn gives you negative values half the time (on average), you will only drop half the units (on average).

It seems like you want to be sampling from a uniform distribution from 0 to 1 in order for this to work properly.
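To illustrate, here is a minimal sketch of the inverted dropout I would expect (the function and variable names here are mine, not taken from your code):

```python
import numpy as np

def inverted_dropout(H, drop_prob):
    # Keep each unit with probability (1 - drop_prob). np.random.rand
    # samples uniformly from [0, 1), so the comparison keeps the intended
    # fraction of units on average. Scaling by 1/(1 - drop_prob) preserves
    # the expected activation magnitude at test time.
    mask = (np.random.rand(*H.shape) < (1.0 - drop_prob)) / (1.0 - drop_prob)
    return H * mask
```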

Best, Sam

karpathy commented 9 years ago

Hey Sam, I think I use rand, which should be correct. rand gives you values in [0, 1). Using randn would be a serious bug. Which lines are you referring to?
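For example, a quick check of the difference (illustrative snippet, not code from the repo):

```python
import numpy as np

drop_prob = 0.5
# rand is uniform on [0, 1): the mask keeps ~50% of units, as intended.
print(np.mean(np.random.rand(10000) < (1.0 - drop_prob)))   # ~0.5
# randn is standard normal: the same comparison would keep ~69% of units.
print(np.mean(np.random.randn(10000) < (1.0 - drop_prob)))  # ~0.69
```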

sballas8 commented 9 years ago

Ahh, that explains my confusion. The portion I was looking at was lines 48-59 in imagernn/rnn_generator.py, and you do indeed use rand (not randn). I will make sure to read more carefully in the future.

Best, Sam

karpathy commented 9 years ago

phew, that could have been a bad and hard-to-notice bug! thanks!