yajiemiao / pdnn

PDNN: A Python Toolkit for Deep Learning. http://www.cs.cmu.edu/~ymiao/pdnntk.html
Apache License 2.0

Dropout gradient and adjustment factor #26


ghost commented 8 years ago

Hi everyone,

I was wondering if anyone else is having problems using dropout. I get this error:

theano.gradient.NullTypeGradError: tensor.grad encountered a NaN. This variable is null because the grad method for input 4 () of the for{cpu,scan_fn} op is mathematically undefined. Depends on a shared variable.

Original code:

```python
return theano_rng.binomial(n=1, p=1-p, size=hid_out.shape, dtype=theano.config.floatX) * hid_out
```

Tried a couple of things (a possible workaround sketch follows the list):

  1. When I remove the binomial sampling, i.e. `return hid_out`, everything works fine.
  2. Tried writing a small standalone function with theano_rng, which worked too:

     ```python
     x = T.vector()
     bin = theano_rng.binomial(n=1, p=1-p, size=x.shape, dtype=theano.config.floatX)
     y = T.dot(bin, x)
     dy = theano.grad(y, x)
     ```

  3. Tried using numpy random numbers, which worked as well:

     ```python
     hid_out *= numpy.float32(numpy.random.binomial([numpy.ones(n_out, dtype=theano.config.floatX)], 1 - p))
     ```

:confounded:
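One workaround I'm considering (a minimal sketch only; `dropout_layer` is a hypothetical helper, not PDNN's actual code) is to block the gradient through the sampled mask with `theano.gradient.disconnected_grad`, since the mask has no well-defined derivative anyway, and to sample from `MRG_RandomStreams`, which tends to behave better inside scan than the shared-variable `RandomStreams`:

```python
import theano
from theano.sandbox.rng_mrg import MRG_RandomStreams

theano_rng = MRG_RandomStreams(1234)

def dropout_layer(hid_out, p):
    # Sample a 0/1 keep mask with keep probability 1 - p.
    mask = theano_rng.binomial(n=1, p=1 - p, size=hid_out.shape,
                               dtype=theano.config.floatX)
    # Treat the mask as a constant w.r.t. the gradient, so tensor.grad
    # never tries to differentiate through the random state.
    mask = theano.gradient.disconnected_grad(mask)
    return mask * hid_out
```

I can't say for sure this fixes the scan error above, so corrections welcome.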

One more thing: if I'm not mistaken, shouldn't we rescale the activations to compensate for dropout? I've changed the code slightly below. Please let me know if this is not appropriate :)

```python
return theano_rng.binomial(n=1, p=1-p, size=hid_out.shape, dtype=theano.config.floatX) * hid_out / (1 - p)
```
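For context, here is the full train/test logic I have in mind with that `1/(1-p)` factor ("inverted dropout"; a minimal sketch, with a hypothetical `is_train` flag rather than PDNN's actual interface):

```python
import theano
from theano.sandbox.rng_mrg import MRG_RandomStreams

theano_rng = MRG_RandomStreams(1234)

def dropout(hid_out, p, is_train):
    if not is_train:
        # Test time: no mask and, under this convention, no rescaling.
        return hid_out
    mask = theano_rng.binomial(n=1, p=1 - p, size=hid_out.shape,
                               dtype=theano.config.floatX)
    # Divide by the keep probability so the expected training-time
    # activation matches the plain test-time forward pass.
    return mask * hid_out / (1 - p)
```

The other common convention leaves training activations unscaled and instead scales the weights by `1 - p` at test time; either is fine as long as exactly one of the two adjustments is applied.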

sahiliitm commented 8 years ago

Hi Oogi, I am facing a similar issue. I have described it here. Please let me know if you figured out the problem.

Thanks