Hi everyone,

I was wondering if anyone else is having problems using dropout. I got this error:
theano.gradient.NullTypeGradError: tensor.grad encountered a NaN. This variable is null because the grad method for input 4 () of the for{cpu,scan_fn} op is mathematically undefined. Depends on a shared variable.
Original code:
return theano_rng.binomial(n=1, p=1-p, size=hid_out.shape, dtype=theano.config.floatX) * hid_out
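For context, here's a stripped-down sketch of the setup around that line (the seed, p value, and variable names are placeholders I'm using for illustration, not the exact code):

import theano
import theano.tensor as T
from theano.tensor.shared_randomstreams import RandomStreams

theano_rng = RandomStreams(1234)  # placeholder seed
p = 0.5                           # drop probability (placeholder)
hid_out = T.matrix('hid_out')     # hidden-layer activations

# Sample a 0/1 mask with keep probability 1-p and apply it elementwise
hid_out_dropped = theano_rng.binomial(n=1, p=1-p, size=hid_out.shape, dtype=theano.config.floatX) * hid_out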
I tried a couple of things.

When I remove the binomial sampling, i.e. just "return hid_out", everything works fine.

I also tried writing a small standalone function with theano_rng, which worked too:
x = T.vector()
bin = theano_rng.binomial(n=1, p=1-p, size=x.shape, dtype=theano.config.floatX)
y = T.dot(bin, x)
dy = theano.grad(y, x)
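For reference, here is that test as a self-contained script, including compiling and evaluating the gradient (seed and p are arbitrary; I renamed bin to mask to avoid shadowing the builtin):

import numpy
import theano
import theano.tensor as T
from theano.tensor.shared_randomstreams import RandomStreams

theano_rng = RandomStreams(1234)  # arbitrary seed
p = 0.5                           # arbitrary drop probability

x = T.vector('x')
mask = theano_rng.binomial(n=1, p=1-p, size=x.shape, dtype=theano.config.floatX)
y = T.dot(mask, x)
dy = theano.grad(y, x)

# Compiles and runs without the NullTypeGradError (no scan involved here)
f = theano.function([x], dy)
print(f(numpy.ones(5, dtype=theano.config.floatX)))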
Finally, I tried using numpy random numbers instead, which worked as well (presumably because the numpy mask is drawn once at graph-construction time and enters the graph as a constant, so tensor.grad never has to differentiate through the sampling):
hid_out *= numpy.float32(numpy.random.binomial([numpy.ones(n_out, dtype=theano.config.floatX)], 1-p))
:confounded:
One more thing - if I'm not wrong, we should adjust for the dropout by rescaling the surviving activations? I've changed the code slightly below; please let me know if this is not appropriate :)
return theano_rng.binomial(n=1, p=1-p, size=hid_out.shape, dtype=theano.config.floatX) * hid_out / (1-p)
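My reasoning for the division (please correct me if this is wrong): the mask is 1 with probability 1-p, so E[mask * hid_out] = (1-p) * hid_out, and dividing by (1-p) restores the original expected activation - I believe this is the "inverted dropout" formulation, which avoids having to rescale at test time. A quick numpy sanity check:

import numpy

p = 0.5
h = numpy.ones(10000, dtype='float32')
mask = numpy.random.binomial(n=1, p=1-p, size=h.shape).astype('float32')
# Mean of the rescaled masked activations should be ~1.0, matching h
print((mask * h / (1-p)).mean())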