Closed: flipdazed closed this issue 8 years ago
Issue
There may be a sign error in the RBM gradient descent code provided on [deeplearning.net]().
Affected
`rbm.py`, `dbn.py`, `test_rbm.py`, `test_dbn.py`, `train_rbm.py`, `train_dbn.py`
Code
The cost is calculated in `models.rbm.RBM.getCostUpdates` as,

```python
cost = T.mean(self.freeEnergy(self.inputs)) - T.mean(self.freeEnergy(chain_end))
```
with the gradient given by
```python
# We must not compute the gradient through the Gibbs sampling
gparams = T.grad(cost, self.params, consider_constant=[chain_end])
```
but according to the theory this is the negative of the gradient. However, this gradient is still subtracted from `params` in the update loop:

```python
for gparam, param in zip(gparams, self.params):
    updates[param] = param - gparam * T.cast(lr, dtype=theano.config.floatX)
```
See equation (1) on deeplearning.net, where $\ell(\theta, \mathcal{D}) = -\langle \ln p(x) \rangle$.
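For reference, a worked statement of the gradient in question makes the sign concrete. The block below is a sketch of the standard free-energy identity for energy-based models, not a quotation from the tutorial; the notation ($F$ for the free energy, $\theta$ for the parameters, $\tilde{x}_{\mathrm{chain}}$ for the sample at the end of the Gibbs chain, i.e. `chain_end`) is assumed here for illustration.

```latex
% Standard free-energy form of the log-likelihood gradient for an
% energy-based model p(x) = e^{-F(x)} / Z (notation assumed, not quoted
% from the tutorial).
\begin{align}
  -\frac{\partial \ln p(x)}{\partial \theta}
    &= \frac{\partial F(x)}{\partial \theta}
     - \sum_{\tilde{x}} p(\tilde{x})\,\frac{\partial F(\tilde{x})}{\partial \theta} \\
    &\approx \frac{\partial F(x)}{\partial \theta}
     - \frac{\partial F(\tilde{x}_{\mathrm{chain}})}{\partial \theta}.
\end{align}
```

Comparing this approximation with the coded cost `T.mean(freeEnergy(inputs)) - T.mean(freeEnergy(chain_end))`, together with the sign convention fixed by equation (1)'s $\ell(\theta, \mathcal{D}) = -\langle \ln p(x) \rangle$, is what decides whether subtracting `gparam` above is a descent or an ascent step on $\ell$.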
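Separately, it may help to isolate what the update loop itself does, independent of the RBM. The following is a minimal Theano sketch, with a hypothetical quadratic cost standing in for the RBM cost, showing that the `param - lr * grad(cost, param)` pattern performs gradient descent on whatever `cost` evaluates to; that is why the overall sign of `cost` is the crux of this issue.

```python
import numpy as np
import theano
import theano.tensor as T

# Toy check (not the RBM): the update pattern quoted above,
#     param <- param - lr * grad(cost, param),
# performs gradient *descent* on `cost`. The quadratic cost below is a
# hypothetical stand-in, chosen only so the direction of the step is obvious.
lr = 0.1
param = theano.shared(np.asarray(3.0, dtype=theano.config.floatX), name='param')
cost = (param - 1.0) ** 2

gparam = T.grad(cost, param)
train = theano.function(
    inputs=[],
    outputs=cost,
    updates=[(param, param - gparam * T.cast(lr, dtype=theano.config.floatX))],
)

for _ in range(5):
    print(train())  # cost shrinks towards 0 as param moves towards 1
```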