dfdx / Boltzmann.jl

Restricted Boltzmann Machines in Julia

CRBM momentum only applied to W #46

Open davidbp opened 6 years ago

davidbp commented 6 years ago

It looks like this function applies momentum only to dW, the weight gradient, and not to the gradients of the other parameters (dA, dB, db, dc). Is that what you intended?

```julia
function grad_apply_momentum!(crbm::ConditionalRBM{T}, X::Mat{T},
                              dtheta::Tuple, ctx::Dict) where T
    dW, dA, dB, db, dc = dtheta
    momentum = @get(ctx, :momentum, 0.9)
    dW_prev = @get_array(ctx, :dW_prev, size(dW), zeros(T, size(dW)))
    # same as: dW += momentum * dW_prev
    axpy!(momentum, dW_prev, dW)
end
```

dfdx commented 6 years ago

I think momentum for the other parameters didn't improve results for my use case at the time, so I decided not to include an untested feature. However, if it improves things in your case, it makes sense to update the code.

Out of curiosity, what are you using RBMs for? I thought everybody had moved to variational autoencoders, which are much faster to train and more numerically stable. Although I'm not sure there's a direct counterpart to the conditional RBM among VAEs.

rofinn commented 6 years ago

Yeah, I'm pretty sure I was always setting momentum to 0.0 in my experiments... which is probably why I forgot to implement that for the autoregressive weights :) I'm not sure how much it'll help, but it's probably worth adding that in for consistency.
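
For reference, here is a minimal sketch of what applying momentum to every gradient could look like. It is standalone and does not use Boltzmann.jl internals: the names `apply_momentum_all!`, `store_gradients!`, and the `:dtheta_prev` context key are hypothetical, and the thread does not show how the library saves `dW_prev` between steps, so that part is an assumption too.

```julia
using LinearAlgebra: axpy!

# Hypothetical helper (not part of Boltzmann.jl): add momentum * previous
# gradient to every gradient in `dtheta`, not only to dW.
function apply_momentum_all!(dtheta::Tuple, ctx::Dict; default_momentum=0.9)
    momentum = get(ctx, :momentum, default_momentum)
    # On the first call, create zero buffers with the same shapes as the gradients.
    prev = get!(ctx, :dtheta_prev) do
        map(zero, dtheta)
    end
    for (d, d_prev) in zip(dtheta, prev)
        # same as: d .+= momentum .* d_prev
        axpy!(momentum, d_prev, d)
    end
    return dtheta
end

# Hypothetical companion: remember the current gradients for the next step.
# Assumes apply_momentum_all! has already created ctx[:dtheta_prev].
function store_gradients!(dtheta::Tuple, ctx::Dict)
    for (d, d_prev) in zip(dtheta, ctx[:dtheta_prev])
        copyto!(d_prev, d)
    end
    return ctx
end
```

With `ctx = Dict{Symbol,Any}(:momentum => 0.5)`, calling `apply_momentum_all!(dtheta, ctx)` followed by `store_gradients!(dtheta, ctx)` on each step would treat dW, dA, dB, db and dc the same way. Whether that actually helps training is an open question, as noted above.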