As you know, the Sparse VD paper uses both the Local Reparameterization Trick and the Additive Noise Reparameterization to reduce gradient variance. I noticed that the Local Reparameterization Trick from the Variational Dropout paper is not implemented here, unlike the original Sparse VD code. So I think there is a mistake in the forward pass, which needs to change:
```python
def forward(self, input):
    if self.training:
        epsilon = self.weight.new(self.weight.shape).normal_()
        W = self.weight + epsilon * torch.exp(self.log_sigma2 / 2)
    else:
        W = self.weights_clipped
    return F.linear(input, W) + self.bias
```
Should change to:
```python
self.weight = Parameter(torch.Tensor(in_features, out_features))  # I changed the shape of self.weight so input.matmul(self.weight) works

def forward(self, input):
    if self.training:
        # Local reparameterization: sample the pre-activations instead of the weights.
        W_mu = input.matmul(self.weight)
        # log_sigma2 stores log(sigma^2), so exp(log_sigma2) is already the variance;
        # squaring it again would be wrong.
        W_std = torch.sqrt(input.pow(2).matmul(torch.exp(self.log_sigma2)))
        epsilon = W_std.new(W_std.shape).normal_()  # local variable, not self.W_std
        output = W_mu + W_std * epsilon
        output += self.bias
    else:
        W = self.weights_clipped
        output = input.matmul(W) + self.bias
    return output
```
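To illustrate why the two forward passes are equivalent in distribution, here is a minimal NumPy sketch (not the repo's PyTorch code; shapes and names are made up for the example). It assumes, as above, that `log_sigma2` stores the log of the weight variances, and checks that the activation variance `x² @ exp(log_sigma2)` used by the local reparameterization matches the empirical variance of outputs computed with sampled weights:

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical shapes, just for illustration.
batch, in_f, out_f = 4, 5, 3
x = rng.normal(size=(batch, in_f))
theta = rng.normal(size=(in_f, out_f))        # weight means
log_sigma2 = rng.normal(size=(in_f, out_f))   # log of weight variances

# Local reparameterization: sample the activations directly.
mu_act = x @ theta                            # mean of the pre-activations
var_act = (x ** 2) @ np.exp(log_sigma2)       # variance of the pre-activations
eps = rng.normal(size=mu_act.shape)
out = mu_act + np.sqrt(var_act) * eps

# Weight-noise form: W = theta + sigma * eps, then output = x @ W.
# Its empirical output variance should match var_act.
samples = []
for _ in range(20000):
    W = theta + np.sqrt(np.exp(log_sigma2)) * rng.normal(size=theta.shape)
    samples.append(x @ W)
emp_var = np.array(samples).var(axis=0)
print(np.allclose(emp_var, var_act, rtol=0.1))  # expect True, up to Monte Carlo error
```

Both forms give the same output distribution, but sampling per-activation noise decorrelates the noise across the minibatch, which is what reduces the gradient variance.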