KarenUllrich / Tutorial_BayesianCompressionForDL

A tutorial on "Bayesian Compression for Deep Learning" published at NIPS (2017).

Small error in the kl-divergence #1

Closed: manuelhaussmann closed this issue 6 years ago

manuelhaussmann commented 6 years ago

Unless I am missing something, there is a slight error in the kl_divergence() definition in the _ConvNdGroupNJ class.

```python
KLD_element = -self.weight_logvar + 0.5 * (self.weight_logvar.exp().pow(2) + self.weight_mu.pow(2)) - 0.5
```

treats self.weight_logvar as if it were the log standard deviation instead of the log variance. The correct expression should be (as in LinearGroupNJ):

```python
KLD_element = -0.5 * self.weight_logvar + 0.5 * (self.weight_logvar.exp() + self.weight_mu.pow(2)) - 0.5
```
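For reference, the corrected line matches the closed-form KL divergence between the Gaussian posterior N(mu, sigma^2) and a standard normal prior N(0, 1), parametrised by logvar = log(sigma^2): KL = -0.5 * logvar + 0.5 * (exp(logvar) + mu^2) - 0.5. Below is a minimal standalone sketch (not from the repository) that checks the corrected expression against torch.distributions; the tensor names mu and logvar are illustrative stand-ins for the layer parameters.

```python
import torch
from torch.distributions import Normal, kl_divergence

# Illustrative stand-ins for self.weight_mu and self.weight_logvar,
# where logvar holds the log *variance* of the Gaussian posterior.
mu = torch.randn(5)
logvar = torch.randn(5)

# Corrected per-weight KL element, as in LinearGroupNJ.
kld_element = -0.5 * logvar + 0.5 * (logvar.exp() + mu.pow(2)) - 0.5

# Reference: KL(N(mu, sigma^2) || N(0, 1)). Normal expects the standard
# deviation, i.e. exp(0.5 * logvar).
reference = kl_divergence(Normal(mu, (0.5 * logvar).exp()), Normal(0.0, 1.0))

assert torch.allclose(kld_element, reference)
```

The buggy line evaluates to -logvar + 0.5 * (exp(2 * logvar) + mu^2) - 0.5, which equals the same closed form only if logvar stored the log standard deviation, hence the mismatch.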
clouizos commented 6 years ago

Yes, you are indeed correct, thanks for pointing it out! We only tested the dense implementation, so this fell through the cracks. We just pushed the fix.