KarenUllrich / Tutorial_BayesianCompressionForDL

A tutorial on "Bayesian Compression for Deep Learning" published at NIPS (2017).
MIT License

Missing factor 0.5 in KL-divergence? #8

Closed jheek closed 5 years ago

jheek commented 6 years ago

The convolution layer seems to miss a factor 0.5 in front of the log variance term in the KL-divergence. The Dense layer does have this factor.
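For reference, here is a sketch of where the 0.5 comes from. It assumes the term in question is the KL divergence between a Gaussian posterior over a weight and a standard normal prior; that assumption is not spelled out in this thread, and the symbols \mu and \sigma^2 are introduced here for illustration only.

```latex
\begin{aligned}
\mathrm{KL}\big(\mathcal{N}(\mu,\sigma^2)\,\|\,\mathcal{N}(0,1)\big)
  &= \mathbb{E}_{q}\big[\log q(w) - \log p(w)\big] \\
  &= \Big(-\tfrac{1}{2}\log(2\pi\sigma^2) - \tfrac{1}{2}\Big)
     - \Big(-\tfrac{1}{2}\log(2\pi) - \tfrac{1}{2}(\sigma^2 + \mu^2)\Big) \\
  &= -\tfrac{1}{2}\log\sigma^2 + \tfrac{1}{2}\big(\sigma^2 + \mu^2\big) - \tfrac{1}{2}
\end{aligned}
```

Under that assumption, the log variance term carries the same 0.5 coefficient as the other terms.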

https://github.com/KarenUllrich/Tutorial_BayesianCompressionForDL/blob/a7d3d83410788b2b8ebf76de948af07fbe4922e2/BayesianLayers.py#L264

aswin-raghavan commented 6 years ago

I believe this is correct. Since log(1/var) = -log(var), the 0.5 should be there unless log(sd) was modeled instead of log(var). EDIT: The term seems to be correct in the fully connected layer, on line 141.
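One way to check this numerically is sketched below. It is not the repository's code: the function names are hypothetical, and it assumes a single Gaussian weight parameterized by its mean and log variance with a standard normal prior. It compares the closed-form KL with and without the 0.5 against a plain trapezoidal integration.

```python
import math


def kl_gauss_std_normal(mu, logvar):
    """Closed-form KL( N(mu, exp(logvar)) || N(0, 1) )."""
    return -0.5 * logvar + 0.5 * (math.exp(logvar) + mu ** 2) - 0.5


def kl_missing_half(mu, logvar):
    """Same expression with the 0.5 dropped from the log variance term
    (the pattern reported in this issue; hypothetical reconstruction)."""
    return -logvar + 0.5 * (math.exp(logvar) + mu ** 2) - 0.5


def kl_numeric(mu, logvar, lo=-12.0, hi=12.0, n=200_000):
    """Trapezoidal estimate of the integral of q(x) * (log q(x) - log p(x))."""
    var = math.exp(logvar)
    h = (hi - lo) / n
    total = 0.0
    for i in range(n + 1):
        x = lo + i * h
        log_q = -0.5 * (math.log(2 * math.pi) + logvar + (x - mu) ** 2 / var)
        log_p = -0.5 * (math.log(2 * math.pi) + x ** 2)
        weight = 0.5 if i in (0, n) else 1.0  # trapezoid end points count half
        total += weight * math.exp(log_q) * (log_q - log_p)
    return total * h


if __name__ == "__main__":
    mu, logvar = 0.3, -2.0
    print("closed form :", kl_gauss_std_normal(mu, logvar))
    print("missing 0.5 :", kl_missing_half(mu, logvar))
    print("numeric     :", kl_numeric(mu, logvar))
```

The two closed forms differ by -0.5 * logvar per weight, so under these assumptions the version without the factor overstates the KL whenever the variance is below 1.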

clouizos commented 5 years ago

Apologies for the very late reply; I just noticed the issues here. Indeed you are correct, there should be a factor of 0.5 in front of the KL. We only tested the fully connected version for this release, so this fell through the cracks. Thanks for noticing!

aswin-raghavan commented 5 years ago

We used the fix posted earlier, but our LeNet model performed worse than the results reported in your paper. See https://arxiv.org/abs/1811.04985

I will try running this again. Aswin
