yaringal / ConcreteDropout

Code for Concrete Dropout as presented in https://arxiv.org/abs/1705.07832
MIT License

weight_regularizer / dropout_regularizer missing factor 2 #8

Closed scenarios closed 5 years ago

scenarios commented 5 years ago

The current implementation of the weight/dropout regularizer does not seem correct for cross-entropy loss. If I understand correctly, the factor of 2 that accompanies the length scale is dropped in this code (you set \lambda = l^2 (1-p) / (\tau N)).

That is fine for MSE loss. But the cross-entropy loss has exactly the same form as the negative log-likelihood version of the Euclidean loss, so I think the factor of 2 cannot be dropped in this case.

Thanks.

yaringal commented 5 years ago

Thanks for the message. The term is absorbed into the constant self.weight_regularizer.
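
A minimal sketch of what "absorbed into the constant" can mean, assuming the penalty has the form quoted in the question, constant * (1 - p) * sum(w^2). The helpers weight_reg_constant and weight_penalty are hypothetical and written only for illustration; they are not functions from this repository. The point is that including or omitting the factor of 2 only rescales the user-supplied constant, which is equivalent to rescaling the prior length scale by 1/sqrt(2).

```python
import math


def weight_reg_constant(length_scale, tau, n_train, include_half=False):
    """Hypothetical helper: map prior length scale, precision tau and
    dataset size N to the constant a user would pass as weight_regularizer."""
    c = length_scale ** 2 / (tau * n_train)
    # The disputed factor: divide by 2 only when explicitly requested.
    return c / 2.0 if include_half else c


def weight_penalty(weights, p, constant):
    """Penalty in the form quoted in the question (an assumption here):
    constant * (1 - p) * sum of squared weights."""
    return constant * (1.0 - p) * sum(w * w for w in weights)


weights = [0.5, -1.0, 2.0]
p = 0.1

c_without = weight_reg_constant(1.0, 1.0, 100)
c_with = weight_reg_constant(1.0, 1.0, 100, include_half=True)

penalty_without = weight_penalty(weights, p, c_without)
penalty_with = weight_penalty(weights, p, c_with)

# Keeping the factor of 2 gives the same constant as dropping it and
# shrinking the length scale by 1/sqrt(2).
c_rescaled = weight_reg_constant(1.0 / math.sqrt(2.0), 1.0, 100)

print(penalty_without, penalty_with, c_with, c_rescaled)
```

Under these assumptions the two conventions differ only in how the chosen constant is interpreted as a length scale, not in the family of objectives that can be expressed.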