ayush29feb opened this issue 7 years ago
Here are some thoughts:
For meancenterConvParms, I think it is used to balance the positive and negative weights. When we approximate the weights with the α defined by Equation (6), we want α to be a good estimate of the magnitude of both the positive and the negative weights. In that case, balancing the positive and negative weights is necessary for them to share the same α.
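To make the balancing concrete, here is a minimal NumPy sketch (mean_center is a hypothetical standalone version of meancenterConvParms, assuming one scaling factor α per output filter as in the paper):

```python
import numpy as np

def mean_center(w):
    """Subtract each filter's mean so positive and negative weights balance out.

    w has shape (out_channels, ...); each row (filter) is centered independently.
    Hypothetical standalone version of the repo's meancenterConvParms.
    """
    flat = w.reshape(w.shape[0], -1)
    return flat - flat.mean(axis=1, keepdims=True)

# After centering, a single alpha per filter (Equation (6): alpha = mean(|w|))
# describes the positive and negative weights equally well.
w = np.array([[0.9, -0.1, 0.2, -0.4]])
centered = mean_center(w)               # each filter now sums to zero
alpha = np.abs(centered).mean(axis=1)   # shared scale for + and - weights
```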
For clampConvParms, they are using the straight-through estimator, which treats the derivative of sign(w) as 1 when |w| ≤ 1 and 0 otherwise, to approximate the derivatives of the weights. In this case, if some weight w has |w| > 1, then its derivative will always be zero and it will never get updated; clamping keeps every weight inside [-1, 1] so it can still be trained.
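A small NumPy sketch of why clamping matters under this gradient approximation (the helper names here are hypothetical; the repo's actual clampConvParms may differ in detail):

```python
import numpy as np

def ste_grad(w):
    """Straight-through estimator: d sign(w)/dw approximated by 1 if |w| <= 1, else 0."""
    return (np.abs(w) <= 1).astype(float)

def clamp(w):
    """Keep weights inside [-1, 1] so the estimated gradient never vanishes for good."""
    return np.clip(w, -1.0, 1.0)

w = np.array([-2.0, -0.5, 0.3, 1.5])
g_before = ste_grad(w)        # zeros where |w| > 1: those weights would freeze
g_after = ste_grad(clamp(w))  # after clamping, every weight can still be updated
```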
Hello,
I am trying to implement BWN in an AlexNet-like network and was a little confused about the following code.
I understand that the binarizeConvParams operation does the approximation explained in the paper. However, the paper doesn't discuss mean centering and clamping. Could anyone explain what the rationale behind them is?
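For reference, the approximation from the paper that binarizeConvParams implements can be sketched as follows (a simplified NumPy stand-in with one α per filter, not the repo's exact code):

```python
import numpy as np

def binarize(w):
    """Approximate W by alpha * sign(W), with alpha = mean(|W|) per filter
    (Equation (6) of the XNOR-Net paper). Simplified stand-in for
    binarizeConvParams, not the repo's exact code."""
    flat = w.reshape(w.shape[0], -1)
    alpha = np.abs(flat).mean(axis=1, keepdims=True)
    return alpha * np.sign(flat)

w = np.array([[0.5, -1.0, 0.25, -0.25]])
w_bin = binarize(w)  # alpha = 0.5, so every entry becomes +/-0.5
```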