liuzechun / Bi-Real-net

Bi-Real Net: Enhancing the Performance of 1-bit CNNs With Improved Representational Capability and Advanced Training Algorithm. In ECCV 2018 and IJCV

PyTorch Implementation: Forward Pass #18


killawhale2 commented 4 years ago

First of all, thank you for sharing the PyTorch implementation, it's wonderful. I've been going over the code and found this line in birealnet.py:

`binary_weights = binary_weights_no_grad.detach() - cliped_weights.detach() + cliped_weights`

and was wondering what its purpose is. My best guess is that it merely allows gradients to exist without actually changing the values of the binary weights, but some clarification would be wonderful!

CuauSuarez commented 4 years ago

The purpose is that the "forward" value is going to be the binarized weight (`binary_weights_no_grad`), while the value used for obtaining the gradient (the "backward" value) is the clamped weight (`cliped_weights`).
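In code, a minimal sketch of the trick (assuming a scalar scaling factor for brevity; the repo computes one per output channel):

```python
import torch

w = torch.randn(8, requires_grad=True)                   # real-valued latent weights
cliped_weights = torch.clamp(w, -1.0, 1.0)               # "backward" value
scaling_factor = w.abs().mean().detach()                 # assumed scalar scale
binary_weights_no_grad = scaling_factor * torch.sign(w)  # "forward" value

# Forward: the two cliped_weights terms cancel, leaving the binarized value.
# Backward: gradients flow only through the non-detached cliped_weights.
binary_weights = binary_weights_no_grad.detach() - cliped_weights.detach() + cliped_weights

binary_weights.sum().backward()
print(torch.allclose(binary_weights, binary_weights_no_grad))  # True
print(w.grad)  # 1 where the clamp is not saturated, 0 elsewhere
```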

The same trick is used in `binary_activation`: the binarized value is used in the forward pass, while the approximation is used in the backward pass.
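A sketch of the same pattern for activations, assuming the piecewise-polynomial ApproxSign from the Bi-Real Net paper as the backward surrogate:

```python
import torch

def binary_activation(x):
    out_forward = torch.sign(x)  # forward value: hard sign
    # ApproxSign: -1 for x < -1, 2x + x^2 on [-1, 0), 2x - x^2 on [0, 1), 1 otherwise
    approx = torch.where(x < -1, torch.full_like(x, -1.0),
             torch.where(x < 0, 2 * x + x * x,
             torch.where(x < 1, 2 * x - x * x,
                         torch.full_like(x, 1.0))))
    # sign(x) in the forward pass, the gradient of ApproxSign in the backward pass
    return out_forward.detach() - approx.detach() + approx
```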

ngunsu commented 4 years ago

`binary_weights_no_grad` is a floating-point tensor, sign(w) * scale. After training is done, how can it be converted to a truly binary weight? I naively tried using just sign(w), without positive results: after applying sign(w) to the trained weights, the network no longer worked.
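For what it's worth, one plausible reading: since `binary_weights_no_grad` is sign(w) * scale, dropping the scale rescales every filter, which would break the network. A hypothetical export sketch (assuming the per-filter mean(|w|) scaling factor from birealnet.py and a made-up conv layer):

```python
import torch
import torch.nn as nn

conv = nn.Conv2d(16, 32, kernel_size=3, bias=False)  # hypothetical trained layer
w = conv.weight.data                                 # shape (out, in, kH, kW)

scaling_factor = w.abs().mean(dim=(1, 2, 3), keepdim=True)  # assumed per-filter scale
binary = torch.sign(w)                               # the genuinely 1-bit {-1, +1} part

# Training used binary * scaling_factor, so sign(w) alone changes every weight
# magnitude. Store `binary` as 1-bit data and apply scaling_factor as a
# per-output-channel multiplier at inference instead of discarding it.
w_eff = binary * scaling_factor                      # reproduces the trained weights
```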