yaysummeriscoming / BinaryNet_and_XNORNet

Keras implementations of BinaryNet and XNORNet

Binarization Q's #3

Open tessselllation opened 5 years ago

tessselllation commented 5 years ago

Thanks for this amazing code, @yaysummeriscoming!! So your passthroughSign function results in

y = 1 for x > 0, y = 0 for x = 0, y = -1 for x < 0

But the BinaryNet paper specifies the binarization function to be

y = 1 for x >= 0, y = -1 for x < 0

Yet I notice all weights/activations are still only -1 or 1. Is this because the probability of obtaining an exact zero during training is negligible? Or am I missing a step somewhere?
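For concreteness, here's a tiny NumPy sketch of the two behaviours I mean (my own illustration, not the repo code; `binarize` is just a name I made up):

```python
import numpy as np

def binarize(x):
    # Paper convention: +1 for x >= 0, -1 for x < 0 (exact zeros map to +1).
    return np.where(x >= 0, 1.0, -1.0).astype(x.dtype)

w = np.random.randn(4, 4).astype(np.float32)
print(np.unique(binarize(w)))     # [-1.  1.]
print(np.sign(np.float32(0.0)))   # 0.0 -- the edge case a plain sign() would hit
```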

Also, where can we see that backpropagation through the binarization layers uses the HardTanh function?

yaysummeriscoming commented 5 years ago

Thanks :) You're correct about the passthroughSign function, that got me too. Remember that only the weight * activation multiplies are binary; the summation is done in higher precision.
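Rough NumPy sketch of what I mean (just an illustration, not the repo code):

```python
import numpy as np

# Binarized activations and weights: every element is -1.0 or +1.0.
rng = np.random.default_rng(0)
x_bin = np.where(rng.standard_normal(256) >= 0, 1.0, -1.0).astype(np.float32)
w_bin = np.where(rng.standard_normal(256) >= 0, 1.0, -1.0).astype(np.float32)

products = x_bin * w_bin              # each multiply result is -1.0 or +1.0
acc = products.sum(dtype=np.float32)  # the accumulation stays in float32
print(acc)                            # an integer-valued float in [-256, 256]
```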

Regarding HardTanh - this is the same as clipped passthrough. You can see it here: https://github.com/yaysummeriscoming/BinaryNet_and_XNORNet/blob/master/CustomOps/tensorflowOps.py

Specifically, the functions passthroughSignTF & clipped_passthrough_grad. I was still getting familiar with TF & Keras at that stage; in hindsight, the stop_gradient() trick would be prettier.
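For example, something along these lines (a TF2-eager sketch of the stop_gradient() version, not what's actually in tensorflowOps.py):

```python
import tensorflow as tf

def binarize_ste(x):
    """Sign forward pass with a HardTanh (clipped identity) backward pass.

    The forward value is sign(x), but because (binary - clipped) is wrapped
    in stop_gradient, the gradient flows through `clipped` instead:
    d/dx = 1 for |x| <= 1 and 0 otherwise.
    """
    clipped = tf.clip_by_value(x, -1.0, 1.0)  # HardTanh
    binary = tf.where(x >= 0, tf.ones_like(x), -tf.ones_like(x))
    return clipped + tf.stop_gradient(binary - clipped)

x = tf.Variable([-2.0, -0.3, 0.0, 0.7, 1.5])
with tf.GradientTape() as tape:
    y = binarize_ste(x)
print(y.numpy())                    # [-1. -1.  1.  1.  1.]
print(tape.gradient(y, x).numpy())  # [ 0.  1.  1.  1.  0.]
```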

tessselllation commented 5 years ago

Thanks for your response @yaysummeriscoming! Silly me, I totally missed how you bring clipped_passthrough_grad into def passthroughTanhTF(x). This method seems great to me, but I'm still very new to Keras and TensorFlow.

I've implemented the bipolar regulariser from "How to Train a Compact Binary Neural Network with High Accuracy?" (https://www.aaai.org/ocs/index.php/AAAI/AAAI17/paper/download/14619/14454), which I noticed you discussed under issue #1. It was actually pretty easy to add and the effect was quite noticeable.
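In case it helps anyone else, this is roughly the shape of what I added (a sketch only; the (1 - w^2)^2 penalty form and the strength value are my reading/assumption, so double-check against the paper):

```python
import tensorflow as tf
from tensorflow import keras

class BipolarRegularizer(keras.regularizers.Regularizer):
    """Pushes the real-valued shadow weights toward +/-1 instead of toward 0.

    The penalty (1 - w^2)^2 is zero at w = +/-1 and grows as weights drift
    toward 0 or beyond +/-1. Form and scaling are my assumption - verify
    against the AAAI-17 paper.
    """
    def __init__(self, strength=1e-4):
        self.strength = strength

    def __call__(self, w):
        return self.strength * tf.reduce_sum(tf.square(1.0 - tf.square(w)))

    def get_config(self):
        return {'strength': self.strength}

# Example: attach it to a layer's real-valued kernel.
layer = keras.layers.Dense(128, kernel_regularizer=BipolarRegularizer(1e-4))
```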

Thanks again for the amazing code!