jonathanmarek1 / binarynet-tensorflow

https://jonathanmarek1.github.io/binarynet-tensorflow/

Unclear weight binarization #9

Open nik123 opened 6 years ago

nik123 commented 6 years ago

I am trying to understand the code and am having trouble with lines 18-30 of bnn.py:

coeff = np.float32(1./np.sqrt(1.5/ (np.prod(shape[:-2]) * (shape[-2] + shape[-1]))))
print(coeff)

tmp = y + coeff * (x - y)
tmp = tf.clip_by_value(tmp, -1.0, 1.0)
tmp = tf.group(x.assign(tmp), y.assign(tmp))
tf.add_to_collection(tf.GraphKeys.UPDATE_OPS, tmp)

x = tf.clip_by_value(x, -1.0, 1.0)
xbin = tf.sign(x) * tf.reduce_mean(tf.abs(x), axis=[0, 1, 2])
x = x + tf.stop_gradient(xbin - x)

As far as I understand, the binarization is implemented according to this paper: https://arxiv.org/abs/1602.02830

In that case, shouldn't it be enough to use just the following piece of code:

x = tf.Variable(init)
x = tf.clip_by_value(x, -1.0, 1.0)
xbin = tf.sign(x) * tf.reduce_mean(tf.abs(x), axis=[0, 1, 2])
x = x + tf.stop_gradient(xbin - x)
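
To make clear what I expect these four lines to compute, here is a minimal plain-Python sketch of the forward value for the weights of a single output channel. The helper names `sign` and `binarize_channel` and the sample weights are made up for illustration, and the flat list stands in for the axes [0, 1, 2] that the mean is taken over. In the TensorFlow version the forward value of x + stop_gradient(xbin - x) is xbin (up to rounding), while the gradient flows to the clipped x as if the binarization were the identity.

```python
def sign(v):
    # Same convention as tf.sign: returns -1, 0, or +1.
    return (v > 0) - (v < 0)


def binarize_channel(weights):
    # Clip each real-valued weight to [-1, 1].
    clipped = [max(-1.0, min(1.0, w)) for w in weights]
    # One scale per output channel: the mean absolute value of the clipped weights.
    scale = sum(abs(w) for w in clipped) / len(clipped)
    # Forward value: the sign of each weight times the shared scale.
    return [sign(w) * scale for w in clipped]


# Hypothetical weights belonging to one output channel.
weights = [0.5, -1.5, 0.25, -0.25]
binary = binarize_channel(weights)
print(binary)
```

Every weight in the channel ends up with the same magnitude, and only its sign differs.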

What is the purpose of the additional coeff multiplier and of the tmp and y variables? It seems like I'm missing something.
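
For reference, this is how I traced the arithmetic of the first block in plain Python for a single scalar weight. The kernel shape is a hypothetical 3x3 convolution with 64 input and 128 output channels, the function name `scaled_clipped_update` is mine, and I am assuming that y holds the weight value from before the optimizer step (the quoted lines do not show where y is created). Under that assumption x - y is the optimizer's update, and coeff would simply rescale it before clipping.

```python
import math

# Hypothetical kernel shape: (height, width, in_channels, out_channels).
shape = (3, 3, 64, 128)

# Same expression as in bnn.py, written with the standard library only.
fan_sum = math.prod(shape[:-2]) * (shape[-2] + shape[-1])
coeff = 1.0 / math.sqrt(1.5 / fan_sum)


def scaled_clipped_update(x_after, y_before, coeff):
    # tmp = y + coeff * (x - y), then clipped to [-1, 1].
    tmp = y_before + coeff * (x_after - y_before)
    return max(-1.0, min(1.0, tmp))


y_before = 0.10  # assumed weight value before the optimizer step
x_after = 0.11   # assumed weight value after a plain optimizer step
new_value = scaled_clipped_update(x_after, y_before, coeff)
print(coeff, new_value)
```

With these numbers coeff is roughly 33.9, so a raw step of 0.01 becomes a step of roughly 0.34, and any larger step quickly saturates at the clipping bounds.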