Currently when `weight_bit > 1`, `QFullyConnected` and `QConvolution` quantize weights by first squashing them using `tanh`:
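Concretely, the scheme looks roughly like this (a minimal NumPy sketch of the formula from the DoReFa-Net paper, not the actual code in these layers; `quantize_k` and `dorefa_quantize` are illustrative names):

```python
import numpy as np

def quantize_k(x, k):
    # Uniformly round x in [0, 1] to 2**k evenly spaced levels.
    n = 2 ** k - 1
    return np.round(x * n) / n

def dorefa_quantize(w, k):
    # DoReFa-Net k-bit weight quantization (k > 1):
    # squash with tanh, rescale into [0, 1], round, map back to [-1, 1].
    t = np.tanh(w)
    x = t / (2 * np.max(np.abs(t))) + 0.5
    return 2 * quantize_k(x, k) - 1
```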
This is the quantization scheme described in the DoReFa-Net paper. However, one big problem with this function is that it is not idempotent: if you apply it multiple times, you may get different answers than applying it once. That is, `Q(Q(W)) != Q(W)` (see the demonstration below). This is not typically what you want in a quantization function. Normally you would expect a quantizer to be a no-op when applied to inputs that already sit on quantization targets.

While I can see why you would want to implement the DoReFa-Net function, I think it would make sense to make this behavior optional, and also to implement a simpler clipping-based quantizer that would not suffer from this problem.
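To make the non-idempotence concrete, here is a small demonstration using the sketch above (the specific values and `k = 4` are just one assumed case that triggers it):

```python
w = np.array([0.35, 2.0])
q1 = dorefa_quantize(w, 4)    # -> [0.3333..., 1.0]
q2 = dorefa_quantize(q1, 4)   # -> [0.4666..., 1.0]
print(np.allclose(q1, q2))    # False
```

The culprit is the `tanh` squashing: already-quantized values are not fixed points of the tanh-and-rescale transform, so a second pass can land them in a different bin.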
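For comparison, a clipping-based quantizer along the lines of what I am proposing could look like this (a sketch of one possible design assuming a [-1, 1] weight range, not a finished API):

```python
def clip_quantize(w, k):
    # Clip to [-1, 1], then uniform rounding -- no tanh squashing.
    # Outputs land exactly on the quantization grid, so the function
    # is idempotent: applying it twice gives the same result as once.
    x = (np.clip(w, -1.0, 1.0) + 1) / 2   # map [-1, 1] -> [0, 1]
    return 2 * quantize_k(x, k) - 1

q1 = clip_quantize(w, 4)                      # -> [0.3333..., 1.0]
print(np.allclose(clip_quantize(q1, 4), q1))  # True
```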