ruizhoud / DistributionLoss

Source code for paper "Regularizing Activation Distribution for Training Binarized Deep Networks"

One question about Line 247 in custom_main_binary_imagenet.py #4

Closed · haichaoyu closed this issue 5 years ago

haichaoyu commented 5 years ago

Hi Ruizhou,

This paper on regularizing activation distributions is quite interesting!

I have a question about the code at that line. After this line is executed, all the weights are forced into [-1, 1]. If so, how can the gradient mismatch be mitigated or solved?
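For reference, my understanding is that the line does the standard BinaryNet-style clipping of the latent full-precision weights after each optimizer step. A minimal sketch of what I mean (my assumption, not the exact code at Line 247):

```python
import torch

# Sketch of latent-weight clipping after an update (assumed, not the repo's exact code).
w = torch.nn.Parameter(torch.randn(4, 4))   # full-precision latent weight
opt = torch.optim.SGD([w], lr=0.1)

loss = (w ** 2).sum()        # placeholder loss just to produce a gradient
opt.zero_grad()
loss.backward()
opt.step()

w.data.clamp_(-1, 1)         # force the latent real-valued weights back into [-1, 1]
```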

Haichao

ruizhoud commented 5 years ago

Hi Haichao,

BNNs quantize (1) weights and (2) activations to +1 or -1. This paper mainly tackles the three problems induced by activation quantization. The line of code you point to concerns weight quantization, and it simply follows previous work.

P.S. My feeling is that although the HardTanh approximation for weight quantization also suffers from gradient mismatch, it is not as severe as for activation quantization, mainly because the mismatch for weight quantization does not accumulate across layers. It is definitely an interesting problem to investigate, though.
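To make the HardTanh approximation concrete, here is a rough sketch of the straight-through estimator I have in mind (an illustration of the general technique, not the exact code in this repository):

```python
import torch

class BinarizeSTE(torch.autograd.Function):
    """Sign in the forward pass, HardTanh-style surrogate gradient in the
    backward pass. A minimal sketch of the standard estimator, not this
    repository's exact implementation."""

    @staticmethod
    def forward(ctx, x):
        ctx.save_for_backward(x)
        return torch.sign(x)                 # forward: hard +/-1 quantization

    @staticmethod
    def backward(ctx, grad_output):
        (x,) = ctx.saved_tensors
        # Backward: pretend the forward was HardTanh, i.e. pass gradients
        # only where |x| <= 1. The gap between this surrogate and the true
        # (zero almost everywhere) derivative of sign() is the gradient
        # mismatch discussed above.
        return grad_output * (x.abs() <= 1).float()

# Usage example: gradients flow only through inputs with magnitude <= 1.
x = torch.randn(5, requires_grad=True)
BinarizeSTE.apply(x).sum().backward()
print(x.grad)   # 1 where |x| <= 1, 0 elsewhere
```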

Thanks, Ruizhou

haichaoyu commented 5 years ago

@ruizhoud Got it. Thanks.