escorciav closed this issue 7 years ago
Yes, the weights in the first and the last layers are real-valued. We do not binarize them.
Hi, was there a technical reason for that design choice? I suppose you tried a fully binarized network. How much was the performance drop?
Hi, if I want to binarize the first and last layers of the network, should I just change
`for i = 2, #convNodes-1 do`
to
`for i = 1, #convNodes do`
in binarizeConvParms? What about updateBinaryGradWeight?
When I change meancenterConvParms, binarizeConvParms, and clampConvParms to loop over `for i = 1, #convNodes do`, the weights become very small (around 10^-33) and the accuracy is very low, ~10%.
So what is the way to binarize all the layers?
Yes, that's why the first and last layers are not binarized. I got the same results too. From my experiments, the first layer is the most critical: if it is binarized, the network won't learn at all. But the last layer also has a big impact on accuracy when binarized.
Hi @Conchylicultor, thank you very much for your reply. I have tried not binarizing the first layer, and the accuracy improved. However, when I extracted the weights from the trained model, they do not appear to be binarized. Below is an example kernel (from the second conv layer); the values do not look binary.
4.2289e-03 5.0269e-03 -4.6857e-02 3.1741e-02 -3.5040e-02
5.6437e-02 -3.0367e-02 1.8012e-02 4.0737e-02 -7.4001e-02
-4.3729e-02 5.5849e-02 3.2189e-03 -1.8107e-03 -1.7123e-02
5.7789e-03 -8.3973e-03 -9.4293e-03 6.4789e-02 -1.3312e-02
9.9191e-02 -9.3009e-03 -3.0903e-02 3.6413e-02 6.4425e-02
So even if the weights are binarized, there is still a floating-point scale and shift (like batch normalization). There is a different scale/shift value per channel (so 256 for layer 2). If you plot the weights for a single channel, they should all have the same magnitude (e.g. 4.2289e-03, -4.2289e-03, -4.2289e-03, 4.2289e-03, ...).
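To make the per-channel scale concrete, here is a minimal NumPy sketch of XNOR-Net-style weight binarization (this is an illustration of the idea, not the repo's actual Torch code): each output channel keeps one floating scale `alpha = mean(|W|)`, so the stored values look like ±alpha rather than ±1.

```python
import numpy as np

def binarize_weights(W):
    """Sketch of binary weights with a per-output-channel scale.

    W: (out_channels, in_channels, kH, kW) real-valued weights.
    Returns sign(W) * alpha_c, where alpha_c is the mean absolute
    value of channel c's real weights (as in the XNOR-Net paper).
    """
    alpha = np.abs(W).reshape(W.shape[0], -1).mean(axis=1)
    B = np.sign(W)   # +1 / -1 per element
    B[B == 0] = 1    # avoid zeros from sign(0)
    return B * alpha[:, None, None, None]

W = np.random.randn(2, 3, 5, 5).astype(np.float32)
Wb = binarize_weights(W)
# each channel of Wb contains only two values, +alpha_c and -alpha_c,
# so the absolute values within one channel are all identical
print(np.unique(np.abs(Wb[0])).size)  # 1
```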
Thanks for clarifying. From reading the code I understand this point; however, the weights from my model still don't seem to be binarized for some reason.
From what I remember, you have to explicitly call binarizeConvParms,
as here: https://github.com/allenai/XNOR-Net/blob/master/train.lua#L172
I think the models are saved as regular floats but are binarized before the forward pass.
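That would explain the float values in the saved model: the real-valued weights are what gets serialized, and a binarized copy is swapped in only around the forward/backward pass. A hedged sketch of that training scheme in NumPy (hypothetical helper names, not the actual train.lua API):

```python
import numpy as np

def train_step(real_W, forward_backward, lr):
    """One step of the keep-real-weights / binarize-for-forward scheme.

    real_W: real-valued weight array (this is what would be saved).
    forward_backward: callable returning the gradient w.r.t. the
        (binarized) weights -- stands in for the network's fwd/bwd pass.
    """
    saved = real_W.copy()                 # keep the real-valued weights
    alpha = np.abs(real_W).mean()
    real_W[:] = np.sign(real_W) * alpha   # binarize in place (like binarizeConvParms)
    grad = forward_backward(real_W)       # gradients computed with binary weights
    real_W[:] = saved - lr * grad         # but the update is applied to the real weights
    return real_W

# toy usage: constant gradient of 1, learning rate 0.1
W = np.array([0.5, -0.2, 0.1])
train_step(W, lambda w: np.ones_like(w), 0.1)
print(W)  # [0.4, -0.3, 0.0] -- the real weights, not the binary ones
```

So inspecting a saved checkpoint shows floats; the ±alpha pattern only appears after binarizeConvParms has been called.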
Thanks again!
Hi,
according to this line, it seems that the first and last layers of the Binary-Weights-Network are not binary. Am I missing something?
Thanks for providing the code to reproduce your work.
ping @mrastegari