etienne87 opened 6 years ago
I think you are right. We can see that the network learns, because the batchnorm parameters do change. But that, of course, is not enough to reach high accuracy.
In the file main_binary.py, line 252, there is the loss.backward(), so I think the backward pass is already being done. Also, are you really sure that the weights are not changing? Maybe you should try a higher learning rate to make sure the code is OK; I think the gradient may be too small to change the signs of the weights. @etienne87 @jafermarq @itayhubara
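One hedged way to test that hypothesis is to compare the step size the optimizer would take against how far each weight sits from zero. This is a plain-SGD back-of-the-envelope sketch; `model` and `lr` stand in for whatever the training script actually defines:

```python
# Hypothetical check, run right after loss.backward(): if the largest
# possible SGD step (|grad| * lr) never reaches zero from the smallest
# |weight|, no binarized sign can flip and the visible weights will
# look frozen even though training is working.
for name, p in model.named_parameters():
    if p.grad is not None:
        step = (p.grad.abs().max() * lr).item()
        dist = p.data.abs().min().item()
        print(f"{name}: max step {step:.2e} vs. min |w| {dist:.2e}")
```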
I'm printing the weights of the network and they are not changing. It makes sense, since all the binarization is happening only on the data (not in the graph), so the weights will not update.
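For context, here is a minimal sketch of the BinaryNet-style training pattern this kind of code typically follows. The names (`binarize`, the `org` attribute) follow the common convention from the BinaryNet papers and are assumptions about this repo, not its exact API. The point is that operating on `.data` is deliberate: real-valued latent weights receive the gradient updates, while the forward pass only ever sees +/-1 values.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

def binarize(t):
    # hard sign; treat 0 as +1 so every weight is exactly +/-1
    return torch.where(t >= 0, torch.ones_like(t), -torch.ones_like(t))

class BinarizeLinear(nn.Linear):
    def forward(self, x):
        # stash a real-valued ("latent") copy of the weights once
        if not hasattr(self.weight, "org"):
            self.weight.org = self.weight.data.clone()
        # overwrite .data with binarized values for the forward pass;
        # autograd still routes gradients to self.weight itself
        self.weight.data = binarize(self.weight.org)
        return F.linear(x, self.weight, self.bias)

model = nn.Sequential(BinarizeLinear(10, 2))
opt = torch.optim.Adam(model.parameters(), lr=0.01)

x, y = torch.randn(32, 10), torch.randint(0, 2, (32,))
for _ in range(100):
    loss = F.cross_entropy(model(x), y)
    opt.zero_grad()
    loss.backward()
    # restore the latent weights before the update, so the optimizer
    # steps on real values rather than on +/-1
    for p in model.parameters():
        if hasattr(p, "org"):
            p.data.copy_(p.org)
    opt.step()
    # clip to [-1, 1] and re-stash; the latent weights DO change every
    # step, but the binarized signs only flip once a latent weight
    # crosses zero, which is why printing the visible (binarized)
    # weights can look like nothing is training
    for p in model.parameters():
        if hasattr(p, "org"):
            p.org.copy_(p.data.clamp_(-1, 1))
```

So if the script follows this pattern, printing `p.data` during training shows only the binarized values; the quantities that actually move each step are the latent copies (`p.org` in this sketch).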
How can this code train a network from scratch with binarization?