Open yxy1995123 opened 2 years ago
I have the same question. I am confused about why the paper says the sign function is used, while the implementation uses hardtanh.
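Both the paper and the code use the sign function for forward propagation and hardtanh for the backward pass, to preserve the gradient. The derivative of sign is zero almost everywhere, so it would block all gradients, whereas the derivative of hardtanh is 1 for inputs in [-1, 1] and 0 outside that range. This is the straight-through estimator.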
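To make the forward/backward split concrete, here is a minimal framework-free sketch in plain Python. The function names `binarize_forward` and `binarize_backward` are hypothetical and not taken from the repository, and mapping an input of exactly 0 to +1 is an assumption of this sketch; the actual code may handle these details differently.

```python
def binarize_forward(x):
    """Forward pass: sign(x), with 0 mapped to +1 (an assumption of this sketch)."""
    return [1.0 if v >= 0 else -1.0 for v in x]


def binarize_backward(x, grad_output):
    """Backward pass: use the derivative of hardtanh in place of that of sign.

    hardtanh(x) = clip(x, -1, 1) has derivative 1 for |x| <= 1 and 0 elsewhere,
    so the upstream gradient passes straight through where |x| <= 1
    and is zeroed where the input is saturated.
    """
    return [g if abs(v) <= 1.0 else 0.0 for v, g in zip(x, grad_output)]


x = [-2.0, -0.5, 0.0, 0.3, 1.7]
upstream = [0.1, 0.2, 0.3, 0.4, 0.5]  # gradient arriving from the next layer

y = binarize_forward(x)
grad_x = binarize_backward(x, upstream)

print(y)       # [-1.0, -1.0, 1.0, 1.0, 1.0]
print(grad_x)  # [0.0, 0.2, 0.3, 0.4, 0.0]
```

The outputs are strictly -1 or +1, yet the gradient still reaches every input whose magnitude is at most 1, which is what lets the network train.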
I wonder whether this code can be used to output only 0 or 1 for the weights? How