Status: Open. vivkul opened this issue 5 years ago.
Hi, since dropout is used, shouldn't we call original_model.eval() before temperature scaling?
The temperature is fit on logits obtained from a single forward pass (rather than a new forward pass for each LBFGS step), so a pass in training mode would fix one random dropout mask into those logits. Thanks!
Wondering the same thing. The code might not effectively use dropout, since the rate was set to 0, but it obviously uses batchnorm.
Any answer on this yet?
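For anyone landing here, a minimal sketch of what the question proposes: switch the model to eval mode, collect the logits once, then fit the temperature on those fixed logits with LBFGS. The small network, the random validation data, and the optimizer settings below are all assumptions made up for illustration; they stand in for `original_model` and the real validation loader and are not taken from the repository's code.

```python
import torch
import torch.nn as nn

torch.manual_seed(0)

# Hypothetical stand-in for original_model: it has both dropout and batchnorm,
# the two layers whose behavior differs between train() and eval().
original_model = nn.Sequential(
    nn.Linear(8, 16),
    nn.BatchNorm1d(16),
    nn.ReLU(),
    nn.Dropout(p=0.5),
    nn.Linear(16, 3),
)

# Made-up validation batch (assumption: 32 samples, 8 features, 3 classes).
inputs = torch.randn(32, 8)
labels = torch.randint(0, 3, (32,))

# eval() disables dropout and makes batchnorm use its running statistics
# instead of batch statistics (and stops it from updating them).
original_model.eval()
with torch.no_grad():
    logits = original_model(inputs)
    logits_again = original_model(inputs)

# In eval mode, repeated forward passes give identical logits.
deterministic = torch.equal(logits, logits_again)

# Fit a single scalar temperature on the fixed logits.
# The initial value and LBFGS settings are illustrative assumptions.
temperature = nn.Parameter(torch.ones(1) * 1.5)
nll = nn.CrossEntropyLoss()
optimizer = torch.optim.LBFGS([temperature], lr=0.01, max_iter=50)

def closure():
    # LBFGS calls this repeatedly; the logits are reused, not recomputed.
    optimizer.zero_grad()
    loss = nll(logits / temperature, labels)
    loss.backward()
    return loss

optimizer.step(closure)
print("deterministic logits:", deterministic)
print("fitted temperature:", temperature.item())
```

Without the `original_model.eval()` call, the single forward pass would use one random dropout mask and batch statistics, and the batchnorm running averages would also be updated by the validation data.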