austinmw opened 2 years ago
Same for me:

```
Before temperature - NLL: 0.058, ECE: 0.002
Optimal temperature: 1.316
After temperature - NLL: 0.061, ECE: 0.010
```
Check whether the model outputs a logits vector or softmax probs @NoSleepDeveloper @austinmw
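For example, a quick sanity check (just a sketch; `model` and `batch` below stand in for your own objects):

```python
import torch

# Rough heuristic: softmax outputs are non-negative and each row sums to 1;
# raw logits generally aren't and don't.
with torch.no_grad():
    out = model(batch)  # placeholders for your own model and input batch

row_sums = out.sum(dim=1)
is_probs = out.min().item() >= 0 and torch.allclose(
    row_sums, torch.ones_like(row_sums), atol=1e-4
)
print("looks like softmax probs" if is_probs else "looks like raw logits")
```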
Same applies for me; the model outputs a logit vector, not softmax probs.
I'm wondering if I could use ECE as the optimization objective rather than NLL, assuming the overhead isn't large? (Given the problem above.)
I don't think ECE is differentiable, bro. It's computed with hard binning, so there's no gradient to optimise.
But that being said, NLL is the metric we should minimise in order to make P(Y = ŷ | p̂ = f(x)) = p̂ [a perfectly calibrated model; you may think of the output probs as following a categorical distribution parametrised by f(x)].
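For intuition, here is a rough sketch of how ECE is usually computed (loosely following the binned estimator from Guo et al.; the function and names here are my own). The argmax and the hard bin-membership tests are exactly what make it non-differentiable:

```python
import torch

def ece(probs, labels, n_bins=15):
    # Expected Calibration Error with equal-width confidence bins.
    # Both the argmax and the hard bin-membership masks below have no
    # useful gradient, so ECE can't be minimised directly by gradient
    # descent the way NLL can.
    confidences, predictions = probs.max(dim=1)
    accuracies = predictions.eq(labels).float()
    bin_edges = torch.linspace(0, 1, n_bins + 1)
    err = torch.zeros(1)
    for lo, hi in zip(bin_edges[:-1], bin_edges[1:]):
        in_bin = (confidences > lo) & (confidences <= hi)
        prop_in_bin = in_bin.float().mean()
        if prop_in_bin.item() > 0:
            # |avg confidence - avg accuracy| in this bin, weighted by bin mass
            gap = (confidences[in_bin].mean() - accuracies[in_bin].mean()).abs()
            err += gap * prop_in_bin
    return err.item()
```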
Try increasing the learning rate or increasing `max_iter`. Your optimisation needs to converge. In the `__init__` function of `ModelWithTemperature`, create an empty list to store the loss, i.e. `self.loss = []`. Then, before `return loss` in the `eval` function, append the loss to the list: `self.loss.append(loss.item())`. After your call to `set_temperature`, plot the values in the `self.loss` list and see whether the loss was minimised. The loss curve should taper off to some value that's roughly constant after convergence.
After the optimization has converged, I still fail to get decreasing ECE.
I wonder, is it really valid to get the optimal temperature by minimizing the NLL loss on the validation set? It seems a little strange to me.
Hi,
I ran this with a very simple 10-layer CNN model I trained on MNIST using PyTorch Lightning, but the ECE ends up increasing instead of decreasing. Any idea why this could be?