gpleiss / temperature_scaling

A simple way to calibrate your neural network.
MIT License

How to make the ECE figure #4

Closed: zihaolucky closed this issue 6 years ago

zihaolucky commented 6 years ago

Hi @gpleiss, could you share some simple code for making a figure in this style?

[image: ts]

Thanks!

gpleiss commented 6 years ago

Sure thing! Give me a day or two to clean up my plotting code and I'll post a gist!

zihaolucky commented 6 years ago

@gpleiss Great!

zihaolucky commented 6 years ago

I'm trying to reproduce the result in my own application, and the ECE was reduced from 0.016 to 0.008 😂 It works, but the original ECE was already quite small.

gpleiss commented 6 years ago

Yeah sometimes you get lucky and a model will actually be pretty well calibrated to begin with!

zihaolucky commented 6 years ago

I'm working on an NLP problem, and I did see the NLL loss increase during training.

zihaolucky commented 6 years ago

@gpleiss

Looking at the per-bin values (bin_lower, bin_upper, prop_in_bin.data[0], accuracy_in_bin.data[0], avg_confidence_in_bin.data[0]), I found that the low-confidence bins have a large error, but the ECE is still very small because it is a weighted metric.

So the model is still not well calibrated, and the final ECE plot is crucial.
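To make the weighting point concrete, here is a minimal sketch of a binned ECE that also returns the per-bin values. It is not the repo's code: the function name `ece_with_bins`, the half-open bins `(lower, upper]`, and the data are all assumptions made up for illustration.

```python
def ece_with_bins(confidences, correct, n_bins=10):
    """Return (ece, rows). Each row is
    (bin_lower, bin_upper, prop_in_bin, accuracy, avg_confidence, gap)."""
    n = len(confidences)
    ece = 0.0
    rows = []
    for i in range(n_bins):
        lower, upper = i / n_bins, (i + 1) / n_bins
        # Assumed convention: a sample belongs to the bin (lower, upper].
        members = [j for j, c in enumerate(confidences) if lower < c <= upper]
        if not members:
            continue  # empty bins contribute nothing
        prop = len(members) / n
        accuracy = sum(correct[j] for j in members) / len(members)
        avg_conf = sum(confidences[j] for j in members) / len(members)
        gap = abs(avg_conf - accuracy)
        ece += prop * gap  # each bin's gap is weighted by its share of samples
        rows.append((lower, upper, prop, accuracy, avg_conf, gap))
    return ece, rows


# Made-up data: 98 confident predictions (97 correct) and
# 2 low-confidence predictions (both wrong).
confidences = [0.99] * 98 + [0.35] * 2
correct = [1] * 97 + [0] + [0, 0]

ece, rows = ece_with_bins(confidences, correct)
for lower, upper, prop, accuracy, avg_conf, gap in rows:
    print(f"({lower:.1f}, {upper:.1f}]  prop={prop:.2f}  "
          f"acc={accuracy:.3f}  conf={avg_conf:.3f}  gap={gap:.3f}")
print(f"ECE = {ece:.4f}, largest per-bin gap = {max(r[5] for r in rows):.2f}")
```

With this toy data the low-confidence bin is off by 0.35, but it holds only 2% of the samples, so the ECE is still about 0.007. Reporting the largest per-bin gap next to the ECE is one way to surface bins like that.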

zihaolucky commented 6 years ago

@gpleiss How is this going?

gpleiss commented 6 years ago

https://gist.github.com/gpleiss/0b17bc4bd118b49050056cfcd5446c71

This is a rough sketch of the plotting code. You need all of the model's outputs in a tensor (outputs), along with the labels.

I haven't run this code, so there's probably a typo in it. But it should give you a rough idea of what to do.
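For readers who cannot open the gist, the following is an independent sketch of one way to draw a reliability diagram (accuracy against confidence); it is not the gist's code, and it assumes that is the kind of figure being asked about. The names `bin_accuracies` and `plot_reliability_diagram` are hypothetical, the bins are assumed to be equal-width `[lower, upper)` intervals, and matplotlib is assumed to be installed. The inputs are per-sample confidences (the maximum softmax probability) and 0/1 correctness flags.

```python
def bin_accuracies(confidences, correct, n_bins=10):
    """Accuracy per equal-width confidence bin; empty bins are reported as 0.0."""
    hits = [0] * n_bins
    counts = [0] * n_bins
    for conf, ok in zip(confidences, correct):
        # Bin i covers [i / n_bins, (i + 1) / n_bins); a confidence of 1.0 goes in the last bin.
        idx = min(int(conf * n_bins), n_bins - 1)
        counts[idx] += 1
        hits[idx] += ok
    return [h / c if c else 0.0 for h, c in zip(hits, counts)]


def plot_reliability_diagram(confidences, correct, n_bins=10):
    """Draw accuracy bars per bin, the gap to the diagonal, and the diagonal itself."""
    import matplotlib.pyplot as plt  # imported here so the binning works without matplotlib

    accs = bin_accuracies(confidences, correct, n_bins)
    width = 1.0 / n_bins
    centers = [(i + 0.5) * width for i in range(n_bins)]
    gaps = [c - a for c, a in zip(centers, accs)]  # distance to perfect calibration

    fig, ax = plt.subplots(figsize=(4, 4))
    ax.bar(centers, accs, width=width, color="tab:blue", edgecolor="black", label="Outputs")
    ax.bar(centers, gaps, bottom=accs, width=width, color="mistyrose",
           edgecolor="red", alpha=0.5, hatch="//", label="Gap")
    ax.plot([0, 1], [0, 1], linestyle="--", color="gray")  # perfect calibration
    ax.set_xlim(0, 1)
    ax.set_ylim(0, 1)
    ax.set_xlabel("Confidence")
    ax.set_ylabel("Accuracy")
    ax.legend(loc="upper left")
    return fig, ax
```

Starting from a logits tensor, `torch.nn.functional.softmax(outputs, dim=1).max(dim=1)` returns the confidences and predicted classes; comparing the predictions with the labels gives the correctness flags. Calling `fig, ax = plot_reliability_diagram(confidences, correct)` followed by `fig.savefig("reliability.png")` would then write the figure to disk.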

zihaolucky commented 6 years ago

@gpleiss Thanks!

SophieChang66 commented 4 years ago

@zihaolucky Hi, this is the output of the code for my network:

Before temperature - NLL: 0.075, ECE: 0.013
Optimal temperature: 1.599
After temperature - NLL: 0.062, ECE: 0.010

The original ECE is already small. How do you calibrate your network when the ECE is small?

gpleiss commented 4 years ago

@SophieChang66 if your ECE is small, then it is well calibrated already :)
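As a reminder of what the "Optimal temperature" in the output above does: temperature scaling divides the logits by a single scalar T before the softmax. A small sketch with made-up logits (only the value 1.599 comes from the log above; the helper `softmax` is written here for illustration and is not the repo's code):

```python
import math


def softmax(logits, temperature=1.0):
    """Softmax of logits / temperature, computed in a numerically stable way."""
    scaled = [z / temperature for z in logits]
    peak = max(scaled)
    exps = [math.exp(z - peak) for z in scaled]
    total = sum(exps)
    return [e / total for e in exps]


logits = [4.0, 1.0, 0.5]  # made-up logits for one sample
before = softmax(logits)
after = softmax(logits, temperature=1.599)

print("max prob before:", round(max(before), 3))
print("max prob after: ", round(max(after), 3))
```

A temperature above 1 lowers the maximum probability without changing which class is predicted, so accuracy is untouched and only the confidence moves. When the ECE is already small, there is little left for that adjustment to fix.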