gpleiss / temperature_scaling

A simple way to calibrate your neural network.
MIT License

How to keep the temperature positive in .set_temperature()? #31

Open eugene-yh opened 2 years ago

eugene-yh commented 2 years ago

In the paper, it is stated that the temperature T has to be a positive number. In the code, however, although the temperature is initialized with a positive value (i.e., self.temperature = nn.Parameter(torch.ones(1) * 1.5)), it seems to me that nothing in .set_temperature() ensures we do not end up with a negative temperature.

Did I miss something? Or can it be mathematically proven that the gradient will never push the temperature negative as long as it is initialized to a positive value? If not, should we initialize with something like self.temperature = nn.Parameter(torch.ones(1) * 1.5) ** 2 to ensure that self.temperature is always positive?
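
A related way to bake positivity into the parameterization (a minimal sketch, not part of this repo's code; the class name below is hypothetical, and the squared-parameter idea above would work along the same lines) is to optimize the log of the temperature and exponentiate it when scaling:

    import torch
    from torch import nn

    class LogTemperatureScaler(nn.Module):
        # Hypothetical sketch: optimize an unconstrained log-temperature and
        # exponentiate it, so the effective temperature is always > 0.
        def __init__(self):
            super().__init__()
            self.log_temperature = nn.Parameter(torch.log(torch.ones(1) * 1.5))

        @property
        def temperature(self):
            # exp() maps any real value to a strictly positive temperature.
            return self.log_temperature.exp()

        def forward(self, logits):
            return logits / self.temperature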

pdejorge commented 2 years ago

@eugene-yh Intuitively, it would be strange for the temperature to take a negative value, since that would invert the predictions of the network (i.e. the most likely class would become the least likely).
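
For intuition: dividing the logits by a negative temperature reverses their ordering, so the softmax ranking flips. A quick check (illustrative only, not from the repo):

    import torch

    logits = torch.tensor([2.0, 1.0, 0.0])
    print(torch.softmax(logits / 1.5, dim=0))   # class 0 stays the most likely
    print(torch.softmax(logits / -1.5, dim=0))  # ordering reversed: class 2 is now the most likely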

That being said, I did actually end up with a negative temperature in one case. Scaling by torch.abs(self.temperature) inside the closure function worked well for me. For instance:

    def closure():
        optimizer.zero_grad()
        # Divide by the absolute value so the effective temperature stays
        # positive no matter where the optimizer moves the raw parameter.
        scaled_logits = logits / torch.abs(self.temperature)

        if metric == 'ece':
            loss = ece_criterion(scaled_logits, labels)
        elif metric == 'nll':
            loss = nll_criterion(scaled_logits, labels)
        else:
            raise NotImplementedError()

        loss.backward()
        return loss
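
For context, in .set_temperature() a closure like this is passed to an LBFGS step. If you use the abs trick, it can also make sense to take the absolute value of the stored parameter once afterwards so it matches the temperature that was actually applied. A hedged sketch (the optimizer settings here are assumptions, not necessarily the repo's exact values):

    # Assumed LBFGS settings; adjust to match your copy of set_temperature().
    optimizer = torch.optim.LBFGS([self.temperature], lr=0.01, max_iter=50)
    optimizer.step(closure)

    # With the abs trick, the raw parameter may end up negative even though
    # |temperature| is what was used for scaling; store the positive value.
    with torch.no_grad():
        self.temperature.abs_()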