**evapachetti** opened this issue 2 years ago

My network is trained with a binary classification approach, so the model outputs a single logit value, which I then convert into a probability by applying the sigmoid function. How can I modify the temperature scaling code to apply it to my network?

Thank you in advance.

---

I think the following approach should work:

- Replace `CrossEntropyLoss` with `BCEWithLogitsLoss`
- Expand `self.temperature` to the number of classes

Sigmoid can be regarded as a special case of softmax where one of the logits is 0: $\frac{e^x}{e^0 + e^x}$. Then we only need to learn the temperature for each class.
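For concreteness, here is a minimal PyTorch sketch of that suggestion. The class name `BinaryTemperatureScaler` and the `fit` helper are illustrative, not part of this repository; it assumes you already have raw logits and float labels from a held-out validation set. Note that with a single output logit, one temperature parameter suffices, since the implicit second logit is fixed at 0 and scaling a zero logit has no effect.

```python
import torch
import torch.nn as nn
import torch.optim as optim

class BinaryTemperatureScaler(nn.Module):
    """Learns a temperature for a single-logit binary classifier.

    Sketch only: follows the suggestion above of swapping CrossEntropyLoss
    for BCEWithLogitsLoss when tuning the temperature.
    """

    def __init__(self):
        super().__init__()
        # One temperature for the single logit (sigmoid is a 2-class
        # softmax with the other logit fixed at 0).
        self.temperature = nn.Parameter(torch.ones(1) * 1.5)

    def forward(self, logits):
        # Divide the raw logit by the learned temperature.
        return logits / self.temperature

    def fit(self, logits, labels, lr=0.01, max_iter=50):
        """Tune the temperature on held-out validation logits/labels."""
        criterion = nn.BCEWithLogitsLoss()  # replaces CrossEntropyLoss
        optimizer = optim.LBFGS([self.temperature], lr=lr, max_iter=max_iter)

        def closure():
            optimizer.zero_grad()
            loss = criterion(self.forward(logits), labels)
            loss.backward()
            return loss

        optimizer.step(closure)
        return self
```

Assuming `val_logits` and `val_labels` are float tensors of shape `(N, 1)`, with labels in `{0., 1.}`, usage would look like:

```python
scaler = BinaryTemperatureScaler().fit(val_logits, val_labels)
probs = torch.sigmoid(scaler(test_logits))  # calibrated probabilities
```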