hendrycks / outlier-exposure

Deep Anomaly Detection with Outlier Exposure (ICLR 2019)
Apache License 2.0

Understanding the formula in the paper and the training details in the code #19

Closed yanchenyuxia closed 3 years ago

yanchenyuxia commented 3 years ago

Hello, I am a graduate student and I have just read your paper. I would like to ask some questions that I don't understand.

First, I don't understand the second part of the formula in the paper. Is f(x') the model's predicted output on the OOD data? Is the second part computing a cross-entropy?

Second, for the line loss += 0.5 * -(x[len(in_set[0]):].mean(1) - torch.logsumexp(x[len(in_set[0]):], dim=1)).mean(), I don't understand how this expression computes a cross-entropy. Where is the softmax used in the first part, x[len(in_set[0]):].mean(1)? And how does the second part, torch.logsumexp(x[len(in_set[0]):], dim=1), represent the uniform distribution?

Thanks!

xfffrank commented 3 years ago

Hi, let me try answering your questions. @yanchenyuxia

  1. f(x') is the model's prediction on the OOD data. Since the OOD data doesn't belong to any of the training classes, the second part computes the cross-entropy from f(x') to the uniform distribution (the objective is sketched below).

  2. For your second question, you may refer to the answer in #18 .
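For context, here is a sketch of the Outlier Exposure objective as I understand it from the paper ($\lambda$ is the OE weight, which is 0.5 in this code; $\mathcal{D}_{\text{in}}$ and $\mathcal{D}_{\text{out}}^{\text{OE}}$ are the in-distribution and outlier training sets):

$$
\min_f \; \mathbb{E}_{(x,y)\sim\mathcal{D}_{\text{in}}}\big[\mathcal{L}(f(x), y)\big] \;+\; \lambda\, \mathbb{E}_{x'\sim\mathcal{D}_{\text{out}}^{\text{OE}}}\big[\mathcal{L}_{\text{OE}}(f(x'))\big]
$$

where, for multiclass classification, $\mathcal{L}_{\text{OE}}$ is the cross-entropy from the softmax of $f(x')$ to the uniform distribution over the classes.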

yanchenyuxia commented 3 years ago

> Hi, let me try answering your questions. @yanchenyuxia
>
> 1. f(x') is the model's prediction on the OOD data. Since the OOD data doesn't belong to any of the training classes, the second part computes the cross-entropy from f(x') to the uniform distribution.
> 2. For your second question, you may refer to the answer in #18.

I sent you an email at your Gmail address; the phone number in my sender name is my WeChat ID. I still don't understand these questions, so please advise me. Thank you!

captainfffsama commented 3 years ago

It seems the answer above is still not clear, so let me try to show my understanding of why loss += 0.5 * -(x[len(in_set[0]):].mean(1) - torch.logsumexp(x[len(in_set[0]):], dim=1)).mean() computes the cross-entropy from the softmax distribution to the uniform distribution.

We know that the cross-entropy is:

$$
H(p, q) = -\sum_{x} p(x) \log q(x) \tag{1}
$$

Now p(x) is the uniform distribution, i.e. p(x) = 1/K over the K classes, and q(x) is the softmax distribution over the logits f(x).

So we can rewrite Formula (1) like this:

$$
H(p, q) = -\frac{1}{K}\sum_{k=1}^{K} \log \frac{e^{f_k(x)}}{\sum_{j=1}^{K} e^{f_j(x)}} = -\left(\frac{1}{K}\sum_{k=1}^{K} f_k(x) - \log \sum_{j=1}^{K} e^{f_j(x)}\right) \tag{2}
$$

In the code, x[len(in_set[0]):] selects the logits of the OOD samples in the batch, x[len(in_set[0]):].mean(1) is the first term of Formula (2) (the mean of the logits over the K classes), and torch.logsumexp(x[len(in_set[0]):], dim=1) is the second term. The softmax never appears explicitly because the logarithm cancels the exponentials, leaving only the logits and the log-sum-exp; the final .mean() then averages over the OOD samples.
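You can also verify the identity numerically. This is a standalone sketch, not code from the repo; the logits here are random stand-in data, and the 0.5 weight and batch slicing from the repo's loss line are omitted:

```python
import torch
import torch.nn.functional as F

torch.manual_seed(0)
num_ood, num_classes = 8, 10           # hypothetical batch of OOD samples
logits = torch.randn(num_ood, num_classes)

# OE term as written in the repo (there, logits = x[len(in_set[0]):]):
oe_term = -(logits.mean(1) - torch.logsumexp(logits, dim=1)).mean()

# Explicit cross-entropy from softmax(logits) to the uniform distribution:
uniform = torch.full_like(logits, 1.0 / num_classes)
log_probs = F.log_softmax(logits, dim=1)
cross_entropy = -(uniform * log_probs).sum(1).mean()

print(oe_term.item(), cross_entropy.item())
assert torch.allclose(oe_term, cross_entropy)  # equal up to float error
```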