Closed doobidoob closed 4 years ago
Hello @doobidoob !
Can you explain K.log(1. + K.exp(linear_predictor_f))?
-> Under the assumption that this problem is binary classification, I used yf(x) - log(1 + exp(f(x))), which is actually equivalent to Kendall's one.
And you used log_variance and an exponential function when estimating the variance. Is there any reason for that? -> Can you give me more specifics? I'd be happy to clarify once I know what you mean.
In addition, is there any inference code for Kendall's work? -> Unfortunately, I do not maintain this code now and I do not have it. But I believe you can simply write it by passing inputs through the network multiple times. Also, after this work, I found that Kendall's approach can work well if you properly regularize the weights in your neural network.
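As a rough sketch of what "passing inputs multiple times" could look like (this is not from the repo; `stochastic_forward` is a hypothetical stand-in for a Keras model called with dropout kept active at test time):

```python
import numpy as np

def mc_predict(stochastic_forward, x, T=50):
    """Monte Carlo inference sketch: run T stochastic forward passes
    over the same input and summarize them.

    Returns the per-element predictive mean and variance; the variance
    serves as a simple uncertainty estimate."""
    samples = np.stack([stochastic_forward(x) for _ in range(T)])  # shape (T, ...)
    return samples.mean(axis=0), samples.var(axis=0)

# Toy usage with a fake stochastic model (noise imitates dropout randomness).
rng = np.random.default_rng(0)
fake_forward = lambda x: x + rng.normal(0.0, 0.1, size=x.shape)
mean, var = mc_predict(fake_forward, np.zeros((4, 3)), T=100)
```

With a real Keras model you would replace `fake_forward` with a call that keeps the stochastic layers on (e.g. invoking the model in training mode).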
Hope it helps.
@ykwon0407
Thank you for your reply!
Do you mean yf(x) - log(1 + exp(f(x))) is the same as binary cross-entropy, like below?
Honestly, I don't understand where that expression comes from... Sorry to trouble you, but I would be very grateful if you could explain or provide a reference link. :)
In addition, is there any way to use this loss (yf(x) - log(1 + exp(f(x)))) for multi-class classification using softmax (for example, with 5 classes)?
I mean that you used the log of the variance, like below, not the variance itself: https://github.com/ykwon0407/UQ_BNN/blob/4148245e297c3001148d8fffc3133c58174a46aa/ischemic/models.py#L55 But I understood your intention; it is probably for the numerical stability of training.
Thank you!
@doobidoob
Yes, I used the one you found. Suppose y = t_1 = 1 and s_1 = exp(f(x)) / (1 + exp(f(x))). Then you get CE = -log(s_1) = -f(x) + log(1 + exp(f(x))) = -(yf(x) - log(1 + exp(f(x)))). Similarly, if y = t_1 = 0, then CE = -log(1 - s_1) = -0 + log(1 + exp(f(x))) = -(yf(x) - log(1 + exp(f(x)))).
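The algebra above can be checked numerically: with s = sigmoid(f), the negated logit-form expression should match the usual binary cross-entropy for both labels (a quick self-contained check, not code from the repo):

```python
import math

def nll_from_logit(y, f):
    """Negated logit-form loss: -(y*f - log(1 + exp(f)))."""
    return -(y * f - math.log(1.0 + math.exp(f)))

def bce(y, s):
    """Standard binary cross-entropy on a probability s."""
    return -(y * math.log(s) + (1 - y) * math.log(1.0 - s))

# The two agree for both labels across a few logit values.
for f in (-2.0, 0.3, 1.7):
    s = 1.0 / (1.0 + math.exp(-f))  # sigmoid(f)
    for y in (0, 1):
        assert abs(nll_from_logit(y, f) - bce(y, s)) < 1e-9
```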
In the case of multi-class classification, I believe you had better use the original form suggested in Kendall's paper.
Ah! That one is to make sure the standard deviation (equivalently, the variance) is positive.
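To make the positivity point concrete, here is a minimal sketch (my own illustration, not repo code): the network head outputs an unconstrained real number log_var, and exponentiating it always yields a strictly positive variance, which is also friendlier to optimize than predicting the variance directly.

```python
import math

def variance_from_head(log_var):
    """Map an unconstrained head output to a valid (positive) variance."""
    return math.exp(log_var)

# Any real-valued head output gives a positive variance.
for raw in (-10.0, 0.0, 3.0):
    assert variance_from_head(raw) > 0.0
```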
@ykwon0407
Thanks for your explanation and kindness! I understand it now. Thank you very much 👍
Hi @ykwon0407 , Thank you for your interesting work!
I am studying your paper with Kendall's paper as a reference. I was wondering how your suggestion and Kendall's suggestion were implemented, so I looked at the code. I have several questions about the implementation of Kendall's work.
In Kendall's paper, they proposed a loss function for estimating uncertainty in classification tasks, like below:
I found the logit vector x being created in SampleNormal, but I'm curious about the implementation of the loss function, like below: https://github.com/ykwon0407/UQ_BNN/blob/4148245e297c3001148d8fffc3133c58174a46aa/ischemic/models.py#L101
Can you explain K.log(1. + K.exp(linear_predictor_f))? And you used log_variance and an exponential function when estimating the variance. Is there any reason for that? In addition, is there any inference code for Kendall's work? Thank you in advance!