Good question!
Actually their purposes are the same. In the disentangled case, the two terms $\log\big(p(1-u)\big)$ and $\log\big((1-p)u\big)$ reduce to four individual terms: $\log p$, $\log(1-u)$, $\log(1-p)$, and $\log u$. However, even in this case the variables $p$ and $u$ are still correlated in the SGD optimization, because both are derived from the same DNN output $\alpha$. I did not choose the disentangled case as the default simply because it does not work better than the other one in experiments. Intuitively, Eq. (3) in the paper is more like a cross-entropy between $p$ and $u$, which is a more common practice.
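For concreteness, here is a minimal sketch of the two variants as described above. It is not the exact `edl_loss.py` code; the function name, the epsilon handling, and the way the annealing coefficient is applied are illustrative assumptions, with $p$ and $u$ derived from the Dirichlet parameters $\alpha$ in the usual evidential way.

```python
import torch


def evidential_calibration_term(alpha, labels, annealing_coef,
                                disentangle=False, eps=1e-10):
    """Sketch of the accuracy-vs-uncertainty calibration term discussed above.

    alpha:          Dirichlet parameters from the DNN, shape (N, K)
    labels:         ground-truth class indices, shape (N,)
    annealing_coef: scalar weight that grows over training
    """
    num_classes = alpha.shape[-1]
    S = alpha.sum(dim=-1, keepdim=True)                        # Dirichlet strength
    p, pred_cls = torch.max(alpha / S, dim=-1, keepdim=True)   # max predicted probability
    u = num_classes / S                                        # evidential uncertainty

    acc_match = (pred_cls == labels.unsqueeze(-1)).float()     # 1 where the prediction is correct

    if disentangle:
        # log p(1-u) and log (1-p)u split into four individual log terms
        acc_uncertain = -(torch.log(p + eps) + torch.log(1 - u + eps))
        inacc_certain = -(torch.log(1 - p + eps) + torch.log(u + eps))
    else:
        # default: keep the products inside the logarithm, as in Eq. (3)
        acc_uncertain = -torch.log(p * (1 - u) + eps)
        inacc_certain = -torch.log((1 - p) * u + eps)

    # penalize high uncertainty on accurate predictions and
    # high confidence on inaccurate ones
    loss = acc_match * acc_uncertain + (1 - acc_match) * inacc_certain
    return annealing_coef * loss.mean()
```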
You may also refer to this NeurIPS 2020 paper to see how it deals with the AvU calibration: "Ranganath Krishnan and Omesh Tickoo. Improving model calibration with accuracy versus uncertainty optimization. In NeurIPS, 2020."
Thanks for your fast response, I appreciate it. I will use the version with `disentangle` set to `False`. Thanks for sharing the reference (I had started reading it).
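With a sketch like the one above, that just means leaving the flag at its default (illustrative only, not the repo's exact API):

```python
# hypothetical call to the sketch above; alpha, labels and the annealing
# coefficient come from the existing training loop
loss_euc = evidential_calibration_term(alpha, labels, annealing_coef, disentangle=False)
```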
Very nice paper. I've been facing issues similar to the ones reported in this figure when using evidential uncertainty quantification:

In your work, you discuss that we could add an Accuracy vs. Uncertainty (AvU) loss function that looks like this:
You proposed a new version of it, as shown in the equation below:
I found the implementation of the equation above here:
https://github.com/Cogito2012/DEAR/blob/2a64f6a4be878a52046f043b50311af7316d3c33/mmaction/models/losses/edl_loss.py#L113-L144
It is unclear to me when to use the `disentangle` case. Could you please provide me with any insights about when to use one over the other? Thanks :)