@rasbt Could you please help me with this question?
Yes, your reasoning seems to be correct here. Let me know if you have a follow-up question.
@wcm95 Why would the predicted category be 3? I'm a bit confused because you have a domain of 6 categories, but the prediction vector has only 5 entries. If that's a typo, the first category (0.8) has the highest probability, so wouldn't the correct prediction for that output be 0?
I think there are different things going on here :). @bneigher, you are right in the case of a regular softmax output, where you take the argmax. In the CORAL model, each output is a separate binary task: P(X>0), P(X>1), ... . That is why 6 categories give only 5 outputs. Since the outputs are monotone (rank-consistent), the earlier tasks always have a probability at least as high as the later ones. In CORAL, the class label is predicted by counting the tasks for which the probability is greater than 0.5. So in this case:
[0.8 -> 1, 0.6 -> 1, 0.55 -> 1, 0.45 -> 0, 0.1 -> 0] = 1 + 1 + 1 + 0 + 0 = 3
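The counting rule above can be sketched in a few lines of plain Python. This is only an illustration: `coral_label` is a hypothetical helper name, not a function from the CORAL code, and the 0.5 threshold is the one stated in this thread.

```python
def coral_label(probas, threshold=0.5):
    """Predict an ordinal label by counting the binary tasks P(X > k)
    whose probability exceeds the threshold.

    `coral_label` is a hypothetical helper written for this example.
    """
    return sum(1 for p in probas if p > threshold)


# Five task probabilities for six categories (0..5), as in the question.
probas = [0.8, 0.6, 0.55, 0.45, 0.1]
label = coral_label(probas)
print(label)  # 3
```

Note that the label is a count of tasks above the threshold, not an argmax over the five outputs.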
@rasbt Got it! That makes a lot of sense, thank you for clarifying that 🚀
Suppose there are 6 categories: 0, 1, 2, 3, 4, 5. The probability output for one sample is [0.8, 0.6, 0.55, 0.45, 0.1]. So the prediction result for this sample will be category 3. My question is: does this mean P(X=3) = P(X>2) - P(X>3) = 0.55 - 0.45 = 0.1, so that the probability of the predicted category is only 0.1?
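To check the arithmetic for every category at once, here is a minimal sketch in plain Python. It assumes the five outputs are read as P(X>0), ..., P(X>4), as in the question, with P(X>-1) = 1 and P(X>5) = 0 added as boundary values. `class_probabilities` is a hypothetical helper name, not part of the CORAL code.

```python
def class_probabilities(probas):
    """Turn task probabilities P(X > k) into per-class values
    P(X = k) = P(X > k-1) - P(X > k).

    Assumes the inputs are non-increasing; `class_probabilities`
    is a hypothetical helper written for this example.
    """
    # Pad with the boundary values P(X > -1) = 1 and P(X > K-1) = 0.
    padded = [1.0] + list(probas) + [0.0]
    return [padded[k] - padded[k + 1] for k in range(len(padded) - 1)]


probas = [0.8, 0.6, 0.55, 0.45, 0.1]
class_probs = class_probabilities(probas)
print([round(p, 2) for p in class_probs])  # [0.2, 0.2, 0.05, 0.1, 0.35, 0.1]
```

Under this reading, category 3 does get 0.1, and the six values sum to 1. The predicted label need not be the category with the largest difference (category 4 gets 0.35 here), because the label comes from counting tasks above 0.5 rather than from these differences.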