for multi-class case predict_proba method does not return same number of probabilities as number of classes

chriswbartley / monoensemble

High Performance Monotone Boosting and Random Forest Classification


Thanks Ravi - Yes, predict_proba returns different probabilities from sklearn's: they are cumulative probabilities (I should make this clearer somewhere). For classes 1, 2, 3, 4 the three values would be: [P(y<2), P(y<3), P(y<4)].

You can calculate the class probabilities by:
P(y=1) = P(y<2)
P(y=2) = P(y<3) - P(y<2)
P(y=3) = P(y<4) - P(y<3)
P(y=4) = 1 - P(y<4)
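The conversion above can be sketched with NumPy as follows. This is only an illustration, not part of monoensemble: the helper name cumulative_to_class_probs is hypothetical, and it assumes predict_proba returns an array of shape (n_samples, n_classes - 1) whose columns are ordered [P(y<2), P(y<3), ...] as described in the reply.

```python
import numpy as np


def cumulative_to_class_probs(cum_probs):
    """Convert cumulative probabilities to per-class probabilities.

    Hypothetical helper (not part of monoensemble). Assumes each row holds
    [P(y<2), P(y<3), ..., P(y<K)] for K ordered classes.
    """
    cum_probs = np.asarray(cum_probs, dtype=float)
    if cum_probs.ndim == 1:
        # Treat a single sample as a one-row matrix.
        cum_probs = cum_probs[np.newaxis, :]
    n_samples = cum_probs.shape[0]
    # Pad with P(y<1) = 0 on the left and P(y<K+1) = 1 on the right,
    # so adjacent differences give P(y=1), ..., P(y=K).
    padded = np.hstack([
        np.zeros((n_samples, 1)),
        cum_probs,
        np.ones((n_samples, 1)),
    ])
    return np.diff(padded, axis=1)


# Example with made-up cumulative probabilities for classes 1, 2, 3, 4.
example = cumulative_to_class_probs([[0.1, 0.4, 0.9]])
print(example)
```

Each output row has one entry per class and sums to 1, which matches the shape sklearn users usually expect from predict_proba.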

From memory, if you are planning on assigning a class, you need to use a consistent threshold on the cumulative probability to retain global monotonicity. For example, predict() uses the 'lowest median' (i.e. the first C s.t. P(y<C) > 0.5).

for multi-class case predict_proba method does not return same number of probabilities as number of classes #1