pbiecek / breakDown

Model Agnostics breakDown plots
https://pbiecek.github.io/breakDown/

how do you explain the output of breakDown #16

Open Nadu123 opened 6 years ago

Nadu123 commented 6 years ago

Hi, I am following the example for random forests: https://pbiecek.github.io/breakDown/articles/break_randomForest.html I am still not sure how to interpret the output of breakDown. In the random forest example we get this plot: [image]. What can be said about the final prediction? Is it that for this employee there is an 88% probability that she left, and that this is explained by each feature's contribution to the prediction?

thanks

pbiecek commented 6 years ago

The random forest predicts a probability of 88% for this employee (= final prediction). The average prediction from the random forest is 14.8% (= intercept). The increase in odds of leaving is attributed to the variables using the algorithm described in https://arxiv.org/abs/1804.01955
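The additive reading above (intercept plus per-variable contributions equals the final prediction) can be sketched in a few lines of Python. This is only an illustration: the 14.8% intercept and the 88% final prediction come from this thread, while the variable names and the individual contribution values are hypothetical placeholders, not the numbers from the plot in the linked article.

```python
# Intercept (average model prediction) as quoted in the thread.
intercept = 0.148

# HYPOTHETICAL contributions: placeholder names and values chosen only
# so that they add up to the 88% final prediction quoted in the thread.
contributions = {
    "variable_a": 0.400,
    "variable_b": 0.250,
    "variable_c": 0.082,
}

# The final prediction is the intercept plus the sum of all contributions.
final_prediction = intercept + sum(contributions.values())

# List the variables from largest to smallest absolute contribution.
ranked = sorted(contributions.items(), key=lambda kv: abs(kv[1]), reverse=True)

print(f"intercept        = {intercept:.3f}")
for name, value in ranked:
    print(f"{name:<16} = {value:+.3f}")
print(f"final prediction = {final_prediction:.3f}")
```

Read this way, each bar in a breakDown plot is one term of the sum, and the bars together account for the gap between the average prediction and the prediction for this one observation.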

Nadu123 commented 6 years ago

Thanks for the rocket-quick reply! One more thing: whether we use the step-up or step-down strategy, aren't the features supposed to be ordered by their contribution? In the case above, aren't we saying that the leading cause of this employee's 88% probability of leaving is that only 2 projects are assigned to her, since that feature carries more weight than the others?

thanks

mylanhong commented 6 years ago

@pbiecek Hello, the algorithm described in https://arxiv.org/abs/1804.01955 was difficult for me to understand. I have a question: for a logistic regression model (e.g., Y is good or bad), what does it mean when final_decision is a negative value?

pbiecek commented 6 years ago

The presented scores are logits, which can be transformed to probabilities. A negative logit means that the corresponding probability is below 0.5. Read more here: https://en.wikipedia.org/wiki/Logistic_regression
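For readers who want to check this numerically, here is a minimal Python sketch of the standard logistic (inverse logit) transform. The function name `logit_to_probability` and the example logit values are assumptions made for illustration; they are not part of breakDown.

```python
import math


def logit_to_probability(logit: float) -> float:
    """Map a logit (log-odds) to a probability with the logistic function."""
    return 1.0 / (1.0 + math.exp(-logit))


# Illustrative logits only: negative, zero, and positive.
for z in (-2.0, 0.0, 2.0):
    p = logit_to_probability(z)
    side = "below" if p < 0.5 else "at or above"
    print(f"logit {z:+.1f} -> probability {p:.3f} ({side} 0.5)")
```

A logit of exactly 0 corresponds to a probability of 0.5, so the sign of the logit tells you which side of 0.5 the prediction falls on.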