SauceCat / PDPbox

python partial dependence plot toolbox
http://pdpbox.readthedocs.io/en/latest/
MIT License

Interpretation of PD Values in a classification setting #21

Closed by alexHeu 3 years ago

alexHeu commented 6 years ago

Hi,

I was wondering how exactly to interpret the values on the y-axis of the partial dependence plots in a classification setting. The classifier outputs probabilities between 0 and 1; however, the plots show both negative and positive values, which can also be greater than one.

Thanks in advance

SauceCat commented 6 years ago

That is because, with the default setting center=True, the y-axis does not show the absolute value of the prediction. Instead, it shows the difference between prediction values, which is why negative values can appear.
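A minimal sketch of the idea, in plain Python rather than PDPbox's own API. It assumes that centering means subtracting the partial dependence value at the first grid point; `predict_proba`, `rows`, and `grid` are hypothetical names and made-up values used only for illustration.

```python
import math


def predict_proba(x0, x1):
    """Hypothetical classifier: a logistic function of two features."""
    return 1.0 / (1.0 + math.exp(-(-1.5 * x0 + 0.5 * x1)))


# Toy dataset of (x0, x1) rows; the values are made up for illustration.
rows = [(0.0, 1.0), (1.0, -1.0), (2.0, 0.5), (3.0, 2.0)]


def partial_dependence(grid, rows):
    """Average predicted probability with x0 forced to each grid value."""
    return [
        sum(predict_proba(g, x1) for _, x1 in rows) / len(rows)
        for g in grid
    ]


grid = [0.0, 1.0, 2.0, 3.0]

# Uncentered curve: every value is an average probability, so it lies in (0, 1).
pd_raw = partial_dependence(grid, rows)

# Centered curve (assumption: subtract the value at the first grid point).
# The first value becomes 0 and the rest are differences in probability,
# which are negative wherever the prediction drops below the starting point.
pd_centered = [v - pd_raw[0] for v in pd_raw]

print("raw:     ", [round(v, 3) for v in pd_raw])
print("centered:", [round(v, 3) for v in pd_centered])
```

In this toy setup the predicted probability falls as `x0` grows, so every centered value after the first is negative even though each underlying prediction is a valid probability.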

Eduardomar2093 commented 6 years ago

I think my question is somewhat related to this.

In the case of a regression, is it correct to assume that the PDPs of the individual features need to be aggregated to come up with a prediction of the target? In other words, is the y-axis in this case the expected contribution of that feature to the overall prediction of the target?

[image: regression pdp]

Also, in the case of an interaction contour plot between two features of a six-feature regression, what is the interpretation of the contours? Is it a "normalized" prediction over the other four features not used in the interaction plot?

[image: regression interaction contour]