bgreenwell / pdp

A general framework for constructing partial dependence (i.e., marginal effect) plots from various types of machine learning models in R.
http://bgreenwell.github.io/pdp

Feature Request: 95% confidence intervals around a PDP? #96

Closed by DeFilippis 4 years ago

DeFilippis commented 5 years ago

I'm wondering if it's possible to get 95% confidence bands around the predictions in a PDP plot using this library?

It looks like this is possible using parDepPlot from the interpretR package: https://github.com/cran/interpretR/blob/master/R/parDepPlot.R

bgreenwell commented 5 years ago

Yes, it is, but I hesitate to call them confidence bands. Nonetheless, I wrote a small article on how to accomplish this here: https://bgreenwell.github.io/pdp/articles/pdp-se.Rmd.html. Let me know if you have any questions!

DeFilippis commented 5 years ago

This is great! Thanks for this. Do you know if there is any way to get more standard confidence bands, perhaps with bootstrapped standard errors?

I'm not exactly sure how this package implements it, but it looks like they have something like this: https://cornelllabofornithology.github.io/ebirdst/reference/plot_pds.html

It looks like they use "gam pointwise ci for conditional mean estimate via bootstrapping"

https://github.com/CornellLabofOrnithology/ebirdst/blob/c1ba68948b728a8434f7ca578eb6e621b15afe42/R/ebirdst-plotting.R
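
For readers unfamiliar with the idea, here is a rough sketch of what a pointwise bootstrap band around a PDP could look like. It is not the ebirdst or pdp implementation: it is written in plain Python for illustration, uses a toy two-predictor least-squares model, and every name in it (`fit_linear`, `partial_dependence`, `bootstrap_pdp_band`, the simulated data) is hypothetical. It assumes the model is refit on each resample and that simple percentile limits are acceptable. Whether such intervals are appropriate for PDPs is not settled in this thread.

```python
import random
import statistics


def fit_linear(rows):
    """Least-squares fit of y ~ a + b1*x1 + b2*x2 (toy stand-in for any model)."""
    n = len(rows)
    m1 = sum(r["x1"] for r in rows) / n
    m2 = sum(r["x2"] for r in rows) / n
    my = sum(r["y"] for r in rows) / n
    s11 = sum((r["x1"] - m1) ** 2 for r in rows)
    s22 = sum((r["x2"] - m2) ** 2 for r in rows)
    s12 = sum((r["x1"] - m1) * (r["x2"] - m2) for r in rows)
    s1y = sum((r["x1"] - m1) * (r["y"] - my) for r in rows)
    s2y = sum((r["x2"] - m2) * (r["y"] - my) for r in rows)
    det = s11 * s22 - s12 ** 2
    b1 = (s22 * s1y - s12 * s2y) / det
    b2 = (s11 * s2y - s12 * s1y) / det
    a = my - b1 * m1 - b2 * m2
    return lambda row: a + b1 * row["x1"] + b2 * row["x2"]


def partial_dependence(predict, rows, feature, grid):
    """Average prediction over all rows with `feature` fixed at each grid value."""
    return [
        statistics.fmean([predict({**row, feature: v}) for row in rows])
        for v in grid
    ]


def bootstrap_pdp_band(rows, feature, grid, n_boot=200, level=0.95, seed=0):
    """Refit on resampled rows and take approximate pointwise percentiles."""
    rng = random.Random(seed)
    replicates = []
    for _ in range(n_boot):
        sample = rng.choices(rows, k=len(rows))  # resample rows with replacement
        model = fit_linear(sample)
        replicates.append(partial_dependence(model, sample, feature, grid))
    alpha = (1 - level) / 2
    lower, upper = [], []
    for column in zip(*replicates):  # one column per grid value
        ordered = sorted(column)
        lower.append(ordered[int(alpha * (n_boot - 1))])
        upper.append(ordered[int((1 - alpha) * (n_boot - 1))])
    return lower, upper


# Simulated data (an assumption for illustration only)
data_rng = random.Random(1)
rows = []
for _ in range(100):
    x1, x2 = data_rng.uniform(0, 1), data_rng.uniform(-1, 1)
    rows.append({"x1": x1, "x2": x2, "y": 1 + 2 * x1 - x2 + data_rng.gauss(0, 0.5)})

grid = [i / 10 for i in range(11)]
pdp = partial_dependence(fit_linear(rows), rows, "x1", grid)
lower, upper = bootstrap_pdp_band(rows, "x1", grid)
```

The band here only reflects variability from resampling the training rows and refitting; it says nothing about how much individual predictions differ from the average curve.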

bgreenwell commented 5 years ago

I don't, unfortunately, nor have I seen any literature that supports this. I would argue that PDPs arise from grouped data (think about each ICE curve as a group), and so traditional confidence interval procedures don't seem like the right approach here. It would be interesting to see any work in this area.
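
To make the "each ICE curve as a group" view concrete, here is a minimal sketch in plain Python (not the pdp API, and not necessarily what the linked article does). It builds one ICE curve per observation, averages them pointwise to get the PDP, and uses the pointwise standard deviation across curves as a spread band. The model `toy_predict`, the helper names, and the simulated data are all hypothetical.

```python
import random
import statistics


def ice_curves(predict, rows, feature, grid):
    """One ICE curve per row: predictions as `feature` sweeps over `grid`."""
    curves = []
    for row in rows:
        curve = []
        for value in grid:
            modified = dict(row)       # copy so the original row is untouched
            modified[feature] = value  # hold this row's other features fixed
            curve.append(predict(modified))
        curves.append(curve)
    return curves


def pdp_with_spread(curves):
    """Pointwise mean (the PDP) and standard deviation across ICE curves."""
    columns = list(zip(*curves))  # one column per grid value
    mean = [statistics.fmean(col) for col in columns]
    sd = [statistics.stdev(col) for col in columns]
    return mean, sd


# Hypothetical stand-in for a fitted model, with an x1:x2 interaction
def toy_predict(row):
    return 2.0 * row["x1"] + row["x1"] * row["x2"]


rng = random.Random(0)
rows = [{"x1": rng.uniform(0, 1), "x2": rng.uniform(-1, 1)} for _ in range(200)]
grid = [i / 10 for i in range(11)]

curves = ice_curves(toy_predict, rows, "x1", grid)
pdp, sd = pdp_with_spread(curves)
lower = [m - s for m, s in zip(pdp, sd)]
upper = [m + s for m, s in zip(pdp, sd)]
```

A band built this way describes how much the individual curves disagree with their average at each grid value (here it widens as `x1` grows because of the interaction). It is not a statement about sampling uncertainty in the PDP itself, which is one reason to avoid calling it a confidence band.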