interpretml / interpret

Fit interpretable models. Explain blackbox machine learning.
https://interpret.ml/docs
MIT License

Non-tree-based base models #185

Open JoshuaC3 opened 4 years ago

JoshuaC3 commented 4 years ago

Amazing stuff. From what I can tell, you use simple decision trees as your base estimator. Is it possible to use linear models, polynomial regression models or even cubic splines?

In my experience, stakeholders prefer smooth (non-stepped) functions when discussing interpretability; they seem to find them easier to accept.

interpret-ml commented 3 years ago

Hi @JoshuaC3,

It's a good question! Right now, we're focused on tree-based models as the base estimators in EBMs. We've found that trees tend to yield the best performance in most cases, and are also easier to use "out of the box" as they naturally adapt to categorical data and are agnostic to the scale of the features.

That being said, we're currently experimenting with some parameters that seem to significantly improve the smoothness of the learned functions. We're planning on exposing them in the package after some more experimentation, and we'll update this thread once they're in.

-InterpretML Team

JoshuaC3 commented 3 years ago

Thanks for the update. Having gone away and understood the algorithm behind this a little better, it makes total sense why you would use decision trees as your default base learner.

From what I understand, there would be nothing stopping someone* from making their own approximations of the graphs (or a single graph), if they really desired a completely smooth function (spline, poly or even linear).

All that said, I think it would still be a nice feature to be able to include other base learners, even if just for academic purposes.


*Excluding the obvious: effects on model accuracy, compatibility problems, complications with calculating SHAP values and the potential need for retraining weights, etc etc etc.
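The approximation described above can be sketched as follows. This is a minimal illustration, not part of interpret's API: it assumes you have already extracted a shape function as two arrays (`bin_centers` and `scores`), and `smooth_shape_function` is a hypothetical helper that fits a low-degree polynomial to those points.

```python
import numpy as np

def smooth_shape_function(bin_centers, scores, degree=3):
    """Fit a smooth polynomial approximation to a piecewise-constant
    shape function, given its bin centers and per-bin scores.

    Hypothetical helper for illustration only; a spline (e.g. from
    scipy.interpolate) could be substituted for the polynomial.
    """
    coef = np.polyfit(bin_centers, scores, degree)
    # Return a callable that evaluates the smoothed curve anywhere
    return lambda x: np.polyval(coef, x)
```

A cubic or a spline fit like this trades a little fidelity to the learned step function for the smooth curve stakeholders tend to prefer, with the caveats noted above (accuracy, SHAP compatibility, etc.).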

Garve commented 3 years ago

I would especially look forward to Isotonic Regression as base estimators because then I could model monotone functions with it.
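For reference, the core of isotonic regression can be sketched with the pool-adjacent-violators algorithm. This is a minimal pure-NumPy illustration of the idea (not interpret's or scikit-learn's implementation; in practice `sklearn.isotonic.IsotonicRegression` would be used):

```python
import numpy as np

def isotonic_fit(y):
    """Pool Adjacent Violators: the non-decreasing sequence that
    minimizes squared error to y (unit weights, x already sorted)."""
    y = np.asarray(y, dtype=float)
    levels = []  # stack of [mean value, weight] blocks
    for v in y:
        levels.append([v, 1.0])
        # Merge blocks while monotonicity is violated
        while len(levels) > 1 and levels[-2][0] > levels[-1][0]:
            v2, w2 = levels.pop()
            v1, w1 = levels.pop()
            levels.append([(v1 * w1 + v2 * w2) / (w1 + w2), w1 + w2])
    out = []
    for v, w in levels:
        out.extend([v] * int(w))
    return np.array(out)
```

The fitted values form a monotone step function, which is exactly why it is attractive as a base estimator when a feature is known to have a monotonic relationship with the target.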

JoshuaC3 commented 3 years ago

@Garve I have been using your ExplainableMetaRegressor with isotonic regression in a piece of work where I know the variables should have a monotonic relationship with the dependent variable. So far the results have been competitive with InterpretML's EBMs, and in several cases the holdout scores have been slightly better!

Here is a bit of a hack-job to get the mEBMs plotting global explainability: https://github.com/JoshuaC3/scikit-bonus/tree/mEBM-utils

Use as follows:

```python
# from XXX import utils
# from interpret import show

# ... your mEBM code ...
# (note: I use "mEBM" to distinguish interpret's EBM from your Meta EBM)

selector = utils.make_selector(X)
feature_names = X.columns.tolist()

# feature_importances is assumed to have been computed beforehand
mebm_global = utils.explain_global(mebm, feature_names, feature_importances,
                                   selector, name=None)

show(mebm_global)
```

Happy to tidy up, add tests, docs and do PR if desired.

All that said, I think having isotonic/monotonic (or other) base-model options at training time in InterpretML itself is far more desirable!! 😄

Garve commented 3 years ago

Heya Joshua!

Awesome, will try it out later! :) And I agree, having that in interpretml would definitely be the best! Especially if you can assign a different model for each feature of the dataset.
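The per-feature idea can be sketched as a backfitting loop in which each feature gets its own smoother, here polynomials of a per-feature degree. This is a toy illustration of the concept only; `fit_gam_backfit` is hypothetical and not part of interpret or scikit-bonus:

```python
import numpy as np

def fit_gam_backfit(X, y, degrees, n_iter=20):
    """Fit an additive model by backfitting, where feature j is
    smoothed by a polynomial of degree degrees[j]. Any 1-d learner
    (isotonic, spline, linear) could be swapped in per feature."""
    n, d = X.shape
    coefs = [np.zeros(degrees[j] + 1) for j in range(d)]
    intercept = y.mean()
    for _ in range(n_iter):
        for j in range(d):
            # Partial residual: remove intercept and all other features
            partial = y - intercept - sum(
                np.polyval(coefs[k], X[:, k]) for k in range(d) if k != j
            )
            coefs[j] = np.polyfit(X[:, j], partial, degrees[j])
    return intercept, coefs
```

Predictions are then `intercept + sum(np.polyval(coefs[j], X[:, j]) for j in range(d))`, and each feature's contribution stays individually plottable, just like an EBM shape function.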

Best Robert