Stepwise process for model selection with EBM classifier

fcaleca1 commented 3 months ago

Hello. First of all, I want to thank you for the amazing job and package you've put together. I am trying to implement a stepwise procedure to find the optimal subset of covariates (the optimal model) for a classification task using an EBM classifier. For instance, with a simple GLM or GAM we can build a stepwise process around the AIC (Akaike information criterion). In that context, the optimal model (optimal subset of covariates) is the one beyond which adding new covariates no longer produces a significant decrease in AIC. So my question is: can we do the same with EBMs, using AIC as a guideline for the stepwise procedure? If not, what metric should we use as the criterion for the process (AUC, Kappa, etc.)? Thank you, and I look forward to your suggestions.
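For reference, the stepwise idea described above can be sketched as a greedy forward selection loop. This is only an illustration: `forward_stepwise`, `criterion`, and `min_improvement` are hypothetical names, and `criterion` stands for whatever function fits a model on a feature subset and returns a score where lower is better (AIC in the GLM/GAM case).

```python
from typing import Callable, List, Sequence, Tuple


def forward_stepwise(
    candidates: Sequence[str],
    criterion: Callable[[List[str]], float],
    min_improvement: float = 0.0,
) -> Tuple[List[str], float]:
    """Greedy forward selection; a lower criterion value is better."""
    selected: List[str] = []
    remaining = list(candidates)
    best = float("inf")
    while remaining:
        # Score every subset obtained by adding one more feature.
        scored = [(criterion(selected + [f]), f) for f in remaining]
        score, feature = min(scored)
        # Stop once the best addition no longer improves the criterion enough.
        if best - score <= min_improvement:
            break
        selected.append(feature)
        remaining.remove(feature)
        best = score
    return selected, best


# Toy criterion (made-up numbers): each feature lowers the score by its
# "gain", and every feature in the subset costs a penalty of 2.
gains = {"x1": 10.0, "x2": 4.0, "x3": 0.5}


def toy_criterion(subset: List[str]) -> float:
    return 100.0 - sum(gains[f] for f in subset) + 2.0 * len(subset)


selected, score = forward_stepwise(["x1", "x2", "x3"], toy_criterion)
print(selected, score)
```

In the toy run, `x3` is left out because its gain is smaller than the per-feature penalty, which is the stopping behaviour the question describes.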

paulbkoch commented 3 months ago

Hi @fcaleca1 -- If you wanted to build a model one feature at a time, the way to do it would be through the init_score parameter of the fit function. You would build an initial EBM with whatever features you wanted in the base EBM (which could be just 1). To add a new feature, you'd create a new EBM and, on the call to fit for that EBM, give it the previous EBM in the init_score parameter. To make predictions, you'd need to chain the models in the same order that you called fit, passing the previous EBM through the init_score parameter of the predict_proba function. It wouldn't be a very integrated EBM, as you'd end up with separate models, but you could still visualize them individually.

fcaleca1 commented 3 months ago

Hi @paulbkoch -- Thanks for your suggestion on how to build a model one feature at a time. What do you think about using the Akaike information criterion as a guideline? Is it possible?
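For context, AIC is defined as 2k - 2 ln(L), where L is the model likelihood and k the number of estimated parameters. The sketch below computes it from predicted probabilities for a binary classifier. The helper names are hypothetical, and what value of k would be appropriate for an EBM is not settled in this thread, so `n_params` is left for the caller to supply as an explicit assumption.

```python
import math
from typing import Sequence


def binary_log_likelihood(
    y_true: Sequence[int], p_pos: Sequence[float], eps: float = 1e-15
) -> float:
    """Bernoulli log-likelihood of 0/1 labels given P(y = 1) per sample."""
    total = 0.0
    for y, p in zip(y_true, p_pos):
        p = min(max(p, eps), 1.0 - eps)  # clip to avoid log(0)
        total += math.log(p) if y == 1 else math.log(1.0 - p)
    return total


def aic(log_likelihood: float, n_params: float) -> float:
    """AIC = 2k - 2 ln(L); lower is better."""
    return 2.0 * n_params - 2.0 * log_likelihood


# Made-up labels and probabilities; in practice p would be the positive-class
# column of a classifier's predicted probabilities.
y = [1, 0, 1, 1]
p = [0.9, 0.2, 0.7, 0.6]
ll = binary_log_likelihood(y, p)
print(aic(ll, n_params=3))  # n_params=3 is an arbitrary placeholder
```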

interpretml / interpret

Stepwise process for model selection with EBM classifier #505