szilard / GBM-perf

Performance of various open source GBM implementations
MIT License

Add Microsoft EBM (Explainable Boosting Machine) #28

Open mratsim opened 5 years ago

mratsim commented 5 years ago

Microsoft pre-released the Explainable Boosting Machine 2 weeks ago:

https://github.com/microsoft/interpret

It has a very promising performance profile:

| Dataset/AUROC | Domain | Logistic Regression | Random Forest | XGBoost | Explainable Boosting Machine |
|---|---|---|---|---|---|
| Adult Income | Finance | .907±.003 | .903±.002 | .922±.002 | .928±.002 |
| Heart Disease | Medical | .895±.030 | .890±.008 | .870±.014 | .916±.010 |
| Breast Cancer | Medical | .995±.005 | .992±.009 | .995±.006 | .995±.006 |
| Telecom Churn | Business | .804±.015 | .824±.002 | .850±.006 | .851±.005 |
| Credit Fraud | Security | .979±.002 | .950±.007 | .981±.003 | .975±.005 |

Laurae2 commented 5 years ago

Note: EBM is a GLM with a quadratic design matrix, not a GBM. Not in the same category.

szilard commented 5 years ago

Thanks @mratsim for the info, I did not know about this lib. As @Laurae2 said, it's not a GBM/GBDT, but it's an interesting project to me for other reasons (it's based on Caruana's paper, which I know and love: https://www.microsoft.com/en-us/research/wp-content/uploads/2017/06/KDD2015FinalDraftIntelligibleModels4HealthCare_igt143e-caruanaA.pdf ). So I'll try it out (thanks again for posting here), but I will not add the results to GBM-perf, since this is not a GBM.

paulbkoch commented 4 years ago

I'm one of the InterpretML developers. It's true that the final output of the EBM algorithm is a GAM; internally, however, we use gradient boosted decision trees to build those GAMs. This is probably why EBMs are competitive with other gradient boosting machine packages.

More info at: http://www.cs.cornell.edu/~yinlou/papers/lou-kdd12.pdf
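
To make the "boosted trees that yield a GAM" idea concrete, here is a rough toy sketch, not InterpretML's actual implementation. It assumes squared-error loss and one-split stumps, and visits features round-robin so that every stump depends on a single feature; summing the stumps per feature gives one shape function per feature. All names (`fit_stump`, `fit_additive`, `term`, `predict`) and the synthetic data are hypothetical, and the learning rate is chosen only so the toy converges quickly.

```python
# Toy sketch: build an additive model (GAM) by cyclic gradient boosting.
# Each boosting step fits a one-feature stump to the current residuals.
# Hypothetical names throughout; this is not InterpretML's code.

def fit_stump(xs, residuals):
    """Return (threshold, left_value, right_value) minimizing squared error."""
    best = None
    for t in sorted(set(xs))[:-1]:  # every split that leaves both sides non-empty
        left = [r for x, r in zip(xs, residuals) if x <= t]
        right = [r for x, r in zip(xs, residuals) if x > t]
        lv, rv = sum(left) / len(left), sum(right) / len(right)
        sse = sum((r - lv) ** 2 for r in left) + sum((r - rv) ** 2 for r in right)
        if best is None or sse < best[0]:
            best = (sse, t, lv, rv)
    if best is None:  # constant feature: no split possible
        mean = sum(residuals) / len(residuals)
        return float("inf"), mean, mean
    return best[1:]

def fit_additive(X, y, rounds=300, lr=0.5):
    """Cycle over features, boosting one stump per feature per round."""
    n_features = len(X[0])
    intercept = sum(y) / len(y)
    shape = [[] for _ in range(n_features)]  # stumps grouped by feature
    pred = [intercept] * len(y)
    for _ in range(rounds):
        for j in range(n_features):
            col = [row[j] for row in X]
            resid = [yi - pi for yi, pi in zip(y, pred)]  # negative gradient of squared error
            t, lv, rv = fit_stump(col, resid)
            shape[j].append((t, lr * lv, lr * rv))
            pred = [p + lr * (lv if x <= t else rv) for p, x in zip(pred, col)]
    return intercept, shape

def term(stumps, x):
    """Shape function of one feature: the sum of that feature's stumps at x."""
    return sum(lv if x <= t else rv for t, lv, rv in stumps)

def predict(model, row):
    """Additive prediction: intercept plus one term per feature."""
    intercept, shape = model
    return intercept + sum(term(s, x) for s, x in zip(shape, row))

# Synthetic additive target on a 5x5 grid (made-up data for illustration).
X = [(a, b) for a in range(5) for b in range(5)]
y = [2 * a + (b - 2) ** 2 for a, b in X]
model = fit_additive(X, y)
```

Because no stump ever looks at two features at once, the fitted model stays a sum of per-feature terms, which is what makes the result a GAM even though it was trained by boosting.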

szilard commented 4 years ago

Thanks @paulbkoch