scikit-learn-contrib / scikit-learn-extra

scikit-learn contrib estimators
https://scikit-learn-extra.readthedocs.io
BSD 3-Clause "New" or "Revised" License

Add AdaBoost Stump Kernel approximation #119

Closed glevv closed 1 year ago

glevv commented 3 years ago

In "Uniform Approximation of Functions with Random Bases" by A. Rahimi and Benjamin Recht [1], the RBF approximation (RBFSampler in sklearn) is described along with two other approximate kernels. Random stumps seem very easy to implement and could be useful in ensembles and stacks, or with models that support an L1 penalty, which would add a feature selection property. In the figure below I compare MAE, fit time and predict time of RandomStumps+Ridge(fit_intercept=False) and ExtraTreesRegressor(max_depth=1, n_jobs=4, max_features=1) on a make_regression dataset with n_samples=100_000 and n_features=1000, using a 90-10 train-test split.

[image: n_jobs_4]

With n_jobs=1 in ExtraTreesRegressor, the fit times are identical.

[image: n_jobs_1]
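For readers who want to try the idea, here is a minimal sketch of a random stumps feature map followed by Ridge. It is not the code behind the figures. It assumes each random feature has the form sign(x[d] - t), with the feature index d drawn uniformly and the threshold t drawn uniformly from the observed range of that feature; both sampling choices are assumptions, and the class name `RandomStumps` and its parameters are hypothetical. The dataset is much smaller than the one in the benchmark so that it runs quickly.

```python
import numpy as np
from sklearn.base import BaseEstimator, TransformerMixin
from sklearn.datasets import make_regression
from sklearn.linear_model import Ridge
from sklearn.metrics import mean_absolute_error
from sklearn.model_selection import train_test_split
from sklearn.pipeline import make_pipeline


class RandomStumps(TransformerMixin, BaseEstimator):
    """Hypothetical transformer: each output column is +1 if x[d] > t, else -1."""

    def __init__(self, n_components=100, random_state=None):
        self.n_components = n_components
        self.random_state = random_state

    def fit(self, X, y=None):
        X = np.asarray(X, dtype=float)
        rng = np.random.default_rng(self.random_state)
        # Assumption: one input feature per stump, chosen uniformly at random.
        self.feature_idx_ = rng.integers(0, X.shape[1], size=self.n_components)
        # Assumption: thresholds drawn uniformly within each chosen feature's range.
        lo = X.min(axis=0)[self.feature_idx_]
        hi = X.max(axis=0)[self.feature_idx_]
        self.thresholds_ = rng.uniform(lo, hi)
        return self

    def transform(self, X):
        X = np.asarray(X, dtype=float)
        # np.where keeps outputs in {-1, +1}; np.sign would give 0 when x == t.
        return np.where(X[:, self.feature_idx_] > self.thresholds_, 1.0, -1.0)


# Small stand-in for the benchmark setup (90-10 split, Ridge without intercept).
X, y = make_regression(n_samples=2000, n_features=20, noise=1.0, random_state=0)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.1, random_state=0
)

model = make_pipeline(
    RandomStumps(n_components=500, random_state=0),
    Ridge(fit_intercept=False),
)
model.fit(X_train, y_train)
mae = mean_absolute_error(y_test, model.predict(X_test))
print(f"MAE: {mae:.3f}")
```

Swapping Ridge for an L1-penalized model such as Lasso in the same pipeline would zero out some stump weights, which is the feature selection effect mentioned above.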

If this is something you would like to add to scikit-learn-extra, I would be happy to contribute.

glevv commented 3 years ago

Here is a comparison between the decision functions of ExtraTreesClassifier and RandomStumps+RidgeClassifier.

[image: decision_func]

And here is a graph of balanced accuracy, fit time and predict time for ExtraTreesClassifier and RandomStumps+RidgeClassifier (conditions are the same as in the regression case above).

[image: example_cls]
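A rough way to reproduce this kind of classification comparison on a small scale is sketched below. It is an illustration, not the benchmark script. The helpers `sample_stumps` and `apply_stumps` are hypothetical, the stump form (+1 if x[d] > t, else -1, with uniformly sampled d and t) is an assumption, and the dataset size, number of stumps and number of trees are arbitrary choices made so the example runs quickly.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.ensemble import ExtraTreesClassifier
from sklearn.linear_model import RidgeClassifier
from sklearn.metrics import balanced_accuracy_score
from sklearn.model_selection import train_test_split


def sample_stumps(X, n_components, rng):
    """Hypothetical helper: draw (feature index, threshold) pairs from training data."""
    idx = rng.integers(0, X.shape[1], size=n_components)
    # Assumption: thresholds are uniform within each chosen feature's observed range.
    thresholds = rng.uniform(X.min(axis=0)[idx], X.max(axis=0)[idx])
    return idx, thresholds


def apply_stumps(X, idx, thresholds):
    """Map each sample to a vector of +1/-1 stump outputs."""
    return np.where(X[:, idx] > thresholds, 1.0, -1.0)


X, y = make_classification(
    n_samples=2000, n_features=20, n_informative=10, random_state=0
)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.1, random_state=0
)

rng = np.random.default_rng(0)
idx, thresholds = sample_stumps(X_train, n_components=300, rng=rng)
Z_train = apply_stumps(X_train, idx, thresholds)
Z_test = apply_stumps(X_test, idx, thresholds)

# Linear classifier on top of the random stump features.
ridge = RidgeClassifier().fit(Z_train, y_train)
# Depth-1 extremely randomized trees on the raw features, as in the comparison.
etc = ExtraTreesClassifier(
    n_estimators=300, max_depth=1, max_features=1, random_state=0
).fit(X_train, y_train)

scores = {
    "RandomStumps+RidgeClassifier": balanced_accuracy_score(
        y_test, ridge.predict(Z_test)
    ),
    "ExtraTreesClassifier": balanced_accuracy_score(y_test, etc.predict(X_test)),
}
# Signed distance to the decision boundary, usable for plotting a decision surface.
margin = ridge.decision_function(Z_test)
print(scores)
```

Note that ExtraTreesClassifier exposes predict_proba rather than decision_function, so a decision surface plot for it would have to be based on predicted probabilities.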

glevv commented 1 year ago

Closing due to inactivity