scikit-learn-contrib / scikit-learn-extra

scikit-learn contrib estimators
https://scikit-learn-extra.readthedocs.io
BSD 3-Clause "New" or "Revised" License

AdaBoostStumpsSampler #124

Closed glevv closed 1 year ago

glevv commented 3 years ago

MC approximation of AdaBoost stump kernel #119

TimotheeMathieu commented 3 years ago

Thank you @GLevV for this, some comments:

Otherwise LGTM, thanks.

glevv commented 3 years ago

They are completely different kernels, with different methods of computing them. The stump kernel was presented in "Support Vector Machinery for Infinite Ensemble Learning", but it can be hard to compute exactly, so the paper "Uniform Approximation of Functions with Random Bases" proposed an MC approximation (the same paper that describes the MC approximation of the RBF kernel, i.e. RBFSampler). I think StumpKernelSampler/StumpSampler would be a shorter and more consistent name (similar to RBFSampler).
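To make the idea of a Monte Carlo stump-kernel approximation concrete, here is a minimal pure-Python sketch. It is not the code from this PR: the function names, the choice of uniform thresholds on [-a, a], and the uniform choice of coordinate are all assumptions made for illustration. Under those assumptions, the expected product of two random stump outputs for points in [-a, a]^D is 1 - ||x - y||_1 / (a * D), which the sampled features should approach as the number of stumps grows.

```python
import random


def sample_stumps(n_features, n_components, a=1.0, seed=0):
    """Draw random stumps: a coordinate index and a threshold in [-a, a].

    Hypothetical helper; the sampling distributions are assumptions.
    """
    rng = random.Random(seed)
    return [
        (rng.randrange(n_features), rng.uniform(-a, a))
        for _ in range(n_components)
    ]


def stump_features(x, stumps):
    """Map x to +/-1 stump outputs, scaled so a dot product is an MC average."""
    scale = len(stumps) ** -0.5
    return [scale * (1.0 if x[d] > t else -1.0) for d, t in stumps]


def exact_kernel(x, y, a=1.0):
    """Expected product of stump outputs for x, y in [-a, a]^D."""
    l1 = sum(abs(xi - yi) for xi, yi in zip(x, y))
    return 1.0 - l1 / (a * len(x))


x = [0.2, -0.5, 0.9]
y = [-0.1, 0.4, 0.3]

stumps = sample_stumps(n_features=3, n_components=20000)
zx = stump_features(x, stumps)
zy = stump_features(y, stumps)

approx = sum(p * q for p, q in zip(zx, zy))  # Monte Carlo estimate
exact = exact_kernel(x, y)                   # closed form under the assumptions
```

Note that the closed form only holds when both points lie inside [-a, a]^D, which is one way to see why the input scale matters for this kind of sampler.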

As for the scaling, I think we could remove it altogether and let users build their own pipelines (stating clearly in the docs that this method requires scaling). That would be consistent with other kernel methods/approximations (RBFSampler also requires scaling to give a proper approximation) and with the original formulation in the paper.
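For illustration, a user-built pipeline of the kind described above might look like the following sketch. Since the sampler from this PR was never merged, scikit-learn's existing RBFSampler stands in for it here; the toy data, the `gamma` and `n_components` values, and the choice of SGDClassifier are assumptions, not recommendations from this thread.

```python
import numpy as np
from sklearn.kernel_approximation import RBFSampler
from sklearn.linear_model import SGDClassifier
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

rng = np.random.RandomState(0)

# Toy data whose two features live on very different scales (assumed example).
X = rng.normal(size=(200, 2)) * np.array([1.0, 1000.0])
y = (X[:, 0] * X[:, 1] > 0).astype(int)  # XOR-like labels

# Scaling is an explicit, user-chosen step placed before the kernel approximation.
model = make_pipeline(
    StandardScaler(),
    RBFSampler(gamma=1.0, n_components=200, random_state=0),
    SGDClassifier(max_iter=1000, tol=1e-3, random_state=0),
)
model.fit(X, y)

# Output of the scaler + sampler steps only (everything except the classifier).
features = model[:-1].transform(X)
```

Keeping the scaler outside the sampler leaves the choice of scaling method (standardization, min-max, and so on) to the user.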

glevv commented 1 year ago

Closed due to inactivity