The only difference between our splitters and the ones in scikit-learn is that the scikit-learn splitters leverage an efficient way to track columns that are "constant" with respect to a target `y` variable. This guarantees that no lower node will ever split on one of those constant features.
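Below is a minimal Python sketch of that bookkeeping idea. It is illustrative only: scikit-learn implements this inside its Cython splitters, and the function name, signature, and helper logic here are hypothetical. The key property it encodes is that a feature constant over a parent node's samples is constant over any child's samples, so it can be skipped permanently once detected.

```python
import numpy as np

def sample_candidate_features(X_node, inherited_constant, rng, n_candidates):
    """Sample candidate split features, skipping known-constant columns.

    Hypothetical helper: sketches the constant-feature tracking that
    scikit-learn's splitters perform internally.
    """
    constant = set(inherited_constant)  # constants found in ancestor nodes
    candidates = []
    for f in rng.permutation(X_node.shape[1]):
        if f in constant:
            continue  # constant in an ancestor, hence constant here too
        if np.ptp(X_node[:, f]) == 0.0:
            constant.add(f)  # newly detected constant column
            continue
        candidates.append(f)
        if len(candidates) == n_candidates:
            break
    # child nodes inherit the (possibly grown) constant set
    return candidates, constant
```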
The absence of this tracking can actually affect performance. When `max_features` is, say, 0.3, each node randomly samples 30% of the features; if the data contains a large amount of noise, then at some node depth in some tree, all of the sampled features may be noise and thus produce constant splits. Currently, however, the oblique splitters will still split the samples rather than stopping.
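As a hedged illustration of how easily this situation arises, consider scikit-learn's `DecisionTreeClassifier` on a dataset where 90 of 100 features are pure noise (the dataset shape and seed below are arbitrary). With `max_features=0.3`, each node draws 30 features at random, so any single node has a small but non-negligible chance of seeing no informative feature at all, and across the many nodes of a forest this is effectively guaranteed to happen somewhere.

```python
from sklearn.datasets import make_classification
from sklearn.tree import DecisionTreeClassifier

# 100 features, only 10 of which carry signal; the other 90 are noise.
X, y = make_classification(
    n_samples=500,
    n_features=100,
    n_informative=10,
    n_redundant=0,
    random_state=0,
)

# Each split considers a random 30% of the features, so a node may be
# handed nothing but noise columns.
tree = DecisionTreeClassifier(max_features=0.3, random_state=0)
tree.fit(X, y)
```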