neurodata / scikit-tree

Scikit-learn compatible decision trees beyond those offered in scikit-learn
https://docs.neurodata.io/scikit-tree/dev/index.html

ENH Multiview axis-aligned more like sklearn #241

Closed adam2392 closed 2 months ago

adam2392 commented 3 months ago

Related to #226 and #235

The multi-view axis-aligned splitter previously inherited from the oblique splitters in order to reuse their scheme for sampling multiple feature sets. That approach has two downsides: it does not track constant features, and it needlessly samples a projection vector at every split node.
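To illustrate why the projection vector is unnecessary for axis-aligned splits, here is a minimal NumPy sketch. It is not the scikit-tree Cython code: the `project` helper and the array names are hypothetical, and it only assumes that an oblique split is represented as a set of feature indices plus weights.

```python
import numpy as np


def project(X, indices, weights):
    """Hypothetical helper: apply a sparse projection (indices + weights) to X."""
    return X[:, indices] @ weights


rng = np.random.default_rng(0)
X = rng.normal(size=(6, 4))
feature = 2

# Oblique-style route: allocate an index vector and a weight vector per candidate split.
via_projection = project(X, np.array([feature]), np.array([1.0]))

# Axis-aligned route: read the feature column directly, with no extra allocations.
direct = X[:, feature]

same_values = bool(np.allclose(via_projection, direct))
```

Both routes yield the same feature values, so an axis-aligned splitter can skip building the index and weight vectors altogether.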

Tracking constant features can improve runtime: during DFS/BFS building of the tree, a subtree is not explored once all of its features are constant. Moreover, skipping the projection-vector sampling may help with the segfault issue related to RAM spikes, which appear to come from constantly initializing and pushing the indices/weights of each projection.
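The constant-tracking idea can be sketched in plain Python as follows. This is a simplified illustration under stated assumptions, not the splitter implemented in this PR: `update_constant_features`, `node_is_splittable`, and `FEATURE_THRESHOLD` are hypothetical names, and the real logic lives in Cython.

```python
import numpy as np

FEATURE_THRESHOLD = 1e-7  # hypothetical tolerance for treating a feature as constant


def update_constant_features(X, sample_indices, inherited_constants):
    """Return (constants, n_scanned) for a node.

    A feature that is constant in a parent node is constant in every
    descendant, so inherited constants are never scanned again.
    """
    constants = set(inherited_constants)
    n_scanned = 0
    for feature in range(X.shape[1]):
        if feature in constants:
            continue  # known constant from an ancestor: skip the scan
        values = X[sample_indices, feature]
        n_scanned += 1
        if values.max() - values.min() <= FEATURE_THRESHOLD:
            constants.add(feature)
    return constants, n_scanned


def node_is_splittable(X, constants):
    """A node whose features are all constant cannot be split further."""
    return len(constants) < X.shape[1]


X = np.array([
    [1.0, 0.0, 5.0],
    [1.0, 0.0, 6.0],
    [1.0, 2.0, 7.0],
    [1.0, 2.0, 7.0],
])

# At the root, only feature 0 is constant.
root_constants, root_scanned = update_constant_features(X, np.arange(4), set())

# In the child holding rows 2 and 3, feature 0 is skipped and the rest are found constant.
child_constants, child_scanned = update_constant_features(
    X, np.array([2, 3]), root_constants
)
child_splittable = node_is_splittable(X, child_constants)
```

In this toy data the child node scans one feature fewer than the root and is then marked as not splittable, so the builder would stop there instead of exploring that subtree.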
