Closed igarnier closed 2 years ago
I am not a fan of the '|>' operator (I have been living without it for years).
If the Sek data structure makes things even faster, then I am OK with us starting to use it.
You can kill the dead code. Note that you don't need to make any changes to this PR. I'll merge it as is, then maybe just make some syntactic changes.
this shaves a few more seconds in the classifier case
This PR optimizes tree construction. More precisely, it optimizes the process by which an optimal split of the dataset is computed. In `master`, this split is performed as follows:

- `feature, value`, partition the dataset

In this PR, we proceed as follows:

- `v1, b1; ...; vn, bn` indexed by values sorted in increasing order
- `b1, b2 @ ... @ bn; b1 @ b2, b3 @ ... @ bn` etc (using `rev_append` instead of `@`)

It's probably possible to optimize even further using a data structure with O(1) concatenation instead of lists (e.g. https://gitlab.inria.fr/fpottier/sek)
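
For illustration, the bucket-based enumeration described in this PR could look roughly like the sketch below. This is not the code from the PR: the function name `splits`, the bucket representation (an association list sorted by increasing value) and the convention that samples with a value at or below the threshold go left are all assumptions. It only shows how `List.rev_append` lets each candidate split reuse the previous one instead of re-partitioning the dataset.

```ocaml
(* Hypothetical sketch, not the PR's code.
   [buckets] is [(v1, b1); ...; (vn, bn)], sorted by increasing value, where
   [bi] holds the samples whose feature value is [vi].
   Returns the candidate splits
     (v1, b1, b2 @ ... @ bn); (v2, b1 @ b2, b3 @ ... @ bn); ...
   Element order inside each side is not preserved, which is assumed to be
   fine because a split is scored as a multiset of samples. *)
let splits (buckets : ('v * 'a list) list) : ('v * 'a list * 'a list) list =
  (* For each bucket i, collect everything stored in buckets i+1 .. n.
     Built right to left, so each suffix shares its tail with the next one. *)
  let _, rights =
    List.fold_right
      (fun (_, b) (after, acc) -> (List.rev_append b after, after :: acc))
      buckets ([], [])
  in
  (* Walk left to right, growing the left side by one bucket per step.
     The last bucket is skipped: its right side would be empty. *)
  let rec go left buckets rights acc =
    match (buckets, rights) with
    | (v, b) :: (_ :: _ as bs), r :: rs ->
        let left = List.rev_append b left in
        go left bs rs ((v, left, r) :: acc)
    | _ -> List.rev acc
  in
  go [] buckets rights []
```

With this shape, building all left and right sides for one feature costs time linear in the number of samples, since every `rev_append` only traverses the bucket being added. A structure with O(1) concatenation, as suggested above, would additionally avoid traversing the buckets themselves.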