WilliamHeuser closed this issue 9 months ago
I also get the following error message if I set min_samples_leaf too small. Could you investigate?
192 # Stopping Conditions - AFTER:
193 # boolean used to determine wheter 'parent node' is a leaf or not
194 # additional stopping criteria can be added with 'or'
195 # statements
--> 196 N_t_L = len(split[0])
197 N_t_R = len(split[1])
198 is_leaf = (n_samples /
199 n_obs *
200 (impurity -
(...)
206 child_imp[1]) < min_improvement +
207 EPSILON or N_t_L < min_samples_leaf or N_t_R < min_samples_leaf or is_leaf)
IndexError: list index out of range
@svbrodersen the fix that checks for len(split) == 0 solves this.
FIXED now
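For reference, a minimal sketch of the kind of guard described above. It is not the adaXT implementation: the helper names `child_sizes` and `is_leaf_by_size` are hypothetical, and `split` is assumed to be a list holding the left and right index lists, empty when no split was found. The check `len(split) < 2` covers the `len(split) == 0` case from the traceback.

```python
def child_sizes(split):
    """Return (N_t_L, N_t_R), or None when the splitter found no split.

    `split` is assumed to be [left_indices, right_indices], or an empty
    list when no valid split exists (hypothetical convention).
    """
    if len(split) < 2:  # includes the len(split) == 0 case
        return None
    return len(split[0]), len(split[1])


def is_leaf_by_size(split, min_samples_leaf):
    """Treat the node as a leaf if there is no split or a child is too small."""
    sizes = child_sizes(split)
    if sizes is None:
        # No split available: indexing split[0] would raise IndexError,
        # so stop here and make the node a leaf.
        return True
    n_left, n_right = sizes
    return n_left < min_samples_leaf or n_right < min_samples_leaf
```

Checking the length first means the leaf decision never indexes into an empty `split`, whatever value `min_samples_leaf` takes.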
The conditions in DepthTreeBuilder that decide whether a node is a leaf can be improved. The min_improvement condition currently works, but if we ever add a criterion function that does not have a weighted form, this calculation will be incorrect. More information can be found in this PR comment: https://github.com/NiklasPfister/adaXT/pull/23#discussion_r1416858721
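To make the concern concrete, here is a sketch of a weighted improvement calculation. It follows the weighted impurity decrease formula documented for scikit-learn's `min_impurity_decrease`; whether adaXT uses exactly this form is an assumption, and the function name and arguments are hypothetical. The calculation only makes sense when the child impurities can be combined by their sample fractions, which is what a criterion without a weighted form would break.

```python
def weighted_improvement(n_samples, n_obs, impurity,
                         n_left, n_right, left_imp, right_imp):
    """Weighted impurity decrease of a candidate split.

    n_samples: samples in the parent node
    n_obs:     samples in the whole training set
    impurity:  impurity of the parent node
    n_left, n_right:     samples in the left and right children
    left_imp, right_imp: impurities of the left and right children
    """
    # Each child impurity is weighted by its share of the parent's samples.
    children = (n_left / n_samples) * left_imp + (n_right / n_samples) * right_imp
    # The decrease is scaled by the parent's share of all observations.
    return (n_samples / n_obs) * (impurity - children)


# A node with half of all observations and a split into two pure children.
print(weighted_improvement(10, 20, 0.5, 5, 5, 0.0, 0.0))
```

A node would then become a leaf when this value falls below `min_improvement` (plus a small epsilon).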
Furthermore, the impurity_tol condition is checked before the split is done, which results in leaf nodes with a lower impurity than the tolerance. Thus, impurity_tol is NOT a lower bound on the training sample error. More information can be found here: https://github.com/NiklasPfister/adaXT/pull/23#discussion_r1416856153
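A toy illustration of this effect, assuming Gini impurity and a fixed split position. The helpers `gini` and `leaf_impurities` are hypothetical and are not part of adaXT. The parent is only tested against the tolerance before splitting, so its children can end up well below it.

```python
def gini(labels):
    """Gini impurity of a list of class labels."""
    n = len(labels)
    if n == 0:
        return 0.0
    return 1.0 - sum((labels.count(c) / n) ** 2 for c in set(labels))


def leaf_impurities(labels, split_at, impurity_tol):
    """Impurities of the leaves produced by one pre-split tolerance check."""
    if gini(labels) < impurity_tol:
        # Parent is already pure enough: it becomes a leaf itself.
        return [gini(labels)]
    # Tolerance was checked on the parent only; children are not re-checked.
    left, right = labels[:split_at], labels[split_at:]
    return [gini(left), gini(right)]


# Parent impurity 0.5 is above the tolerance 0.4, so the node is split,
# and both resulting leaves have impurity 0.0, below the tolerance.
print(leaf_impurities([0, 0, 1, 1], 2, 0.4))
```

In this sketch the tolerance acts as a trigger for stopping, not as a floor on the impurity of the leaves that are produced.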