Closed joachimder closed 4 months ago
If you have too few trees, try turning calibration off, though the result will then be inaccurate. Studies on the impact of the number of trees on the precision of the CI estimate are welcome, but it is hard to get general rules that are not problem-dependent.
When I use this on my random forest model, it only works when I set 200 or more trees in the parameters.
n_trees = 200
forest = RandomForestRegressor(n_estimators=n_trees, random_state=42)
My optimal number of trees is 29, but with that setting (and not with 200 trees) I get a warning:
RuntimeWarning: invalid value encountered in true_divide
  g_eta_main = g_eta_raw / sum(g_eta_raw)
Because of this I also can't get a confidence interval around my predicted points.
Am I missing something? Why do you need such an excessive number of trees?
Edit: minimum number of trees needed to make it work
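For anyone reading the RuntimeWarning quoted above: NumPy reports "invalid value encountered" when a division has no defined result, such as 0/0 or inf/inf. The line in the warning normalizes a vector by its sum, so a sum that is zero or non-finite would produce it. Whether that is what happens with 29 trees is an assumption, not something confirmed in this thread. The pure-Python sketch below only illustrates the failure mode; `normalize` is a hypothetical helper, not the library's code.

```python
import math


def normalize(weights):
    """Divide each weight by the total, or return None when that is undefined.

    Hypothetical helper for illustration; it mirrors the shape of
    `g_eta_raw / sum(g_eta_raw)` but is not taken from any library.
    """
    total = math.fsum(weights)
    # A zero total gives 0/0 and an infinite total gives inf/inf.
    # These are the cases NumPy flags as "invalid value".
    if total == 0.0 or not math.isfinite(total):
        return None
    return [w / total for w in weights]


print(normalize([1.0, 3.0]))       # well-defined: [0.25, 0.75]
print(normalize([0.0, 0.0]))       # total is zero: None
print(normalize([math.inf, 1.0]))  # total is infinite: None
```

If the real cause is a degenerate normalization like this, checking the intermediate sum before dividing is one way to confirm it.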