JuliaAI / DecisionTree.jl

Julia implementation of Decision Tree (CART) and Random Forest algorithms

Add functionality for adding trees to an existing forest #213

Closed: ablaom closed this issue 1 year ago

ablaom commented 1 year ago

I'm wanting something like this to support https://github.com/JuliaAI/MLJDecisionTreeInterface.jl/issues/40 (which is related to #211).

One could just run build_forest a second time to get more trees, but currently there is no method exposed for combining the two ensembles.

Maybe it's more intuitive to add a build_forest method that takes an existing forest as an argument and adds new trees to it. Also, the first approach (building a second ensemble and combining the two) doesn't work for AdaBoostStumpClassifier.
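To make the second option concrete, here is a minimal, self-contained sketch. It does not use DecisionTree.jl: `Stump`, `ToyForest`, `fit_stump` and `grow_forest` are hypothetical toy stand-ins, and the `build_forest` defined here is a local function, not the package's. The only point is the shape of the API: building from scratch becomes the special case of growing an empty forest.

```julia
using Random
using Statistics

# Hypothetical toy types for illustration; these are NOT DecisionTree.jl's types.
struct Stump
    feature::Int        # column used for the split
    threshold::Float64  # rows with x[feature] < threshold go left
    left::Float64       # prediction on the left side
    right::Float64      # prediction on the right side
end

struct ToyForest
    trees::Vector{Stump}
end

# Fit one stump on a bootstrap sample, splitting a random feature
# at a randomly chosen observed value.
function fit_stump(X::Matrix{Float64}, y::Vector{Float64}, rng::AbstractRNG)
    n = size(X, 1)
    rows = rand(rng, 1:n, n)                  # bootstrap sample (with replacement)
    feature = rand(rng, 1:size(X, 2))
    threshold = X[rand(rng, rows), feature]
    goes_left = X[rows, feature] .< threshold
    ys = y[rows]
    overall = mean(ys)                        # fallback if one side is empty
    left = any(goes_left) ? mean(ys[goes_left]) : overall
    right = all(goes_left) ? overall : mean(ys[.!goes_left])
    return Stump(feature, threshold, left, right)
end

# Option 2: take an existing forest and return a new one with `n_trees` more trees.
function grow_forest(forest::ToyForest, X, y; n_trees::Int,
                     rng::AbstractRNG=Random.default_rng())
    new_trees = [fit_stump(X, y, rng) for _ in 1:n_trees]
    return ToyForest(vcat(forest.trees, new_trees))
end

# Building from scratch is just growing an empty forest.
build_forest(X, y; kwargs...) = grow_forest(ToyForest(Stump[]), X, y; kwargs...)

predict(s::Stump, x) = x[s.feature] < s.threshold ? s.left : s.right
predict(f::ToyForest, x) = mean(predict(t, x) for t in f.trees)

rng = MersenneTwister(1)
X = rand(rng, 100, 3)
y = Float64.(X[:, 1] .> 0.5)

forest = build_forest(X, y; n_trees=10, rng=rng)
forest = grow_forest(forest, X, y; n_trees=5, rng=rng)
length(forest.trees)  # 15
```

Appending works here because each tree in a bagged forest is fitted independently of the others. A boosted ensemble would additionally need to carry its sample weights forward, which this sketch does not attempt.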

Thoughts anyone?

rikhuijzer commented 1 year ago

I really like what you said in https://github.com/JuliaAI/DecisionTree.jl/issues/211#issuecomment-1421569136:

> For my part, I'd rather prioritise model-generic solutions to controlling iterative models, which is what MLJIteration does. That way we avoid a lot of duplication of effort.

The second option of growing the forest does seem more generic. SIRUS.jl would also need a build_forest(forest, [...]) or grow_forest(forest, [...]) method to work. It also sounds like it could be efficient enough for the custom stopping function that was requested in #211.

What would be the use-cases is the biggest question, I guess. Probably for large models where one would train the model until a certain performance is achieved or when further epochs do not improve performance metrics anymore?
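The use case described above can be sketched as a generic control loop. This is only an illustration under stated assumptions: `grow_until_stalled` is a hypothetical helper, not part of DecisionTree.jl or MLJIteration, and the "model" in the usage example is just a tree count with a made-up loss curve. In practice `grow` would be whatever method adds trees to a forest and `loss` an out-of-sample metric.

```julia
# Hypothetical helper: repeatedly call `grow(model)` and stop once
# `loss(model)` has failed to improve for `patience` consecutive rounds.
# Returns the best model seen and its loss.
function grow_until_stalled(grow, loss, model; patience::Int=3, max_rounds::Int=100)
    best_model = model
    best_loss = loss(model)
    stalled = 0
    for _ in 1:max_rounds
        model = grow(model)
        current = loss(model)
        if current < best_loss
            best_model, best_loss, stalled = model, current, 0
        else
            stalled += 1
            stalled >= patience && break
        end
    end
    return best_model, best_loss
end

# Stand-in example: the "model" is a number of trees, each round adds 10,
# and the made-up loss stops improving once 50 trees are reached.
add_ten_trees(n) = n + 10
toy_loss(n) = max(0.2, 10 / n)

best_n, best = grow_until_stalled(add_ten_trees, toy_loss, 10; patience=3)
best_n  # 50
```

Keeping the stopping logic separate from the model is the model-generic approach the thread favours: the forest only needs to expose a way to add trees.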

ablaom commented 1 year ago

> What would be the use-cases is the biggest question, I guess. Probably for large models where one would train the model until a certain performance is achieved or when further epochs do not improve performance metrics anymore?

Yes, that's the use case.

I'll see about making a PR then, based on option 2.