tingliu / randomforest-matlab

Automatically exported from code.google.com/p/randomforest-matlab

Nodesize Selection #39

Open GoogleCodeExporter opened 8 years ago

GoogleCodeExporter commented 8 years ago
Hi 

Could you please let me know how nodesize affects the classification (or 
regression) result when using RF?

A higher nodesize does not necessarily give a more accurate result, correct? 
How can we determine the nodesize?

You gave some hints for the number of trees and the choice of mtry, so I would 
like to know how I can choose a reasonable mtry for classification (or regression).

Thanks,
Saleh

Original issue reported on code.google.com by m.saleh....@gmail.com on 15 May 2012 at 11:02

GoogleCodeExporter commented 8 years ago
The default nodesize for regression is 5, and that usually works best. A smaller 
nodesize creates larger trees, but those may not generalize as well for regression. 
nodesize = minimum number of examples in a terminal (leaf) node.

For mtry, search mtry=D/10:D/10:D, where D is the number of features; repeat the 
search a couple of times and average the results. The best mtry should be the one 
with the lowest average OOB error (ooberr).

Original comment by abhirana on 15 May 2012 at 11:07
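
The mtry search described above can be sketched roughly as follows. This is only an illustration in plain Python, not code from randomforest-matlab: `mtry_grid`, `search_mtry`, and the `oob_error` callable are hypothetical names, and `oob_error(mtry, seed)` is assumed to train a forest with the given mtry and random seed and return its out-of-bag error.

```python
from statistics import mean


def mtry_grid(num_features):
    """Candidate mtry values D/10, 2D/10, ..., D (integer, at least 1, no duplicates)."""
    candidates = {max(1, (k * num_features) // 10) for k in range(1, 11)}
    return sorted(candidates)


def search_mtry(num_features, oob_error, repeats=5):
    """Average the OOB error over several runs per mtry and return the best one.

    oob_error is a hypothetical callable (mtry, seed) -> float supplied by the
    caller; it stands in for training a forest and reading its OOB error.
    """
    averages = {}
    for mtry in mtry_grid(num_features):
        averages[mtry] = mean(oob_error(mtry, seed) for seed in range(repeats))
    best = min(averages, key=averages.get)
    return best, averages
```

Averaging over several seeds matters because a single forest's OOB error is noisy, so one run alone can favour the wrong mtry.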

GoogleCodeExporter commented 8 years ago
The default nodesize for classification is 1, correct? And you don't suggest 
changing it?

Original comment by m.saleh....@gmail.com on 15 May 2012 at 11:16

GoogleCodeExporter commented 8 years ago
No, I wouldn't suggest changing that because, unlike in regression, the prediction 
in classification is always one of the classes. If nodesize is increased, a leaf 
node may end up holding examples from multiple classes and have to resort to 
picking the winning class, and I don't think that is a good thing to do in 
classification trees.

Original comment by abhirana on 15 May 2012 at 11:19
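
To make the point about mixed leaves concrete, here is a simplified single-leaf picture in plain Python. It is a sketch under assumptions, not the package's implementation: `classification_leaf` and `regression_leaf` are hypothetical helpers, and a real forest additionally combines the predictions of many trees.

```python
from collections import Counter
from statistics import mean


def classification_leaf(labels):
    """Return (winning class, fraction of the leaf's training examples that disagree)."""
    winner, count = Counter(labels).most_common(1)[0]
    return winner, (len(labels) - count) / len(labels)


def regression_leaf(targets):
    """A regression leaf predicts the mean of the targets that fall into it."""
    return mean(targets)


# With nodesize = 1 a classification leaf can be grown until it is pure.
pure_leaf = classification_leaf(["a"])

# With a larger nodesize the leaf may mix classes and must pick a winner.
mixed_leaf = classification_leaf(["a", "a", "b", "a", "b"])

# Averaging several targets in a regression leaf is the normal prediction rule.
averaged = regression_leaf([1.0, 2.0, 3.0])
```

In the mixed leaf, two of the five training examples carry a label other than the winner, which is the situation the comment above advises against for classification.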