devmax / randomforest-matlab

Automatically exported from code.google.com/p/randomforest-matlab

What are the differences between random forest and bagging? #19

Open GoogleCodeExporter opened 9 years ago

GoogleCodeExporter commented 9 years ago
Hello, I am currently studying random forests and bagging. The two methods are
similar, so I would like to know the differences between them. Thanks.

Original issue reported on code.google.com by hpu2...@163.com on 3 Sep 2011 at 12:59

GoogleCodeExporter commented 9 years ago
Bagging has a single parameter - the number of trees. All trees are fully grown
binary trees (unpruned), and at each node in the tree one searches over all
features to find the feature that best splits the data at that node.

Random forests have 2 parameters. The first is the same as in bagging (the
number of trees). The second parameter (unique to random forests) is mtry,
which is how many features to search over to find the best feature. This
parameter is usually D/3 for regression and sqrt(D) for classification, where D
is the total number of features. Thus, during tree creation, mtry features are
chosen at random from all available features, and the best of those for
splitting the data is chosen. This random selection is repeated at every node.
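
To make the difference concrete, here is a minimal Python sketch of which features get searched at a single node. It is not code from randomforest-matlab; the helper names `default_mtry` and `candidate_features` are made up for illustration, and the rounding of sqrt(D) and D/3 is an assumption (implementations differ on floor vs. round).

```python
import math
import random

def default_mtry(num_features, task):
    """Commonly quoted mtry defaults: sqrt(D) for classification, D/3 for regression.
    Flooring (and clamping to at least 1) is an assumption of this sketch."""
    if task == "classification":
        return max(1, int(math.floor(math.sqrt(num_features))))
    if task == "regression":
        return max(1, num_features // 3)
    raise ValueError("task must be 'classification' or 'regression'")

def candidate_features(num_features, mtry, rng):
    """Indices of the features searched at one tree node.

    mtry == num_features: every feature is searched (bagging).
    mtry <  num_features: a fresh random subset is drawn (random forest).
    """
    return sorted(rng.sample(range(num_features), mtry))

rng = random.Random(0)   # seeded only so the sketch is repeatable
D = 9                    # hypothetical number of features

# Bagging: the split search sees all D features at every node.
bagging_node = candidate_features(D, D, rng)

# Random forest: the split search sees only mtry randomly chosen features,
# and a new subset is drawn each time a node is split.
forest_node = candidate_features(D, default_mtry(D, "classification"), rng)
```

Calling `candidate_features` once per node, rather than once per tree, is what matches the "random selection at every node" behaviour described above.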

Original comment by abhirana on 7 Sep 2011 at 12:14

GoogleCodeExporter commented 9 years ago
Hi, Abhirana. Thanks for your reply, and thanks for your great code.
I now understand the differences between random forests and bagging from your
reply, but I am still puzzled by what the MATLAB (2009a) help documents say.
You can see this in the two pictures below: one is about 'treebagger' and the
other is about 'regression and classification by bagging decision tree'.
According to them, bagging also takes the mtry parameter into account. Is my
understanding right? Looking forward to your reply. Thanks.

Original comment by hpu2...@163.com on 7 Sep 2011 at 3:06

Attachments:

GoogleCodeExporter commented 9 years ago
Bagging uses only ntree; only random forests have both ntree and mtry. If they
are putting both together, then bagging == random forests in their case, but
not according to the literature.

Take a look at Leo Breiman's papers on bagging and random forests. The random
forests paper is at http://oz.berkeley.edu/users/breiman/randomforest2001.pdf
(the method is described on pg 9).

Do note that MATLAB nowhere says it is implementing random forests (maybe
because "Random Forests" is a trademarked term), but the description of their
method is that of random forests.

Original comment by abhirana on 7 Sep 2011 at 3:18

GoogleCodeExporter commented 9 years ago
Now I fully understand the differences between random forests and bagging.
Thanks very much.

Original comment by hpu2...@163.com on 7 Sep 2011 at 4:14

GoogleCodeExporter commented 9 years ago
Hi, Abhirana. I have another question and would appreciate your help.
In the trained model I can see the votes, but when I use the model to predict
on the test data, I only get the predicted classes. I would like to see the
votes for the test data as well. Can your code do that?
Thanks.

Original comment by hpu2...@163.com on 15 Sep 2011 at 5:20

GoogleCodeExporter commented 9 years ago
[Y_new, votes, prediction_per_tree] = classRF_predict(X,model, extra_options)

classRF_predict can return the class prediction per tree and the total votes
per class for each example.
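
As an illustration of how per-tree predictions relate to votes, the following Python sketch tallies votes from a table of per-tree class predictions. It is not the classRF_predict implementation: the function name `tally_votes`, the row-per-example layout of `prediction_per_tree`, and the tie-breaking rule are all assumptions made for this example.

```python
from collections import Counter

def tally_votes(prediction_per_tree, classes):
    """Count, for each example, how many trees voted for each class.

    prediction_per_tree[i][t] is the class that tree t predicted for example i
    (assumed layout). Returns (labels, votes), where votes[i][k] is the number
    of trees that voted for classes[k] on example i.
    """
    labels, votes = [], []
    for tree_preds in prediction_per_tree:
        counts = Counter(tree_preds)
        row = [counts.get(c, 0) for c in classes]
        votes.append(row)
        # Majority vote; ties go to the class listed first (an assumption).
        labels.append(classes[row.index(max(row))])
    return labels, votes

# Hypothetical output of 5 trees on 2 test examples, with classes 1 and 2.
per_tree = [[1, 1, 2, 1, 2],
            [2, 2, 2, 1, 2]]
labels, votes = tally_votes(per_tree, classes=[1, 2])
```

Dividing each row of `votes` by the number of trees would give per-class vote fractions, if those are more convenient than raw counts.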

Does that help?

Original comment by abhirana on 15 Sep 2011 at 5:26

GoogleCodeExporter commented 9 years ago
Thanks, that is very helpful.
I had been focusing on classRF_train.m and did not pay much attention to
classRF_predict.m.
Sorry to have caused you so much trouble. Thanks again.

Original comment by hpu2...@163.com on 15 Sep 2011 at 5:46

GoogleCodeExporter commented 9 years ago
No problem, glad to help.

Original comment by abhirana on 15 Sep 2011 at 5:51