Bagging has a single parameter: the number of trees. Every tree is a fully
grown (unpruned) binary tree, and at each node in the tree one searches over
all features to find the feature that best splits the data at that node.
randomforests has 2 parameters. The first is the same as in bagging (the
number of trees). The second (unique to randomforests) is mtry, which is how
many features to search over to find the best feature. mtry is usually D/3 for
regression and sqrt(D) for classification, where D is the total number of
features. So during tree creation, mtry features are chosen at random from all
available features, and the one among them that best splits the data is used.
This random selection is repeated at every node.
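A rough sketch of that per-node difference, in plain MATLAB with no toolbox needed. The feature count D = 12 and the rounding-down of the mtry defaults are assumptions made for illustration, and this is not the actual tree-growing code of the package:

```matlab
D = 12;                                    % hypothetical number of features

% usual mtry defaults (rounding down is an assumption here)
mtry_classification = floor(sqrt(D));      % 3 for D = 12
mtry_regression     = max(floor(D / 3), 1); % 4 for D = 12

% bagging: every node searches over all D features
bagging_candidates = 1:D;

% randomforests: every node draws a fresh random subset of mtry features
mtry = mtry_classification;
for node = 1:3                             % pretend we are visiting 3 nodes
    perm = randperm(D);                    % random ordering of the features
    rf_candidates = perm(1:mtry);          % mtry features, without replacement
    fprintf('node %d candidates: %s\n', node, mat2str(rf_candidates));
end
```

The best split is then searched only among the candidates for that node. Setting mtry = D makes the two candidate sets identical, so randomforests reduces to bagging in that case.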
Original comment by abhirana
on 7 Sep 2011 at 12:14
Hi, Abhirana. Thanks for your reply, and thanks also for your great code.
I now understand the difference between randomforests and bagging from your
reply, but I am still puzzled by what the MATLAB (2009a) help documents say.
You can see it in the two pictures below: one is about 'treebagger' and the
other is about 'regression and classification by bagging decision tree'.
According to their description, bagging also takes the mtry parameter into
account. Is my understanding right? Looking forward to your reply, thanks.
Original comment by hpu2...@163.com
on 7 Sep 2011 at 3:06
Attachments:
Bagging uses only ntree. Only randomforests has both ntree and mtry. If they
are putting the two together, then bagging == randomforests in their case, but
not according to the literature.
Take a look at Leo Breiman's papers on bagging and randomforests. The
randomforests paper is at
http://oz.berkeley.edu/users/breiman/randomforest2001.pdf (pg 9 is where it's
described).
Do note that nowhere does MATLAB say it is implementing randomforests (maybe
because randomforests is a trademarked term), but the description of their
method is for randomforests.
Original comment by abhirana
on 7 Sep 2011 at 3:18
Now I fully understand the differences between randomforests and bagging.
Thanks very much.
Original comment by hpu2...@163.com
on 7 Sep 2011 at 4:14
Hi, Abhirana. I have another question I need your help with.
In the training results of the model I can see the votes, but when I use the
model to predict the test data, I only get the predicted classes. I want to
see the votes for the test data as well. Can your code do that?
Thanks.
Original comment by hpu2...@163.com
on 15 Sep 2011 at 5:20
[Y_new, votes, prediction_per_tree] = classRF_predict(X,model, extra_options)
classRF_predict can return the class prediction per tree as well as the total
votes per class for each example.
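To make the relation between the three outputs concrete, here is a small self-contained sketch of how per-tree predictions add up to votes and a final class. The matrix below is made-up toy data, not output from a real model, and the tallying loop only illustrates the idea, it is not the package's internal code:

```matlab
% toy per-tree predictions: 3 test examples (rows) x 5 trees (columns)
% (hypothetical class labels, made up for illustration)
prediction_per_tree = [1 1 2 1 2;
                       2 2 2 1 2;
                       3 1 3 3 3];
classes = [1; 2; 3];                       % class labels, as a column

n_examples = size(prediction_per_tree, 1);
votes = zeros(n_examples, numel(classes));
for c = 1:numel(classes)
    % number of trees that voted for class c, per example
    votes(:, c) = sum(prediction_per_tree == classes(c), 2);
end

% majority vote: the class with the most votes wins
[max_votes, idx] = max(votes, [], 2);
Y_new = classes(idx);

disp(votes);   % rows: [3 2 0], [1 4 0], [1 0 4]
disp(Y_new);   % 1, 2, 3
```

If I remember the package tutorial correctly, the per-tree output is only filled in when extra_options.predict_all = 1 is set before calling classRF_predict; treat that option name as an assumption and check it against the help text in classRF_predict.m.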
Does that help?
Original comment by abhirana
on 15 Sep 2011 at 5:26
Thanks, it's very helpful.
I had been paying more attention to classRF_train.m and didn't pay much
attention to classRF_predict.m.
Sorry to have caused you so much trouble. Thanks again.
Original comment by hpu2...@163.com
on 15 Sep 2011 at 5:46
No problem.
Glad to help.
Original comment by abhirana
on 15 Sep 2011 at 5:51
Original issue reported on code.google.com by
hpu2...@163.com
on 3 Sep 2011 at 12:59