Open GoogleCodeExporter opened 8 years ago
Hi Kang
Yeah, there is a difference between the regression and classification code. When
creating a tree you need to split the data, but before splitting you need to sort
the data falling into a node. The classification code uses a pre-sorted array, which
makes it scale as O(n) in the number of examples, whereas the regression
code sorts on the fly, which makes it scale as O(n log(n)), the
best scaling a comparison sort can achieve.
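To make the difference concrete, here is a rough sketch in Python (not the actual package code, which is compiled; all function names below are made up for illustration). It orders the examples in one node by one feature in two ways: sorting them on the spot, and filtering a pre-sorted index array that was built once up front. This version scans the whole pre-sorted array for simplicity, and ties are broken by row index just so both orderings are comparable.

```python
def order_by_sorting(values, node_rows):
    """On-the-fly approach: sort the rows in this node by feature value.

    Costs O(m log m) for a node holding m rows, paid again at every node.
    """
    return sorted(node_rows, key=lambda i: (values[i], i))


def presort(values):
    """Pre-sorted approach, step 1: sort all row indices once per feature."""
    return sorted(range(len(values)), key=lambda i: (values[i], i))


def order_by_presorted(presorted, node_rows):
    """Pre-sorted approach, step 2: one linear pass over the sorted indices,
    keeping only the rows that belong to this node. No sorting at the node.
    """
    in_node = set(node_rows)
    return [i for i in presorted if i in in_node]


# Hypothetical feature column and the rows that fell into one node.
values = [0.7, 0.1, 0.9, 0.4, 0.1, 0.6]
node_rows = [5, 0, 4, 3]

presorted = presort(values)  # done once, before growing the tree
print(order_by_sorting(values, node_rows))
print(order_by_presorted(presorted, node_rows))
```

Both calls give the same ordering of the node's rows; the second one just avoids re-sorting at every node.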
I am guessing you have lots of examples, and that's one reason regression might
be slower.
The other reason might be that the regression trees are grown out fully (i.e. the leaf
nodes hold only the minimum number of examples), whereas your classification trees
might be much simpler (a low VC dimension).
Calculate the mean number of nodes per tree in each model you created; that might give you
a better idea:
mean(modelRf.ndbigtree) (classification)
mean(modelRf.ndtree) (regression)
Original comment by abhirana
on 27 Sep 2012 at 10:41
Original issue reported on code.google.com by
KangD...@gmail.com
on 27 Sep 2012 at 8:15