zhangjun001 / randomforest-matlab

Automatically exported from code.google.com/p/randomforest-matlab

Modelling the trees #18

Open GoogleCodeExporter opened 9 years ago

GoogleCodeExporter commented 9 years ago
Hi Abhishek Jaiantilal 

Thanks for the great code. It is well documented!

My dataset is a 50x10000 matrix, and I am using classification trees to 
determine the variables with the greatest importance. However, RF is known to 
work as a black box, so I was wondering...

Is there any way to view each tree and include it in a report? Or is the 
closest option your "extra_options.do_trace" setting, which outputs:

tree      OOB      1      2      3      4      5      6      7      8      9     10     11     12     13     14     15     16     17     18     19     20     21     22     23     24     25     26     27     28     29     30     31     32     33     34     35     36     37     38     39     40     41     42     43     44     45     46     47     48
    1:  56.13%  0.00%  0.00% 14.29% -1.#J% -1.#J%100.00%  0.00% 27.27% 72.73% 61.54% 54.55% 71.43% 92.31% 77.78% 90.00% 87.50% 71.43% 70.59% 70.27% 72.41% 71.43% 70.21% 96.30% 96.88% 91.67% 94.12% 94.44% 86.96% 88.24% 90.48% 92.86% 81.82% 84.85% 84.85% 72.62% 51.44% 50.80% 55.32% 56.55% 72.97% 67.01% 50.00% 50.96% 48.45% 27.18% 28.13% 20.45% 27.27%
(as two rows).

The reason I'm asking is that I found this video, which shows each tree and 
its importance at 4:18:

http://www.youtube.com/watch?v=RE7VO_AB7PI&feature=player_embedded

Thanks again for your program, and for posting your citation in another topic.
Regards Thomas

Original issue reported on code.google.com by ThomasJJ...@gmail.com on 9 Jul 2011 at 2:00

GoogleCodeExporter commented 9 years ago
Hi Thomas

Sorry for my late reply.

Thanks for your helpful comment.

Will something like the attached m file help? You will need Graphviz 
(http://www.graphviz.org/) to visualize the tree. After installing Graphviz, 
choose File -> New and paste the output, then press F5 to get something like 
the attached figure.
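
For readers without the attachment, here is a minimal sketch of the general idea: emitting Graphviz DOT text from a binary tree stored as parallel arrays. It is written in Python for illustration and is not the attached m file. The array names (`left`, `right`, `feature`, `threshold`, `label`) and the `-1` leaf marker are assumptions made for this example, not the fields of this package's model structure.

```python
def tree_to_dot(left, right, feature, threshold, label):
    """Return Graphviz DOT text for a binary tree stored as parallel lists.

    Hypothetical layout: node i is a leaf when left[i] == -1; otherwise it
    tests x[feature[i]] <= threshold[i], going to left[i] if true and to
    right[i] if false.
    """
    lines = ["digraph tree {", "  node [shape=box];"]
    for i in range(len(left)):
        if left[i] == -1:
            # Leaf node: show the predicted class.
            lines.append(f'  n{i} [label="class {label[i]}", shape=ellipse];')
        else:
            # Internal node: show the split test and both outgoing edges.
            lines.append(f'  n{i} [label="x{feature[i]} <= {threshold[i]}"];')
            lines.append(f'  n{i} -> n{left[i]} [label="yes"];')
            lines.append(f'  n{i} -> n{right[i]} [label="no"];')
    lines.append("}")
    return "\n".join(lines)


# Toy tree: one split at the root (node 0) and two leaves (nodes 1 and 2).
left = [1, -1, -1]
right = [2, -1, -1]
feature = [6, 0, 0]        # only used for internal nodes
threshold = [2.5, 0, 0]    # only used for internal nodes
label = [0, 1, 2]          # only used for leaves

dot_text = tree_to_dot(left, right, feature, threshold, label)
print(dot_text)
```

The printed text can be pasted into a Graphviz editor, or saved to a file and rendered with the `dot` command-line tool.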

I think the code reconstructs the tree correctly, though I will debug it a bit 
more to check for errors.

You can get per-tree predictions by looking at the 2nd and 3rd outputs of 
[Y_new, votes, prediction_per_tree] = classRF_predict(X, model, 
extra_options)

That way it is possible to write code that checks whether permuting a feature 
increases the error rate (i.e., feature importance).
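
As a rough illustration of that idea, the sketch below permutes one feature column and measures how much the majority-vote error rate rises. It is written in Python with a stand-in forest of simple threshold rules, so every name in it (`forest_predict`, `permutation_importance`, the toy `trees`) is hypothetical and not part of this package. It also evaluates on a plain data set, whereas random forest implementations typically compute this measure on out-of-bag samples.

```python
import random
from collections import Counter


def forest_predict(trees, X):
    """Return (majority-vote labels, per-tree predictions) for the rows of X."""
    per_tree = [[tree(row) for tree in trees] for row in X]
    voted = [Counter(preds).most_common(1)[0][0] for preds in per_tree]
    return voted, per_tree


def error_rate(y_true, y_pred):
    """Fraction of rows whose predicted label differs from the true label."""
    return sum(a != b for a, b in zip(y_true, y_pred)) / len(y_true)


def permutation_importance(trees, X, y, feature, rng):
    """Increase in error rate after shuffling one feature column of X."""
    baseline, _ = forest_predict(trees, X)
    column = [row[feature] for row in X]
    rng.shuffle(column)
    X_perm = [row[:feature] + [v] + row[feature + 1:]
              for row, v in zip(X, column)]
    permuted, _ = forest_predict(trees, X_perm)
    return error_rate(y, permuted) - error_rate(y, baseline)


# Stand-in "forest": three threshold rules that only look at feature 0.
trees = [lambda row, t=t: int(row[0] > t) for t in (0.4, 0.5, 0.6)]

rng = random.Random(0)
X = [[i / 10, rng.random()] for i in range(10)]  # feature 1 is pure noise
y = [int(row[0] > 0.5) for row in X]

imp_feature0 = permutation_importance(trees, X, y, 0, rng)
imp_feature1 = permutation_importance(trees, X, y, 1, rng)
print("importance of feature 0:", imp_feature0)
print("importance of feature 1:", imp_feature1)
```

Because none of the toy trees uses feature 1, shuffling it cannot change any prediction, so its importance is zero. Feature 0 drives every prediction, so shuffling it can only keep the error the same or raise it.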

Please tell me if you need more information.

Original comment by abhirana on 23 Jul 2011 at 10:57

Attachments:

GoogleCodeExporter commented 9 years ago
Hi Abhishek Jaiantilal 

Thank you for the comprehensive answer. It was easy to install and use by 
logging the MATLAB command window output. However, I had trouble exporting the 
tree in a vector format, even though it is set to output as PDF. 
But I got what I needed! 

Thank you once again.
/Thomas

Original comment by ThomasJJ...@gmail.com on 8 Aug 2011 at 12:44

GoogleCodeExporter commented 9 years ago
Hello Sir,

I tried to graphically view one of the trees of an RF trained with mixed data 
(i.e., categorical features (features 1-10), each with 8 categories in total, 
and numerical features (features 11-20)) using the code given above. 

The resulting tree splits on feature 6 <= 13 versus feature 6 > 13. 
Similarly, it splits feature 8 on the value 12 (the resulting tree is 
attached).

The categorical features only have values between 1 and 8, so why is the tree 
splitting on values greater than 8? 

Thank You

Original comment by Shalini1...@iiitd.ac.in on 27 Jun 2013 at 7:53

Attachments:

GoogleCodeExporter commented 9 years ago
Sorry about my late reply.

If possible, could you post the code/data that you used?

The tree modeling code was designed for numerical features. Maybe that is the reason.

Original comment by abhirana on 3 Jul 2013 at 11:21
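
One possible explanation, which is an assumption and is not confirmed anywhere in this thread: the R randomForest package documents that, for categorical predictors, the split point is stored as an integer whose binary expansion identifies which categories go to the left or right. If the same encoding applies here, values such as 13 or 12 would be category bitmasks rather than thresholds. The Python sketch below decodes such a code. The helper name `categories_in_split` and the bit order (lowest bit corresponds to category 1) are assumptions made for illustration.

```python
def categories_in_split(code, n_categories):
    """List the 1-based categories whose bit is set in an integer split code.

    Assumes bit k (counting from the lowest bit, starting at 0) stands for
    category k + 1.
    """
    return [k + 1 for k in range(n_categories) if (code >> k) & 1]


split_13 = categories_in_split(13, 8)  # 13 is 0b1101
split_12 = categories_in_split(12, 8)  # 12 is 0b1100
print(split_13)  # prints [1, 3, 4]
print(split_12)  # prints [3, 4]
```

Under this reading, a split value of 13 on an 8-category feature would send categories 1, 3, and 4 to one branch and the rest to the other, which would explain why the value exceeds 8.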