tjucxq / randomforest-matlab

Automatically exported from code.google.com/p/randomforest-matlab

about the categorical feature #49

Open GoogleCodeExporter opened 9 years ago

GoogleCodeExporter commented 9 years ago
Hi abhirana,
Thanks for your nice code.
I am not sure how you treat categorical features. If my dataset contains some 
categorical features, how should I convert them into numerical ones that your 
package can use?
Kindly guide me, please.

Original issue reported on code.google.com by zhangleu...@gmail.com on 16 Nov 2012 at 6:27

GoogleCodeExporter commented 9 years ago
Just convert each category to a unique integer code. Say you have {'yes', 
'no', 'maybe'}: those get the numeric values 0, 1, 2 respectively. You can then 
tell the package to treat that particular feature as either a categorical 
feature or a numeric feature.
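
As a rough illustration of the encoding described above, here is a minimal sketch in Python rather than MATLAB. The helper `encode_categories` and the sample data are hypothetical and are not part of the randomforest-matlab package; the sketch only shows the idea of giving each category its own integer code.

```python
def encode_categories(values, categories=None):
    """Map each category to an integer code 0, 1, 2, ...

    If `categories` is omitted, codes follow order of first appearance.
    """
    if categories is None:
        # dict.fromkeys keeps insertion order and drops duplicates
        categories = list(dict.fromkeys(values))
    code_of = {cat: i for i, cat in enumerate(categories)}
    return [code_of[v] for v in values], list(categories)


# Hypothetical column of answers, using the category order from the comment
answers = ["yes", "no", "maybe", "no", "yes"]
codes, categories = encode_categories(answers, ["yes", "no", "maybe"])
print(codes)  # [0, 1, 2, 1, 0]
```

The codes are only labels; their numeric order carries no meaning unless the feature is treated as numeric.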

Look at the categorical test code example:
http://code.google.com/p/randomforest-matlab/source/browse/trunk/RF_Class_C/tutorial_ClassRF.m#256

You may have to use the svn source to get the categorical data working.

Original comment by abhirana on 16 Nov 2012 at 7:17

GoogleCodeExporter commented 9 years ago
Thank you for your reply.
Can I understand it as follows?
There will be a vector (extra_options.categorical_feature) indicating which 
features are categorical. If I convert the categorical feature (the third 
feature in the original feature space) into a numerical code, then the third 
entry of extra_options.categorical_feature will be 1 and the rest will be 0?

Original comment by zhangleu...@gmail.com on 16 Nov 2012 at 7:21

GoogleCodeExporter commented 9 years ago
Yup, exactly. extra_options.categorical_feature is of size 1xD (one entry per 
feature); if the corresponding vector value is 0 the feature is numerical, and 
if it's 1 the feature is categorical.
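
To make the 1xD flag vector concrete, here is a small sketch in Python (not the package's MATLAB code). The helper `categorical_mask` and the choice of D = 5 are assumptions for illustration; in MATLAB the resulting row vector would be what gets assigned to extra_options.categorical_feature.

```python
def categorical_mask(num_features, categorical_columns):
    """Return a 0/1 list of length num_features.

    `categorical_columns` uses 1-based positions, as MATLAB would.
    """
    mask = [0] * num_features
    for col in categorical_columns:
        if not 1 <= col <= num_features:
            raise ValueError(f"column {col} is outside 1..{num_features}")
        mask[col - 1] = 1  # shift to Python's 0-based indexing
    return mask


# Hypothetical dataset with D = 5 features, where the third is categorical
mask = categorical_mask(5, [3])
print(mask)  # [0, 0, 1, 0, 0]
```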

Original comment by abhirana on 16 Nov 2012 at 7:39

GoogleCodeExporter commented 9 years ago
Yeah, but I still want to know how random forest handles these categorical 
attributes. What is the difference between categorical and non-categorical 
features?
Could you provide some details here?
Thanks.

Original comment by zhangleu...@gmail.com on 16 Nov 2012 at 8:07

GoogleCodeExporter commented 9 years ago
As RFs are based on the CART algorithm, you can look into how CART 
distinguishes between categorical and non-categorical features.
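
For readers who want a starting point, the usual textbook distinction in CART is that a numeric feature is split by a threshold (x <= t), while a categorical feature is split by sending a subset of its categories to one side. The Python sketch below enumerates the candidate splits for each case. It illustrates generic CART only; the function names are hypothetical, and it makes no claim about how this package implements the search internally.

```python
from itertools import combinations


def numeric_splits(values):
    """Candidate thresholds: midpoints between consecutive distinct values."""
    distinct = sorted(set(values))
    return [(a + b) / 2 for a, b in zip(distinct, distinct[1:])]


def categorical_splits(codes):
    """Candidate left-hand category subsets.

    The smallest code always stays on the left so that mirror-image
    splits are not listed twice; the left side is never the full set.
    """
    cats = sorted(set(codes))
    if not cats:
        return []
    first, rest = cats[0], cats[1:]
    splits = []
    for size in range(len(rest)):
        for extra in combinations(rest, size):
            splits.append({first, *extra})
    return splits


codes = [0, 1, 2, 1, 0]  # e.g. yes=0, no=1, maybe=2
thresholds = numeric_splits(codes)
subsets = categorical_splits(codes)
print(thresholds)  # [0.5, 1.5]
print(subsets)     # [{0}, {0, 1}, {0, 2}]
```

Treated as numeric, the codes can only be split as {0} vs {1, 2} or {0, 1} vs {2}. Treated as categorical, the grouping {0, 2} vs {1} also becomes available, which is why the flag matters when the codes have no natural order.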

Original comment by abhirana on 16 Nov 2012 at 8:33

GoogleCodeExporter commented 9 years ago
OK, thanks.

Original comment by zhangleu...@gmail.com on 16 Nov 2012 at 10:06