tjucxq / randomforest-matlab

Automatically exported from code.google.com/p/randomforest-matlab

Classification Validation #36

Closed GoogleCodeExporter closed 9 years ago

GoogleCodeExporter commented 9 years ago
Hi

I'm trying to use RF to do pixel classification on images of size 101*101 pixels.

There are 18 features corresponding to each pixel and the number of classes is 
3. Also, my dataset contains 70 images.

Reading Leo Breiman and Adele Cutler website: 

http://www.stat.berkeley.edu/~breiman/RandomForests/cc_home.htm

They said: "In random forests, there is no need for cross-validation or a 
separate test set to get an unbiased estimate of the test set error. It is 
estimated internally, during the run"

I was going to use k-fold cross-validation (if k = 5, build my model using 65 images and test it on the 5 left-over images) to validate the RF classification results, which is time consuming, but it seems I don't need to do that.

But I noticed that in tutorial_ClassRF.m you split your dataset into two sets, training and test, and after building the model you run it on the test set. Could you please clarify this? How can I use this property of random forests via your code?

Best,
Saleh

Original issue reported on code.google.com by m.saleh....@gmail.com on 11 May 2012 at 2:20

GoogleCodeExporter commented 9 years ago
The forest is an ensemble of multiple trees, and each tree is constructed by sampling the training examples with replacement (bagging). That means about 63.2% of the training data is used to construct each tree (but different trees get different training examples due to bagging).

RF uses this property: each tree has about 100% - 63.2% ≈ 36.8% of the data that was not used for its training, and that serves as a validation set. It is used to predict the out-of-bag (OOB) examples for that tree (search for OOB in the tutorial file).

Usually 5x2 CV or 10-fold CV is the standard when reporting an algorithm's performance, and the OOB idea is limited to classifiers that use an ensemble plus bagging. So people usually (as with SVM) split the training data into training + validation, choose the best model on validation, and then use those parameters to train a single model on the training set and predict on the test set.
Whereas if you are using RF, you don't have to create a validation set: use all the training data to create models, find the model with the lowest OOB error, and then use that model to predict on the test set. I usually use all the training data, set a fixed ntree = 1000, search over multiple mtry values, mtry = D/10:D/10:D (where D = number of features), choose the model with the lowest OOB error, and use that model to predict on the test set.
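A minimal sketch of that mtry search, assuming the classRF_train / classRF_predict interface from tutorial_ClassRF.m; the model field holding the per-tree OOB error (errtr below) is an assumption, so check the tutorial file for the exact name in your version:

```matlab
% Sketch: fixed ntree, grid search over mtry, keep the model with the
% lowest OOB error, then predict once on the held-out test set.
% Assumes X_trn/Y_trn/X_tst/Y_tst are already loaded and that the model
% struct exposes the OOB error as model.errtr (verify in tutorial_ClassRF.m).
D       = size(X_trn, 2);                  % number of features
ntree   = 1000;
mtrys   = unique(round(D/10 : D/10 : D));  % candidate mtry values
bestErr = inf;

for m = mtrys
    model  = classRF_train(X_trn, Y_trn, ntree, m);
    oobErr = model.errtr(end, 1);          % OOB error after all ntree trees
    if oobErr < bestErr
        bestErr   = oobErr;
        bestMtry  = m;
        bestModel = model;
    end
end

Y_hat  = classRF_predict(X_tst, bestModel);  % test-set prediction with the best model
tstErr = mean(Y_hat ~= Y_tst);
fprintf('best mtry = %d, oob err = %.4f, test err = %.4f\n', bestMtry, bestErr, tstErr);
```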

Comparing it to SVM: I would ideally create 10 different folds, randomly pick one fold for validation, eight for training, and one for test, then parametrically search over various kernels etc. by creating models on the training folds and predicting on the validation fold. Once I find the best model parameters, I create a single model using training + validation and then predict on the test fold, and I do this many times.

Original comment by abhirana on 11 May 2012 at 4:51

GoogleCodeExporter commented 9 years ago
I meant that for 10-fold CV and SVM I would do the following:

I would ideally create 10 different folds, randomly pick one fold for validation, eight for training, and one for test, then parametrically search over various kernels etc. by creating models on the training folds and predicting on the validation fold. Once I find the best model parameters, I create a single model using training + validation and then predict on the test fold, and I do this many times.
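A rough sketch of that protocol (fold bookkeeping only; trainAndScoreSVM below is a hypothetical placeholder marking where the real SVM training/scoring calls would go):

```matlab
% 10 folds: one held out for test, one for validation, the remaining
% eight for training; repeat with each fold playing the test role.
nFold  = 10;
n      = size(X, 1);                       % X: n x D features, Y: n x 1 labels
fold   = mod(randperm(n), nFold) + 1;      % random fold label 1..10 per example
tstErr = zeros(1, nFold);

for t = 1:nFold
    valF   = mod(t, nFold) + 1;            % a different fold for validation
    trnIdx = fold ~= t & fold ~= valF;     % the eight training folds
    valIdx = fold == valF;
    tstIdx = fold == t;

    % parameter search: train on the eight folds, score on the validation
    % fold (trainAndScoreSVM is hypothetical):
    % bestParams = argmin over kernels/C of
    %   trainAndScoreSVM(X(trnIdx,:), Y(trnIdx), X(valIdx,:), Y(valIdx), params);

    % refit on training + validation with bestParams, predict on the
    % untouched test fold, and record the error for this repetition:
    % tstErr(t) = trainAndScoreSVM(X(trnIdx|valIdx,:), Y(trnIdx|valIdx), ...
    %                              X(tstIdx,:), Y(tstIdx), bestParams);
end
```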

Original comment by abhirana on 11 May 2012 at 5:00

GoogleCodeExporter commented 9 years ago
Thank you for your explanations. I just want to make sure that I understand your meaning:

I still need to have a separate test set to test the best model on, but since the validation part is done internally in RF, I don't need a training + validation split and can use all the training data for training, correct? If so, I am still confused about what Breiman's website says about not needing a separate test set.

I want to compare the results of RF with an fKNN classifier on my dataset. For fKNN I leave one subject (101*101 pixels) out to validate the accuracy and use 69 subjects (69*101*101 pixels) as the training set. In order to make a fair comparison, is it correct to use the same method, i.e., create the best model on the training set, test the model on the subject that is left out, and repeat this for every other subject?

Sorry, but I still cannot understand how I can evaluate the best model without using any separate test set, as is said on Breiman's website.

Appreciate your help and time.

Original comment by m.saleh....@gmail.com on 11 May 2012 at 7:22

GoogleCodeExporter commented 9 years ago
I still need to have a separate test set to test the best model on, but since the validation part is done internally in RF, I don't need a training + validation split and can use all the training data for training, correct? If so, I am still confused about what Breiman's website says about not needing a separate test set.

- Yup, this is correct. Breiman showed that the OOB error gives an upper bound on the validation-set error. The reason a test set is not required is that the results on validation / OOB error are similar to those on the test set, and RF usually behaves nicely with around 1000 trees and the default mtry parameter, which is probably why they say a separate test set is not needed. But for publishable results, and to be comparable with how other classifiers are reported, it is important to do a training + (validation) + test split.

I want to compare the results of RF with an fKNN classifier on my dataset. For fKNN I leave one subject (101*101 pixels) out to validate the accuracy and use 69 subjects (69*101*101 pixels) as the training set. In order to make a fair comparison, is it correct to use the same method, i.e., create the best model on the training set, test the model on the subject that is left out, and repeat this for every other subject?

- Yeah, or instead of leave-one-out go with 5x2 CV or 10-fold CV, which might be faster. Also make sure the splits used to train/test kNN are the same splits used for training/testing RF, so that you can do some paired testing on the results.
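For the paired testing part, a small sketch, assuming accRF and accKNN are 1-by-nFold vectors of per-fold accuracies collected on the same fold assignment (ttest is from the Statistics Toolbox):

```matlab
% Paired t-test on per-fold accuracies obtained on identical splits.
% accRF and accKNN must come from the same fold assignment for the
% pairing to be meaningful.
[h, p] = ttest(accRF, accKNN);   % tests whether the mean per-fold difference is zero
fprintf('paired t-test: p = %.4f (h = %d)\n', p, h);
```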

Sorry, but I still cannot understand how I can evaluate the best model without using any separate test set, as is said on Breiman's website.

- Well, let's say you do not fix any parameters of RF except setting it to 1000 trees and the default mtry value, and then create a bunch of trees; for each tree, part of the dataset is not used for training (due to bagging). Now use the individual trees to predict on all the examples that were not used to train them, take the ensemble votes on those examples (the out-of-bag examples for those trees), and report those results.

Now consider what you do with a typical classifier: you ideally create a training/test split, divide the training set into training + validation to pick the best parameters, create a single model with the training set and the best parameters, and then use that model to predict on the test set. You do this many times and report the final test error. This is no different from an individual tree in the forest, which trains on its own unique dataset and predicts on a held-out set, repeated over a ton of different trees. The only difference is that, because it is an ensemble, it takes the final votes over the held-out examples at the very end. Some research has shown that a held-out validation error and the OOB error tend to be similar, and if you use all the data in your dataset then the OOB error takes the place of the test error (ooberr = tsterr).
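To make the vote-aggregation step concrete, here is a minimal sketch of an OOB estimate for a hand-rolled bagged ensemble. It uses MATLAB's fitctree (Statistics and Machine Learning Toolbox) as a stand-in for the trees inside the forest (no per-split mtry sampling here) and assumes numeric labels in Y; the package computes all of this internally.

```matlab
% Hand-rolled OOB estimate: each tree trains on a bootstrap sample and
% votes only on the examples it never saw; the ensemble decision for
% each example is the majority of those out-of-bag votes.
ntree   = 100;
n       = size(X, 1);                    % X: n x D features, Y: n x 1 numeric labels
classes = unique(Y);
votes   = zeros(n, numel(classes));      % OOB votes per example, per class

for t = 1:ntree
    inbag = randi(n, n, 1);              % bootstrap sample (with replacement)
    oob   = setdiff(1:n, inbag);         % the ~36.8% left out of this tree
    tree  = fitctree(X(inbag, :), Y(inbag));
    yhat  = predict(tree, X(oob, :));    % predict only on out-of-bag examples
    for c = 1:numel(classes)
        votes(oob, c) = votes(oob, c) + (yhat == classes(c));
    end
end

[~, winner] = max(votes, [], 2);         % majority vote over OOB predictions
oobErr = mean(classes(winner) ~= Y);     % OOB error over the whole training set
```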

Original comment by abhirana on 11 May 2012 at 8:01

GoogleCodeExporter commented 9 years ago
Thank you so much for the clarification. I think I now have a better understanding of what Breiman said. If I use the whole dataset for training and compute the OOB error, that will be similar to the test error, but in order to publish the classification results as the classifier's accuracy and compare with other classifiers, I'd better do a CV.

Thanks again!

Original comment by m.saleh....@gmail.com on 11 May 2012 at 9:27

GoogleCodeExporter commented 9 years ago

Original comment by abhirana on 19 Dec 2012 at 9:07