Closed jsherrah closed 10 years ago
In the interests of doing a full first-pass, this is the next issue to address.
We have the parameters for this data set. We need to train a classifier on the training+validation features, producing the classifier and adjacency file, then perform classification and labelling on the test set and compare against ground truth to get the final accuracy number. How will it compare with the literature, I wonder? See issue https://github.com/RockStarCoders/alienMarkovNetworks/issues/21: state of the art is 84-87% global accuracy and 77-78% class-average accuracy (the preferred metric).
To start, we need a training-plus-validation set. On the VM:
cd /vagrant/msrcData
mkdir trainingPlusValidation
cp -r training/* trainingPlusValidation/
cp -r validation/* trainingPlusValidation/
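One caveat with the two `cp -r` calls above: if training and validation contain a file with the same relative path, the second copy silently overwrites the first. A minimal sketch of a pre-check follows, assuming nothing about the MSRC folder layout beyond "two directory trees". The function name `find_collisions` is hypothetical and not part of this repo.

```python
import os


def find_collisions(src_a, src_b):
    """Return the sorted relative file paths present under both directory trees."""

    def relative_files(root):
        # Collect every file path under root, expressed relative to root.
        found = set()
        for dirpath, _dirnames, filenames in os.walk(root):
            for name in filenames:
                full_path = os.path.join(dirpath, name)
                found.add(os.path.relpath(full_path, root))
        return found

    return sorted(relative_files(src_a) & relative_files(src_b))
```

Calling `find_collisions("/vagrant/msrcData/training", "/vagrant/msrcData/validation")` before copying should return an empty list if the merge is safe; any paths it does return would be clobbered by the second `cp -r`.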
To create the features:
cd /vagrant/alienMarkovNetworks
./createFeatures.py --type=pkl --nbSuperPixels=400 --superPixelCompactness=10 /vagrant/msrcData/trainingPlusValidation /vagrant/features/msrcTrainingPlusValidation_slic-400-010.00
Now training the classifier:
./trainClassifier.py --outfile /vagrant/classifier_msrc_rf_400-10_trainPlusValidation.pkl --type=randyforest --paramSearchFolds=0 --ftrsTest=/vagrant/features/msrcTest_slic-400-010.00_ftrs.pkl --labsTest=/vagrant/features/msrcTest_slic-400-010.00_labs.pkl /vagrant/features/msrcTrainingPlusValidation_slic-400-010.00_ftrs.pkl /vagrant/features/msrcTrainingPlusValidation_slic-400-010.00_labs.pkl --rf_max_features=75 --rf_min_samples_split=10 --rf_n_estimators=500 --rf_max_depth=50 --rf_min_samples_leaf=5
Training set accuracy (frac correct) = 0.959907818091
Test set accuracy (frac correct) = 0.646707651285
Classifying each test image:
./classifyAllImages.sh /vagrant/classifier_msrc_rf_400-10_trainPlusValidation.pkl /vagrant/results/imagesClassified/test /vagrant/msrcData/test/Images/*.bmp
./labelAllImagesGivenProbs.sh /vagrant/features/msrcTrainingPlusValidation_slic-400-010.00_adj.pkl 0.2 /vagrant/results/imagesLabelled/test /vagrant/results/imagesClassified/test/*.pkl
./evalPredictions.py /vagrant/results/imagesLabelled/test/evalpairs.csv /vagrant/msrcData/test ''
And the answer is...
Avg prediction accuracy=67.8957163724
Over 257 predictions
Assuming this is global accuracy, 67.9% falls well short of the state of the art of 84-87%. We need class-average values too.
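For reference, here is a rough sketch of how the two metrics differ, assuming labels are available as flat per-pixel (or per-superpixel) sequences. It is not what `evalPredictions.py` does; the function name `accuracy_metrics` and the `ignore_label` parameter (for skipping void/unlabelled ground truth) are hypothetical.

```python
from collections import Counter


def accuracy_metrics(true_labels, predicted_labels, ignore_label=None):
    """Return (global accuracy, class-average accuracy) as fractions in [0, 1]."""
    if len(true_labels) != len(predicted_labels):
        raise ValueError("label sequences must have the same length")

    total = Counter()    # number of ground-truth samples per class
    correct = Counter()  # number of correctly predicted samples per class
    for truth, pred in zip(true_labels, predicted_labels):
        if truth == ignore_label:
            continue  # skip void / unlabelled ground truth
        total[truth] += 1
        if truth == pred:
            correct[truth] += 1

    n_samples = sum(total.values())
    if n_samples == 0:
        raise ValueError("no labelled samples to evaluate")

    # Global: fraction of all samples that are correct.
    global_acc = sum(correct.values()) / n_samples
    # Class average: mean of per-class accuracy, each class weighted equally.
    class_avg = sum(correct[c] / total[c] for c in total) / len(total)
    return global_acc, class_avg


truth = ["grass"] * 8 + ["cow"] * 2
pred = ["grass"] * 10
print(accuracy_metrics(truth, pred))  # (0.8, 0.5)
```

The toy example shows why class average is the harsher metric: labelling everything as the dominant class still gets 80% globally, but only 50% once each class counts equally.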
Next will be a second pass over everything, particularly the image features, to improve accuracy.
So far we have used the training set to train the classifier and the validation set to select hyper-parameters. To get the best result on the test set, we now re-train the classifier with those optimal parameters on the training-plus-validation set.