Training and validation of an image classifier pipeline

elboyran commented 7 years ago

Follow the Image Category Classifier example, but also use our own scripts. Use some default choices, such as SURF features, keeping 80% of the strongest features, and a multi-class SVM, but also our own parameters:

vocabulary sizes of 10, 20 and 50
SURF point locations 'Detector' and 'Grid' (default settings for 'Grid': GridStep is [8 8] and BlockWidth is [32 64 96 128] (> 100??))
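
A rough sketch of how the parameter loop could look, not the actual project script. The folder path, the 70/30 train/test split, and the `results` struct are assumptions made for illustration; `bagOfFeatures`, `trainImageCategoryClassifier` and `evaluate` are the Computer Vision Toolbox functions used in the Image Category Classifier example.

```matlab
% Hypothetical data location; replace with the actual image folder.
imds = imageDatastore(fullfile('data', 'tiles'), ...
    'IncludeSubfolders', true, 'LabelSource', 'foldernames');

% Hold out part of the data for testing (the 70/30 split is an assumption).
[trainingSet, testSet] = splitEachLabel(imds, 0.7, 'randomize');

vocabularySizes = [10 20 50];
pointSelections = {'Detector', 'Grid'};

% One entry per tested combination of parameters.
results = struct('VocabularySize', {}, 'PointSelection', {}, 'Accuracy', {});

for v = vocabularySizes
    for p = 1:numel(pointSelections)
        % Build the visual vocabulary, keeping 80% of the strongest features.
        bag = bagOfFeatures(trainingSet, ...
            'VocabularySize', v, ...
            'PointSelection', pointSelections{p}, ...
            'StrongestFeatures', 0.8);

        % Train the multi-class SVM on the encoded training images.
        classifier = trainImageCategoryClassifier(trainingSet, bag);

        % evaluate returns a normalized confusion matrix; the mean of its
        % diagonal is the average per-class accuracy.
        confMat = evaluate(classifier, testSet);
        results(end+1) = struct('VocabularySize', v, ...
            'PointSelection', pointSelections{p}, ...
            'Accuracy', mean(diag(confMat)));
    end
end
```

For 'Grid', the GridStep and BlockWidth options of `bagOfFeatures` could be passed in the same call if the defaults need changing.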

Apply to dataset 6, where 100 px = 80 m.

[x] make script skeleton
[x] Balance categories
[x] Looping over some parameters
[ ] Implement cross-validation
[x] Make better output
[x] Publish
[x] Determine the best classifier from the tested vocabulary sizes and SURF location points
[x] Run with best classifier options and save the model, publish.
[x] Share with partners

elboyran commented 7 years ago

To balance the categories, use:

tbl = countEachLabel(imds); % count the images in each category
minSetCount = min(tbl{:,2}); % determine the smallest number of images in a category

% Use the splitEachLabel method to trim the set.
imds = splitEachLabel(imds, minSetCount, 'randomize');

% Notice that each set now has exactly the same number of images.
countEachLabel(imds)

elboyran commented 7 years ago

Conclusions:

BoVW training with different vocabulary sizes doesn't affect the number of features (?), only the number of iterations. The number of features is 104016 when using 'Detector' locations and 4867200 when using 'Grid' locations.
Training the classifier (multi-class linear SVM) with the SURF features from the previous step.
Evaluation of the classifier's performance. The measures used are:

Confusion matrix (for now normalized)
accuracy
sensitivity
specificity
precision
recall
Fscore
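
For reference, a minimal sketch of how these measures can be derived per class from a confusion matrix. The matrix values are made up for illustration, and raw counts are assumed (rows are true classes, columns are predicted classes). With balanced categories, a row-normalized matrix gives the same values. Note that sensitivity and recall are the same quantity.

```matlab
% Made-up confusion matrix of raw counts (rows: true class, columns: predicted).
confMat = [50  3  2;
            4 45  6;
            1  5 49];

total = sum(confMat(:));
tp = diag(confMat);              % true positives per class
fn = sum(confMat, 2) - tp;       % false negatives per class
fp = sum(confMat, 1)' - tp;      % false positives per class
tn = total - tp - fn - fp;       % true negatives per class

accuracy    = sum(tp) / total;   % overall accuracy
sensitivity = tp ./ (tp + fn);   % identical to recall
specificity = tn ./ (tn + fp);
precision   = tp ./ (tp + fp);
recall      = sensitivity;
fscore      = 2 * (precision .* recall) ./ (precision + recall);
```

All measures except accuracy are column vectors with one value per class; averaging them gives a single summary value per measure.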

Main conclusions:

The performance on the training and testing sets is comparable.
The 'Detector' interest point locations are better than using 'Grid' locations.
A larger vocabulary size gives better results.

==>> Of the tested options, the best is a BoVW vocabulary size of 50 with 'Detector' locations.

For full details see C:\Projects\DynaSlum\Results\Classification3Classes\PerformanceComparision\html_classifier

DynaSlum / SatelliteImaging

Training and validation of an image classifier pipeline #37
