modify createFeatures.py to handle MSRC dataset protocol

jsherrah commented 10 years ago

Rather than randomly sampling images to form the training/validation/test split, createFeatures.py needs to respect the file listing given by Shotton et al.
It also needs to use just the 21 classes.

Ant, I think the most generic/elegant way of doing this is to put the training, val and test images into separate sub-directories, either manually or with a script, and then give createFeatures.py command-line options for the 3 directories. It's a bit safer too, because we can't accidentally mix image sets in the code. As a safety check, you can really hold out the test set by removing its directory, to make sure we didn't cheat.
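
For the scripted route, something like the following might do as a starting point. It is only a sketch: the listing file names, the one-file-name-per-line format and the `train`/`val`/`test` sub-directory names are assumptions, not taken from the Shotton et al. distribution or from this repo, and `read_listing` and `split_into_dirs` are hypothetical helpers.

```python
import shutil
from pathlib import Path


def read_listing(listing_path):
    """Return the non-empty lines of a listing file (assumed: one image file name per line)."""
    with open(listing_path) as f:
        return [line.strip() for line in f if line.strip()]


def split_into_dirs(image_dir, listings, out_root):
    """Copy each listed image into out_root/<split>/.

    `listings` maps a split name (e.g. "train") to the path of its listing file.
    Raises ValueError if an image appears in more than one listing, so the
    image sets cannot be mixed by accident.
    Returns a dict mapping image file name -> split name.
    """
    image_dir, out_root = Path(image_dir), Path(out_root)
    assigned = {}
    for split, listing_path in listings.items():
        dest = out_root / split
        dest.mkdir(parents=True, exist_ok=True)
        for name in read_listing(listing_path):
            if name in assigned:
                raise ValueError(
                    "{} is listed in both {} and {}".format(name, assigned[name], split)
                )
            assigned[name] = split
            shutil.copy2(str(image_dir / name), str(dest / name))
    return assigned
```

Calling `split_into_dirs(image_dir, {"train": "Train.txt", "val": "Validation.txt", "test": "Test.txt"}, out_root)` (listing names hypothetical) would leave three sub-directories that can be passed to createFeatures.py separately, and the test one can simply be moved away while training.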

amb-enthusiast commented 10 years ago

I've added a bit of logic to createFeatures.py so that when only an input dir and an output dir are specified, it is assumed that all the images in the input dir will be processed into a single feature dataset. Also, when saving the results, I've added a boolean check so that if we have the default split, no "train"/"test"/"cv" tag is added to the output file name.
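
Roughly, the naming rule works like the sketch below. This is an illustration rather than the actual code in createFeatures.py: `output_paths` is a hypothetical helper, and where the tag sits in the tagged file names is an assumption. Only the untagged `_ftrs`/`_labs` form matches the files used in the commands in this comment.

```python
def output_paths(out_prefix, split=None, file_type="csv"):
    """Build the feature and label file names for one dataset.

    split=None is the default case (all images in one dataset), so no tag
    is added. Otherwise the split name ("train", "test" or "cv") is
    inserted after the prefix (assumed position).
    """
    tag = "" if split is None else "_" + split
    features = "{}{}_ftrs.{}".format(out_prefix, tag, file_type)
    labels = "{}{}_labs.{}".format(out_prefix, tag, file_type)
    return features, labels
```

With the default split, `output_paths("/home/amb/dev/mrf/classifiers/data/superpixel")` gives the `superpixel_ftrs.csv` and `superpixel_labs.csv` paths that are passed to trainClassifier.py.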

I've also commented out the "horse" and "mountain" classes in the msrc_classToRGB map in pomio.py.

I've run this on my machine with a "mini MSRC" dataset directory (34 images): python createFeatures.py "/home/amb/dev/mrf/data/miniMSRC" "/home/amb/dev/mrf/classifiers/data/superpixel" --type "csv"

FeatureGenerator.py didn't barf, saving results to file worked, and I'm currently training a classifier: python trainClassifier.py "/home/amb/dev/mrf/classifiers/data/superpixel_ftrs.csv" "/home/amb/dev/mrf/classifiers/data/superpixel_labs.csv" "/home/amb/dev/mrf/classifiers/logisticRegression/superpixel/logReg_miniMSRC.pkl" --type "logreg"

jsherrah commented 10 years ago

great!

RockStarCoders / alienMarkovNetworks
