chaubold opened 8 years ago
I don't see how it can be that the Transition Classifier keeps such a good result even for weird thresholds.
Thanks for generating the plots!
Well, the transition classifier learns to be pretty certain about its decisions: only very few samples have a predicted probability around 0.5 (see below).
The distance-based classification will definitely be wrong for some samples at every distance where both true positive and true negative examples occur.
I've put the ipython notebook with some additional graphs in your folder. Let's try this on some other (and much more complicated) datasets tomorrow.
We want to know how much better the transition classifier does with respect to the pure-distance based computation of a "transition probability".
For this, predict the probabilities (add a `predictProbability` method to the TransitionClassifier) for all validation samples using the trained random forest. Now threshold the probabilities `p` at some threshold `t` in [0, 1]: e.g. when `t = 0.3`, every sample with a probability `p > t` for being a good transition will be classified as positive, otherwise negative. (The Random Forest's `predictLabel` method does this with `t = 0.5`.)

Do the same with transition probabilities derived from distances as follows:
Then compute precision, recall, and f-measure for each threshold and plot a graph that looks roughly like the following (curves are made up!):

![img_20151210_115043](https://cloud.githubusercontent.com/assets/16854/11713680/28560066-9f35-11e5-8b19-f6e94a2c2fcd.jpg)
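The thresholding and evaluation steps could be sketched roughly like this (a minimal sketch, not code from the repository; `y_true` and `rf_probs` are placeholder arrays standing in for the validation labels and the output of the proposed `predictProbability` method):

```python
import numpy as np

def pr_f1_at_threshold(y_true, probs, t):
    """Classify positive when p > t, then compute precision, recall and F-measure."""
    y_true = np.asarray(y_true, dtype=bool)
    pred = np.asarray(probs) > t          # positive iff predicted probability exceeds t
    tp = np.sum(pred & y_true)            # true positives
    fp = np.sum(pred & ~y_true)           # false positives
    fn = np.sum(~pred & y_true)           # false negatives
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = 2 * precision * recall / (precision + recall) if precision + recall else 0.0
    return precision, recall, f1

# Sweep thresholds t in [0, 1] and collect one (precision, recall, f1) triple per t.
thresholds = np.linspace(0.0, 1.0, 21)
y_true = [1, 1, 0, 0]                     # placeholder validation labels
rf_probs = [0.9, 0.6, 0.4, 0.1]          # placeholder predicted probabilities
curves = [pr_f1_at_threshold(y_true, rf_probs, t) for t in thresholds]
# Plot e.g. with matplotlib: plt.plot(thresholds, [c[2] for c in curves])
```

Running the same sweep a second time with the distance-derived transition probabilities in place of `rf_probs` gives the second set of curves for the comparison plot.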