damellis / ESP

The Example-based Sensor Predictions (ESP) system applies machine learning to real-time sensor data.
BSD 3-Clause "New" or "Revised" License

Custom / additional scoring of training samples for ANBC and GMM. #264

Open damellis opened 8 years ago

damellis commented 8 years ago

Currently, we have a generic scorer that's based on the information gain of the sample (i.e. the negative log of the probability of classifying the sample correctly). This can, I think, be calculated for all classifier types, but it's only really useful for those with a relatively smooth, gradual probability distribution over the feature space. For example, it works well for SVM classifiers (e.g. the Touché example).
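As a concrete sketch of that generic score (the function name and `eps` guard are hypothetical, not part of the ESP codebase):

```python
import math

def information_gain_score(p_correct, eps=1e-12):
    """Score a training sample as the negative log of the probability
    the classifier assigns to its correct class. Confident samples
    (p near 1) score near 0; low-confidence samples score high.
    eps guards against log(0)."""
    return -math.log(max(p_correct, eps))
```

This works well when `p_correct` varies smoothly across the feature space, but degenerates when the classifier only ever reports probabilities near 0 or 1.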

This approach is not great for Naive Bayes on a few features (e.g. accelerometer poses), because the predicted probability is often either 0 or 1. Here, it might be better to score samples by their distance to the predicted class (e.g. to its mean in feature space) rather than by the probability.
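A distance-based score along those lines might look like the following sketch, assuming ANBC's per-class Gaussian model with diagonal covariance (function and parameter names are hypothetical):

```python
import math

def distance_score(sample, class_mean, class_std):
    """Normalized Euclidean (diagonal Mahalanobis) distance from a sample
    to the mean of its predicted class. Unlike a 0-or-1 probability, this
    varies smoothly even for samples far from the training data."""
    return math.sqrt(sum(
        ((x - m) / s) ** 2
        for x, m, s in zip(sample, class_mean, class_std)
    ))
```

A sample sitting exactly on the class mean would score 0, and the score grows continuously as the sample moves away, which gives a usable ranking even when the Bayes posterior has already saturated.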

It's also not very good for DTW, because the probabilities don't tend towards 100% as additional training data is collected. Here we may need something more custom: retrain the model with the new sample added, find the new exemplar template for the sample's class, and take its distance to the old exemplar template.
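That custom procedure could be sketched like this, assuming 1-D sequences, an absolute-difference cost, and "exemplar" meaning the medoid template (all names are hypothetical; GRT's DTW internals differ):

```python
def dtw_distance(a, b):
    """Classic dynamic-programming DTW distance between two 1-D sequences."""
    n, m = len(a), len(b)
    INF = float("inf")
    D = [[INF] * (m + 1) for _ in range(n + 1)]
    D[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            cost = abs(a[i - 1] - b[j - 1])
            D[i][j] = cost + min(D[i - 1][j], D[i][j - 1], D[i - 1][j - 1])
    return D[n][m]

def exemplar(templates):
    """Pick the medoid: the template with minimum total DTW distance
    to all templates in the class."""
    return min(templates, key=lambda t: sum(dtw_distance(t, u) for u in templates))

def dtw_sample_score(new_sample, class_templates):
    """Score a new sample by how far it shifts the class exemplar:
    retrain (re-pick the exemplar) with the sample included, then take
    the DTW distance between the new and old exemplars."""
    old = exemplar(class_templates)
    new = exemplar(class_templates + [new_sample])
    return dtw_distance(new, old)
```

A score of 0 means the new sample didn't change the exemplar at all; a large score flags a sample that substantially reshapes the class's representative template.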

damellis commented 8 years ago

This should maybe be an additional (classifier-specific) score rather than a replacement for our current information gain (the negative log of the likelihood for the assigned class). Our current approach does provide useful information w.r.t. the separability of classes. The additional score could be useful for understanding the impact of a new sample on the distribution of an ANBC or GMM classifier.
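One way to report both metrics side by side (a hypothetical sketch, not the ESP interface):

```python
import math

def score_sample(p_assigned, classifier_specific_score):
    """Report the existing information-gain score alongside a
    classifier-specific score, rather than replacing one with the other."""
    return {
        "information_gain": -math.log(max(p_assigned, 1e-12)),
        "classifier_specific": classifier_specific_score,
    }
```

Keeping both lets the UI surface class separability (information gain) and the sample's effect on the class distribution (the new score) independently.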