**Closed** · palisn closed this issue 5 months ago
As a replacement for the AUC, one could also consider some kind of distance function that would promote confidence, like a normalized Euclidean distance.
For now, instead of the AUC, we opt for the average ~~Manhattan distance~~ absolute difference from the actual label, converted to a metric ranging from 0 to 1. This is mostly due to implementation complications: neither the AUC nor a normalized Euclidean distance is reasonably easy to integrate. Since we wanted to settle on a metric quickly, we finished it with the current implementation and left room for future changes.
The combined metric is currently a weighted average with the following composition:

| Weight (%) | Metric |
|---|---|
| 60 | fbeta with $\beta = 2$ |
| 20 | binary_accuracy |
| 10 | precision |
| 10 | inv_distance |
where, if $\overline{d}$ is the average absolute difference between the prediction and the actual label, `inv_distance` is defined as

$\texttt{inv\_distance} = \text{clip}(1 - 2 \cdot \overline{d},\ 0,\ 1)$.
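A minimal sketch of the weighted combination described above, assuming NumPy and hypothetical function/argument names (the issue does not specify the actual implementation); the four component scores are taken as precomputed inputs:

```python
import numpy as np

def inv_distance(y_true, y_pred):
    """clip(1 - 2 * mean(|y_pred - y_true|), 0, 1).

    Rewards confident predictions close to the true labels; a model that
    is off by 0.5 on average already scores 0.
    """
    d_bar = np.mean(np.abs(np.asarray(y_pred, dtype=float)
                           - np.asarray(y_true, dtype=float)))
    return float(np.clip(1.0 - 2.0 * d_bar, 0.0, 1.0))

def combined_metric(fbeta2, binary_accuracy, precision, inv_dist):
    """Weighted average with the weights from the table (they sum to 1)."""
    return (0.6 * fbeta2
            + 0.2 * binary_accuracy
            + 0.1 * precision
            + 0.1 * inv_dist)
```

For example, perfect predictions (`y_pred == y_true`) give `inv_distance == 1.0`, while maximally wrong confident predictions give `0.0`; the clip keeps the component in the same $[0, 1]$ range as the other three metrics so the weights compose cleanly.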
Commit 42ef6e4db30e485756bef918f6e0615aa341d3b2 resolves this issue.
This is related to #7.
A single-value metric should be introduced for more straightforward model optimization. The metric should combine multiple weighted desired properties into a single value and give a good estimate of how well the model fits our criteria.
A possible composition might include:
Further considerations concerning the metric include: