Log loss checking - Githubissues

gngdb commented 9 years ago

We need to find which classes we're worst at on the validation set (specifically not the test set). To do this we need to be able to visualise per-class performance (probably in an IPython notebook) for a given set of predictions on the validation set. The predictions could be saved as pickle or CSV and loaded in, so the analysis code is agnostic to the model. It's probably also worth having Hinton diagrams for confusion matrices in the same notebook.
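As a rough sketch of the confusion-matrix part, assuming integer class labels and hard predictions already loaded from a saved file: the function names (`confusion_matrix`, `row_normalise`, `hinton_sides`) and the toy labels below are made up for illustration and are not taken from the repository. Plotting is left out; a Hinton diagram draws one square per matrix entry with area proportional to its magnitude, so the sketch only computes the side lengths.

```python
import math

def confusion_matrix(y_true, y_pred, n_classes):
    """counts[i][j] = number of examples of true class i predicted as class j."""
    counts = [[0] * n_classes for _ in range(n_classes)]
    for t, p in zip(y_true, y_pred):
        counts[t][p] += 1
    return counts

def row_normalise(counts):
    """Divide each row by its total so classes of different sizes are comparable."""
    out = []
    for row in counts:
        total = sum(row)
        out.append([c / total if total else 0.0 for c in row])
    return out

def hinton_sides(matrix):
    """Side length per square, chosen so that area is proportional to the entry."""
    biggest = max(max(row) for row in matrix) or 1.0
    return [[math.sqrt(v / biggest) for v in row] for row in matrix]

# Toy validation labels and predictions (made up): 3 classes.
y_true = [0, 0, 0, 1, 1, 2, 2, 2]
y_pred = [0, 0, 1, 1, 2, 2, 2, 2]

counts = confusion_matrix(y_true, y_pred, n_classes=3)
normalised = row_normalise(counts)

# Per-class recall is the diagonal of the row-normalised matrix;
# sorting ascending puts the classes we are worst at first.
recall = {k: normalised[k][k] for k in range(3)}
worst_first = sorted(recall, key=recall.get)

sides = hinton_sides(normalised)
```

The `sides` values could then be passed to whatever plotting code the notebook uses, for example as the width and height of one rectangle per cell.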

The idea is that we should be able to look at these difficult classes and work on some feature engineering (in the training set) to patch our model and slightly improve our score.

gngdb commented 9 years ago

In this notebook we would also want to see the distribution of log loss over classes for the validation set.
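One way this could be computed, as a minimal sketch: group the per-example log loss (the negative log of the probability assigned to the true class) by true class and average within each group. The function name `per_class_log_loss`, the clipping value `eps`, and the toy probabilities are assumptions for illustration, not the code in the notebook.

```python
import math
from collections import defaultdict

def per_class_log_loss(y_true, probs, eps=1e-15):
    """Mean negative log-probability of the true class, grouped by true class."""
    losses = defaultdict(list)
    for label, row in zip(y_true, probs):
        # Clip so that a predicted probability of exactly 0 does not give log(0).
        p = min(max(row[label], eps), 1 - eps)
        losses[label].append(-math.log(p))
    return {label: sum(v) / len(v) for label, v in losses.items()}

# Toy validation data (made up): 3 classes, one probability row per example.
y_true = [0, 0, 1, 1, 2]
probs = [
    [0.8, 0.1, 0.1],
    [0.6, 0.3, 0.1],
    [0.2, 0.5, 0.3],
    [0.7, 0.2, 0.1],
    [0.1, 0.1, 0.8],
]

class_loss = per_class_log_loss(y_true, probs)

# Classes ordered from highest mean log loss (worst) to lowest.
worst_first = sorted(class_loss, key=class_loss.get, reverse=True)
```

Keeping the per-example losses in `losses` before averaging would also allow a histogram per class rather than just the mean.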

gngdb commented 9 years ago

Most of this can be found in the notebook called Validation scoring.py, but it isn't really an exhaustive analysis yet; it just looks at some examples where the network scores badly. It needs more work to come up with some kind of boosted addition to our model.

Neuroglycerin / neukrill-net-work

Log loss checking #43