Neuroglycerin / neukrill-net-tools

Tools coded as part of the NDSB competition.
MIT License

Confusion Matrix Clustering #104

Open scottclowe opened 9 years ago

scottclowe commented 9 years ago

We want to automate clustering of classes together to try to improve classification accuracy with a hierarchical model.

To do this, we will run cross-validation and get predictions, compute the confusion matrix, and cluster the confusion matrix into "larger squares".
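As a rough sketch of the "compute the confusion matrix" step, the snippet below counts (true, predicted) pairs with NumPy. It assumes the held-out predictions from cross-validation have already been collected into integer label arrays. The function name and the toy labels are made up for illustration and are not part of this repository.

```python
import numpy as np

def confusion_matrix_from_predictions(y_true, y_pred, n_classes):
    """Count how often each true class (row) is predicted as each class (column)."""
    cm = np.zeros((n_classes, n_classes), dtype=int)
    # np.add.at accumulates correctly even when the same (true, pred) pair repeats
    np.add.at(cm, (np.asarray(y_true), np.asarray(y_pred)), 1)
    return cm

# Toy labels standing in for held-out predictions gathered across CV folds
y_true = np.array([0, 0, 0, 1, 1, 1, 2, 2, 2])
y_pred = np.array([0, 1, 0, 1, 0, 1, 2, 2, 2])
cm = confusion_matrix_from_predictions(y_true, y_pred, n_classes=3)
print(cm)
```

Here classes 0 and 1 are confused with each other while class 2 is not, which is the kind of off-diagonal structure the clustering is meant to pick up.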

So we need a function which, given a confusion matrix, clusters it appropriately.

scottclowe commented 9 years ago

Dragos found a SciPy function that will be useful for this: scipy.cluster.hierarchy.linkage. Docs: http://docs.scipy.org/doc/scipy/reference/cluster.hierarchy.html

See example usage here: https://stackoverflow.com/questions/18770348/hierarchical-clustering-from-confusion-matrix-with-python
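A minimal sketch of how linkage could be applied to a confusion matrix follows. The conversion from confusions to distances (row-normalise, symmetrise, take one minus the mean confusion rate) is an assumption chosen for illustration, not something settled in this thread. The function name, the "average" linkage method and the toy matrix are likewise hypothetical.

```python
import numpy as np
from scipy.cluster.hierarchy import linkage, fcluster
from scipy.spatial.distance import squareform

def cluster_confusion_matrix(cm, n_clusters):
    """Group classes that are frequently confused with each other."""
    cm = np.asarray(cm, dtype=float)
    # Row-normalise so each row is a rate; rows with no samples stay at zero
    row_sums = cm.sum(axis=1, keepdims=True)
    rates = np.divide(cm, row_sums, out=np.zeros_like(cm), where=row_sums > 0)
    # Assumed similarity: mean confusion rate in both directions
    similarity = (rates + rates.T) / 2.0
    distance = 1.0 - similarity
    np.fill_diagonal(distance, 0.0)
    # linkage takes a condensed distance vector, which squareform produces
    Z = linkage(squareform(distance), method="average")
    labels = fcluster(Z, t=n_clusters, criterion="maxclust")
    return Z, labels

# Toy matrix: classes 0/1 confuse each other, as do classes 2/3
toy_cm = [[8, 2, 0, 0],
          [3, 7, 0, 0],
          [0, 0, 9, 1],
          [0, 0, 2, 8]]
Z, labels = cluster_confusion_matrix(toy_cm, n_clusters=2)
print(labels)
```

The returned cluster labels could then define the coarse level of a hierarchical model, with one sub-classifier per cluster.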

scottclowe commented 9 years ago

We should set up the confusion matrix visualisation first so we can see the clustering is being done correctly.
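One way to check the clustering by eye is to reorder the rows and columns of the confusion matrix by the dendrogram's leaf order, so that confused classes sit next to each other and show up as blocks along the diagonal. The sketch below does only the reordering and prints the result as text. The helper name and toy matrix are hypothetical, and clustering directly on the matrix rows with Euclidean distance is an assumption made to keep the example short.

```python
import numpy as np
from scipy.cluster.hierarchy import linkage, leaves_list

def reorder_confusion_matrix(cm, Z):
    """Permute rows and columns of cm to follow the dendrogram leaf order."""
    cm = np.asarray(cm)
    order = leaves_list(Z)
    # np.ix_ applies the same permutation to both axes
    return cm[np.ix_(order, order)], order

# Toy matrix where the confused pairs (0, 2) and (1, 3) are interleaved
toy_cm = np.array([[8, 0, 2, 0],
                   [0, 7, 0, 3],
                   [2, 0, 8, 0],
                   [0, 3, 0, 7]])
# Assumption: treat each row as an observation vector for linkage
Z = linkage(toy_cm.astype(float), method="average")
reordered, order = reorder_confusion_matrix(toy_cm, Z)

print("class order:", order.tolist())
for row in reordered:
    print(" ".join(f"{v:3d}" for v in row))
```

The reordered array can be passed to any image-style plot (for example matplotlib's imshow) once the visualisation code exists.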

scottclowe commented 9 years ago

scikit-learn supports the same kind of hierarchical clustering: http://scikit-learn.org/stable/modules/clustering.html#hierarchical-clustering

scottclowe commented 9 years ago

This doesn't seem to be very useful, as the "hierarchy" from the classifier is pretty much just one-vs-rest repeated 121 times.

scottclowe commented 9 years ago

For sklearn, taking the feature vector from each image and clustering those might work better. I can't see a good solution for an automated hierarchy for the CNN.
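A rough sketch of the feature-vector idea: average the feature vectors of each class into a centroid, then cluster the centroids hierarchically. Reducing each class to its mean, the Ward linkage method, the function name and the synthetic data are all assumptions made for illustration. Clustering individual images rather than class centroids would be another reading of the idea.

```python
import numpy as np
from scipy.cluster.hierarchy import linkage, fcluster

def cluster_classes_by_features(features, labels, n_clusters):
    """Group classes whose mean feature vectors lie close together."""
    features = np.asarray(features, dtype=float)
    labels = np.asarray(labels)
    classes = np.unique(labels)
    # One centroid per class: the mean feature vector of its images
    centroids = np.vstack([features[labels == c].mean(axis=0) for c in classes])
    Z = linkage(centroids, method="ward")
    groups = fcluster(Z, t=n_clusters, criterion="maxclust")
    return dict(zip(classes.tolist(), groups.tolist()))

# Synthetic features: classes 0/1 lie near each other, as do classes 2/3
rng = np.random.default_rng(0)
centres = np.array([[0.0, 0.0], [0.5, 0.0], [10.0, 10.0], [10.5, 10.0]])
labels = np.repeat(np.arange(4), 20)
features = centres[labels] + rng.normal(scale=0.1, size=(80, 2))

groups = cluster_classes_by_features(features, labels, n_clusters=2)
print(groups)
```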

I don't think this issue is a priority.