Output: a 3D tensor with dimensions (0: patches, 1: width, 2: height). We only have one channel, so no extra channel dimension is needed.
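A minimal sketch of building that tensor, assuming grayscale images stored as 2-D NumPy arrays; the function name and the random-sampling strategy are illustrative, not fixed by the plan:

```python
import numpy as np

def extract_random_patches(image, patch_size, n_patches, rng):
    """Sample n_patches square patches from a 2-D grayscale image.

    Returns a 3-D tensor of shape (n_patches, patch_size, patch_size):
    axis 0 = patches, axis 1 = width, axis 2 = height. There is only
    one channel, so no channel axis is added.
    """
    h, w = image.shape
    rows = rng.integers(0, h - patch_size + 1, size=n_patches)
    cols = rng.integers(0, w - patch_size + 1, size=n_patches)
    return np.stack([image[r:r + patch_size, c:c + patch_size]
                     for r, c in zip(rows, cols)])

rng = np.random.default_rng(0)
image = rng.random((64, 64))          # stand-in for one training image
patches = extract_random_patches(image, patch_size=8, n_patches=100, rng=rng)
```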
Using k-means (Luc + Tom)
Run k-means on these patches
Output: k-means centroids after i iterations
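A sketch of this step using scikit-learn's MiniBatchKMeans (an assumption; any k-means implementation would do). Patches are flattened to vectors for clustering, and the centroids are reshaped back to K x W x H for the common part later:

```python
import numpy as np
from sklearn.cluster import MiniBatchKMeans

rng = np.random.default_rng(0)
patches = rng.random((500, 8, 8))             # toy (patches, width, height) tensor

# k-means operates on vectors, so flatten each 8x8 patch to length 64.
X = patches.reshape(len(patches), -1)

K = 16                                        # number of centroids (tunable)
kmeans = MiniBatchKMeans(n_clusters=K, max_iter=10, n_init=3, random_state=0)
kmeans.fit(X)

# Centroids after the iterations, reshaped to K x W x H.
centroids = kmeans.cluster_centers_.reshape(K, 8, 8)
```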
Using Restricted Boltzmann Machines or using Gaussian Processes (EM) (Robbert + Inez)
Feed patches into RBM/GP
Output: the RBM hidden weights or the GP joint probability (unsure how to extract feature representations from a GP)
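For the RBM branch, a sketch using scikit-learn's BernoulliRBM (an assumption; it expects inputs in [0, 1]). The learned hidden weights are reshaped to K x W x H so they plug into the common parts exactly like the k-means centroids; how to do the same for a GP is still open, as noted above:

```python
import numpy as np
from sklearn.neural_network import BernoulliRBM

rng = np.random.default_rng(0)
patches = rng.random((500, 8, 8))             # values in [0, 1] for BernoulliRBM

X = patches.reshape(len(patches), -1)         # flatten to (n_patches, 64)

K = 16                                        # number of hidden units
rbm = BernoulliRBM(n_components=K, n_iter=5, learning_rate=0.05, random_state=0)
rbm.fit(X)

# The hidden-unit weights act as the K feature "filters";
# reshape to K x W x H to match the k-means centroids.
features = rbm.components_.reshape(K, 8, 8)
```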
Common parts (Guido + rest)
Take the feature representations (either the k-means centroids, the RBM weights, or the GP output), which are of the form K x W x H, where K is the number of centroids/hidden neurons used.
Convolve the input training set (extract all possible patches in a systematic way) (+ optionally whiten) and compute the distance between all these patches and the K feature representations. For the distance metric, see Coates' paper. The output is of the form sqrt(P) x sqrt(P) x K, where P is the number of patches extracted and K is the number of centroids. We call the values in this tensor the 'activations'.
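A sketch of this extraction step, assuming Euclidean distance and the soft "triangle" activation from Coates' paper (f_k = max(0, mean(z) - z_k), where z_k is the distance to centroid k); the loop-based implementation is for clarity, not speed, and whitening is omitted:

```python
import numpy as np

def activations_for_image(image, centroids, patch_size, stride=1):
    """Extract every patch, compare it with the K centroids, and return
    an (n_rows, n_cols, K) activation map (the sqrt(P) x sqrt(P) x K tensor)."""
    K = len(centroids)
    D = centroids.reshape(K, -1)               # K x (patch_size^2)
    h, w = image.shape
    rows = range(0, h - patch_size + 1, stride)
    cols = range(0, w - patch_size + 1, stride)
    out = np.zeros((len(rows), len(cols), K))
    for i, r in enumerate(rows):
        for j, c in enumerate(cols):
            x = image[r:r + patch_size, c:c + patch_size].ravel()
            z = np.linalg.norm(D - x, axis=1)  # distance to each centroid
            # "Triangle" activation: zero out below-average similarities.
            out[i, j] = np.maximum(0.0, z.mean() - z)
    return out

rng = np.random.default_rng(0)
image = rng.random((16, 16))
centroids = rng.random((4, 5, 5))              # K = 4 toy centroids
acts = activations_for_image(image, centroids, patch_size=5)
```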
Pooling: sum the activations in each quadrant (north-west, north-east, south-west, south-east). This results in a 4 x K matrix, which we flatten into a vector of length 4K. The 4 x K matrix is obtained by summing out the patch indices: for each centroid we get four values, one per quadrant.
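The pooling step can be sketched as follows; splitting the activation map at the midpoints is an assumption for odd-sized maps:

```python
import numpy as np

def quadrant_pool(acts):
    """Sum activations in each quadrant (NW, NE, SW, SE).

    acts has shape (rows, cols, K); the result is a flat vector of
    length 4K: four quadrant sums per centroid."""
    rows, cols, K = acts.shape
    r2, c2 = rows // 2, cols // 2
    quads = [acts[:r2, :c2], acts[:r2, c2:],   # north-west, north-east
             acts[r2:, :c2], acts[r2:, c2:]]   # south-west, south-east
    pooled = np.stack([q.sum(axis=(0, 1)) for q in quads])   # 4 x K
    return pooled.ravel()                                    # length 4K

rng = np.random.default_rng(0)
acts = rng.random((12, 12, 4))                 # toy activation map, K = 4
feature_vector = quadrant_pool(acts)
```

Note that the four quadrant sums together account for every activation exactly once, so no information is double-counted.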
The vector of length 4K is used (together with the image label) as input to the classifier; any classifier can be used. The scikit-learn SGDClassifier (or SGDRegressor, in our case) is very easy and fast but has some extra hyperparameters. As an alternative, logistic regression can be used. Output of the classifier: a trained model. Model prediction output: a vector of length 121 whose values indicate the probability of the image belonging to each class.
Optionally: run a Random Forest after this. Not strictly necessary, but it may improve performance.
Normalize the output so that each probability lies between 0 and 1 and the distribution sums to 1.
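A minimal sketch of that normalization, assuming raw scores may fall outside [0, 1] (e.g. from a regressor); negatives are clipped before rescaling:

```python
import numpy as np

def normalize_scores(scores, eps=1e-12):
    """Clip scores to be non-negative and rescale so they sum to 1."""
    scores = np.clip(scores, 0.0, None)
    return scores / max(scores.sum(), eps)

raw = np.array([0.2, -0.1, 0.5, 0.4])          # toy raw model outputs
probs = normalize_scores(raw)
```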
Convolutional neural networks (deep learning) (Steven + Guido if there is time)
I think the second common part of Coates' method will take the most time to implement. However, once the first common part is done, the k-means group and the RBM/GP group can keep working, so the second common part can be done last.
I will have my thesis repo cleaned by tomorrow, probably tonight.