jonasbarth / fds-2022-final-project

Final Project for Fundamentals of Data Science 2022.

Histogram of Oriented Gradients (KNN etc.) #7

Closed jonasbarth closed 1 year ago

jonasbarth commented 1 year ago

What do we want to do?

Create histograms of oriented gradients and use them as input to a classifier, starting with KNN.

How

Preprocessing

The goal of the preprocessing is to create a Pipeline object that we can reuse on the test data. For every image:

  1. Load image
  2. Downsample (optional) (try 75%, 50%, 25% and evaluate)
  3. Apply a Gaussian blur (from OpenCV or scikit-image)
  4. Create a histogram of oriented gradients for a selection of channels, using hog from skimage; the output will be a single 2D matrix. Play around with the parameters pixels_per_cell and cells_per_block (see the documentation). The output can be flattened by setting the boolean feature_vector parameter.
  5. Save the histograms in a separate directory.
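
The steps above could be sketched as a scikit-learn Pipeline roughly like this. The `preprocess` helper and all parameter values here are illustrative assumptions, not settled choices:

```python
import numpy as np
from skimage.feature import hog
from skimage.filters import gaussian
from skimage.transform import rescale
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import FunctionTransformer

def preprocess(images, scale=0.5, orientations=8,
               pixels_per_cell=(16, 16), cells_per_block=(1, 1)):
    """Downsample, blur, and turn each greyscale image into a flat HOG vector."""
    feats = []
    for img in images:
        img = rescale(img, scale, anti_aliasing=True)   # step 2: downsample
        img = gaussian(img, sigma=1)                    # step 3: Gaussian blur
        feats.append(hog(img, orientations=orientations,
                         pixels_per_cell=pixels_per_cell,
                         cells_per_block=cells_per_block,
                         feature_vector=True))          # step 4: flattened HOG
    return np.array(feats)

# A reusable Pipeline object, as described above, so the same
# transformation can later be applied to the test data.
hog_pipeline = Pipeline([("hog", FunctionTransformer(preprocess))])
```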

Evaluation

  1. Apply pipeline to test data
  2. Get scores for the test data, using the confusion matrix from sklearn.
  3. Select different features (channels) and retry. We can use cross validation (via sklearn).
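
A minimal sketch of steps 2 and 3, using toy random features in place of the real HOG matrices (the shapes and labels here are made up for illustration):

```python
import numpy as np
from sklearn.metrics import confusion_matrix
from sklearn.model_selection import cross_val_score
from sklearn.neighbors import KNeighborsClassifier

# Toy stand-ins for the real HOG feature matrices produced by the pipeline.
rng = np.random.default_rng(0)
X_train, y_train = rng.random((40, 128)), np.arange(40) % 2
X_test, y_test = rng.random((10, 128)), np.arange(10) % 2

knn = KNeighborsClassifier(n_neighbors=3)
knn.fit(X_train, y_train)

# Step 2: confusion matrix on the held-out test data.
cm = confusion_matrix(y_test, knn.predict(X_test), labels=[0, 1])

# Step 3: cross validation on the training data.
cv_scores = cross_val_score(knn, X_train, y_train, cv=5)
```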

Visualisations

Somehow show the scores of the model, perhaps with the same kind of plot as in the paper.

Mamiglia commented 1 year ago

While browsing the Sklearn documentation I came across the KBinsDiscretizer, which looks like exactly what we need to build histograms.
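
A small sketch of what that could look like on toy orientation values (the angles and bin count here are made-up examples, not our data):

```python
import numpy as np
from sklearn.preprocessing import KBinsDiscretizer

# Made-up gradient orientations in degrees, just for illustration.
angles = np.array([[3.0], [12.0], [95.0], [110.0], [170.0], [178.0]])

# Ordinal encoding assigns each value the index of its bin, so counting
# the indices gives an 8-bin histogram. Note: "uniform" spaces the bins
# between the observed min and max, not over a fixed 0-180 range.
disc = KBinsDiscretizer(n_bins=8, encode="ordinal", strategy="uniform")
bin_ids = disc.fit_transform(angles).ravel().astype(int)
hist = np.bincount(bin_ids, minlength=8)
```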

Also, this article might be useful for understanding pipelines, @MattiaCastaldo

jonasbarth commented 1 year ago

First Results:

| channels | n neighbours | orientations | pixels per cell | cells per block | accuracy |
|----------|--------------|--------------|-----------------|-----------------|----------|
| 1,2,3    | 3 | 8 | (16, 16) | (1, 1) | 0.49 |
| 1,2,3    | 5 | 8 | (16, 16) | (1, 1) | 0.51 |
| 1,2,3    | 7 | 8 | (16, 16) | (1, 1) | 0.54 |
| 6,7,8    | 3 | 8 | (16, 16) | (1, 1) | 0.46 |
| 6,7,8    | 5 | 8 | (16, 16) | (1, 1) | 0.5  |
| 6,7,8    | 7 | 8 | (16, 16) | (1, 1) | 0.48 |
| 13,14,15 | 3 | 8 | (16, 16) | (1, 1) | 0.47 |
| 13,14,15 | 5 | 8 | (16, 16) | (1, 1) | 0.47 |
| 13,14,15 | 7 | 8 | (16, 16) | (1, 1) | 0.5  |

We need to play with the hog parameters.
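
One way to explore them is a simple grid over orientations and pixels_per_cell; this sketch only compares the resulting feature-vector sizes on a random stand-in image (all values here are illustrative, not chosen settings):

```python
from itertools import product

import numpy as np
from skimage.feature import hog

img = np.random.rand(64, 64)  # random stand-in for one preprocessed channel

# See how the feature-vector size changes across parameter combinations;
# in a real sweep we would feed each variant to the classifier instead.
sizes = {}
for orientations, ppc in product([8, 9, 12], [(8, 8), (16, 16)]):
    feats = hog(img, orientations=orientations, pixels_per_cell=ppc,
                cells_per_block=(1, 1), feature_vector=True)
    sizes[(orientations, ppc)] = feats.size
```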

jonasbarth commented 1 year ago

Actually, it looks like the data I was working with was wrong and/or incomplete. We had a bug in the preprocessing which gave us a lot of empty images. That is now fixed and the scores are much better:

[image: hog_knn_confusion_matrix]

martinezvelascojavier commented 1 year ago

When training a Logistic Regression model with HOGs, we find that, whereas the RGB channels and channels 6, 7, 8 fail to converge, channels 10, 11, and 12 have an "almost perfect" performance.


Looks like very good news :D
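
For the channels that fail to converge, the usual remedies are standardizing the features and raising max_iter; a hedged sketch on toy data (not our actual setup):

```python
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

# Toy features and labels in place of the real HOG channels.
rng = np.random.default_rng(0)
X, y = rng.random((60, 128)), np.arange(60) % 2

# Standardizing the inputs and raising max_iter are the standard fixes
# when LogisticRegression warns that it failed to converge.
clf = make_pipeline(StandardScaler(), LogisticRegression(max_iter=1000))
clf.fit(X, y)
train_acc = clf.score(X, y)
```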

Mamiglia commented 1 year ago

This is so cool!