jonasbarth / fds-2022-final-project

Final Project for Fundamentals of Data Science 2022.

Histogram of Oriented Gradients (KNN etc.) #7

Closed jonasbarth closed 1 year ago

jonasbarth commented 1 year ago

What do we want to do?

Create histograms of oriented gradients and use them as input to a classifier, starting with KNN.

How

Preprocessing

The goal of the preprocessing is to create a Pipeline object that we can reuse on the test data. For every image:

  1. Load image
  2. Downsample (optional) (try 75%, 50%, 25% and evaluate)
  3. Apply a Gaussian blur (from OpenCV or scikit-image)
  4. Create a histogram of oriented gradients for a selection of channels, using hog from skimage; the output will be a single 2D matrix. Play around with the parameters pixels_per_cell and cells_per_block (see the documentation). The output can be flattened by setting the boolean feature_vector parameter.
  5. Save the histograms in a separate directory.
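
The steps above could be sketched as a scikit-learn Pipeline roughly like this. The `preprocess` helper and all parameter values here are illustrative assumptions, not settled choices:

```python
import numpy as np
from skimage.feature import hog
from skimage.filters import gaussian
from skimage.transform import rescale
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import FunctionTransformer

def preprocess(images, scale=0.5, orientations=8,
               pixels_per_cell=(16, 16), cells_per_block=(1, 1)):
    """Downsample, blur, and turn each greyscale image into a flat HOG vector."""
    feats = []
    for img in images:
        img = rescale(img, scale, anti_aliasing=True)   # step 2: downsample
        img = gaussian(img, sigma=1)                    # step 3: Gaussian blur
        feats.append(hog(img, orientations=orientations,
                         pixels_per_cell=pixels_per_cell,
                         cells_per_block=cells_per_block,
                         feature_vector=True))          # step 4: flattened HOG
    return np.array(feats)

# A reusable Pipeline object, as described above, so the same
# transformation can later be applied to the test data.
hog_pipeline = Pipeline([("hog", FunctionTransformer(preprocess))])
```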

Evaluation

  1. Apply pipeline to test data
  2. Get scores for the test data, using the confusion matrix from sklearn.
  3. Select different features (channels) and retry. We can use cross validation (via sklearn).
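
A minimal sketch of steps 2 and 3, using toy random features in place of the real HOG matrices (the shapes and labels here are made up for illustration):

```python
import numpy as np
from sklearn.metrics import confusion_matrix
from sklearn.model_selection import cross_val_score
from sklearn.neighbors import KNeighborsClassifier

# Toy stand-ins for the real HOG feature matrices produced by the pipeline.
rng = np.random.default_rng(0)
X_train, y_train = rng.random((40, 128)), np.arange(40) % 2
X_test, y_test = rng.random((10, 128)), np.arange(10) % 2

knn = KNeighborsClassifier(n_neighbors=3)
knn.fit(X_train, y_train)

# Step 2: confusion matrix on the held-out test data.
cm = confusion_matrix(y_test, knn.predict(X_test), labels=[0, 1])

# Step 3: cross validation on the training data.
cv_scores = cross_val_score(knn, X_train, y_train, cv=5)
```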

Visualisations

Somehow show the scores of the model, perhaps with the same kind of plot as in the paper.

Mamiglia commented 1 year ago

While browsing the Sklearn documentation I came across the KBinsDiscretizer, which looks like exactly what we need to build histograms.
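
A small sketch of what that could look like on toy orientation values (the angles and bin count here are made-up examples, not our data):

```python
import numpy as np
from sklearn.preprocessing import KBinsDiscretizer

# Made-up gradient orientations in degrees, just for illustration.
angles = np.array([[3.0], [12.0], [95.0], [110.0], [170.0], [178.0]])

# Ordinal encoding assigns each value the index of its bin, so counting
# the indices gives an 8-bin histogram. Note: "uniform" spaces the bins
# between the observed min and max, not over a fixed 0-180 range.
disc = KBinsDiscretizer(n_bins=8, encode="ordinal", strategy="uniform")
bin_ids = disc.fit_transform(angles).ravel().astype(int)
hist = np.bincount(bin_ids, minlength=8)
```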

Also, this article might be useful for understanding pipelines, @MattiaCastaldo

jonasbarth commented 1 year ago

First Results:

| channels | n neighbours | orientations | pixels per cell | cells per block | accuracy |
|----------|--------------|--------------|-----------------|-----------------|----------|
| 1,2,3    | 3 | 8 | (16, 16) | (1, 1) | 0.49 |
| 1,2,3    | 5 | 8 | (16, 16) | (1, 1) | 0.51 |
| 1,2,3    | 7 | 8 | (16, 16) | (1, 1) | 0.54 |
| 6,7,8    | 3 | 8 | (16, 16) | (1, 1) | 0.46 |
| 6,7,8    | 5 | 8 | (16, 16) | (1, 1) | 0.5  |
| 6,7,8    | 7 | 8 | (16, 16) | (1, 1) | 0.48 |
| 13,14,15 | 3 | 8 | (16, 16) | (1, 1) | 0.47 |
| 13,14,15 | 5 | 8 | (16, 16) | (1, 1) | 0.47 |
| 13,14,15 | 7 | 8 | (16, 16) | (1, 1) | 0.5  |

We need to play with the hog parameters.
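
One way to explore them is a simple grid over orientations and pixels_per_cell; this sketch only compares the resulting feature-vector sizes on a random stand-in image (all values here are illustrative, not chosen settings):

```python
from itertools import product

import numpy as np
from skimage.feature import hog

img = np.random.rand(64, 64)  # random stand-in for one preprocessed channel

# See how the feature-vector size changes across parameter combinations;
# in a real sweep we would feed each variant to the classifier instead.
sizes = {}
for orientations, ppc in product([8, 9, 12], [(8, 8), (16, 16)]):
    feats = hog(img, orientations=orientations, pixels_per_cell=ppc,
                cells_per_block=(1, 1), feature_vector=True)
    sizes[(orientations, ppc)] = feats.size
```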

jonasbarth commented 1 year ago

Actually, it looks like the data I was working with was wrong and/or incomplete. We had a bug in the preprocessing which gave us a lot of empty images. That is now fixed and the scores are much better:

[image: hog_knn_confusion_matrix]

martinezvelascojavier commented 1 year ago

When training a Logistic Regression model with HOGs, we find that, whereas the RGB channels and channels 6, 7, 8 fail to converge, channels 10, 11, and 12 have an "almost perfect" performance.


Looks like very good news :D
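
For the channels that fail to converge, the usual remedies are standardizing the features and raising max_iter; a hedged sketch on toy data (not our actual setup):

```python
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

# Toy features and labels in place of the real HOG channels.
rng = np.random.default_rng(0)
X, y = rng.random((60, 128)), np.arange(60) % 2

# Standardizing the inputs and raising max_iter are the standard fixes
# when LogisticRegression warns that it failed to converge.
clf = make_pipeline(StandardScaler(), LogisticRegression(max_iter=1000))
clf.fit(X, y)
train_acc = clf.score(X, y)
```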

Mamiglia commented 1 year ago

This is so cool!