Perform Naive Bayes on the full data set

trz-maier commented 4 years ago

output before image pre-processing using stratified sampling and a 0.7/0.3 split
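For reference, a stratified 0.7/0.3 split can be sketched in plain Python as below. This is only an illustration of the idea, not the Weka configuration that produced the output; `stratified_split`, the seed, and the toy labels are all made up for the example.

```python
import random
from collections import defaultdict

def stratified_split(labels, train_fraction=0.7, seed=0):
    """Return (train_idx, test_idx) so each class keeps roughly the same share in both parts."""
    # Group instance indices by their class label.
    by_class = defaultdict(list)
    for idx, label in enumerate(labels):
        by_class[label].append(idx)

    rng = random.Random(seed)  # fixed seed so the split is reproducible
    train_idx, test_idx = [], []
    for label in sorted(by_class):
        indices = by_class[label]
        rng.shuffle(indices)
        cut = round(len(indices) * train_fraction)
        train_idx.extend(indices[:cut])
        test_idx.extend(indices[cut:])
    return sorted(train_idx), sorted(test_idx)

# Made-up labels: 10 instances of class 0 and 20 of class 1.
labels = [0] * 10 + [1] * 20
train_idx, test_idx = stratified_split(labels)
```

Splitting per class rather than over the whole set means a rare sign class cannot end up entirely in the training part or entirely in the test part by chance.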

amitkparekh commented 4 years ago

Interestingly, it’s very sure about class 2 versus class 1. I assume this has something to do with the fact it’s got a clear line/“thing” over the middle of the sign whereas it’s probably easier to confuse the number 60 and the number 80.

Also, it’s not good at getting the “turn right down” (class 8) signs, but it’s slightly better at the “turn left down” (class 9) signs? I wonder why that is when they are clearly pointing two different directions?

Looking at the true values for class 8 and what it is predicting, it fairly often thinks that the turn right down signs could be a lifter sign. This is interesting because the turn right down sign is an arrow from the top-left to bottom-right, whereas the lifter has the mark from top-right to bottom-left. Unless the lifter or the arrows are different?

I think this definitely needs looking into because I would’ve assumed that it would be predicting class 9 as class 3 more, but not class 8 as class 3.
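To make this kind of check less eyeball-driven, here is a rough sketch that pulls per-class recall and the largest off-diagonal cells out of a confusion matrix. It assumes rows are the true class and columns the predicted class; the function names and the 3-class `toy` matrix are made up and are NOT our real output.

```python
def per_class_recall(matrix):
    """matrix[i][j] = number of class-i instances predicted as class j."""
    recalls = []
    for i, row in enumerate(matrix):
        total = sum(row)
        # Recall for class i: correctly predicted / all true instances of i.
        recalls.append(row[i] / total if total else 0.0)
    return recalls

def top_confusions(matrix, k=3):
    """Return the k largest off-diagonal cells as (true, predicted, count)."""
    cells = [
        (true, pred, count)
        for true, row in enumerate(matrix)
        for pred, count in enumerate(row)
        if true != pred and count > 0
    ]
    return sorted(cells, key=lambda cell: cell[2], reverse=True)[:k]

# Made-up 3-class confusion matrix for illustration only.
toy = [
    [50, 8, 2],
    [5, 40, 5],
    [12, 1, 7],
]
recalls = per_class_recall(toy)
worst = top_confusions(toy, k=2)
```

Running the same two functions on the real matrix would list which (true, predicted) pairs dominate the errors, which should show directly whether class 8 really goes to class 3 more often than class 9 does.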

trz-maier commented 4 years ago

@amitkparekh I am hoping these values will improve after we apply some of the image pre-processing you are working on. Nonetheless, it would be a good idea to sample some images to see where the model is mistaken. There are some traditional ways of improving Naive Bayes classification which I am already looking into, see #4. Further sub-tasks also ask us to employ a number of very specific methodologies in search of improvement, but documenting all of these discrepancies is part of the assignment, too.

I think the left-right issue may be related to the uneven distribution of these signs in the population.
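One quick way to check the distribution is to count instances per class and look at the resulting class priors, since Naive Bayes multiplies the per-feature likelihoods by the prior P(class) and a rare class starts at a disadvantage. A minimal sketch, with a hypothetical `class_priors` helper and made-up counts rather than our real numbers:

```python
from collections import Counter

def class_priors(labels):
    """Estimate P(class) as the relative frequency of each label."""
    counts = Counter(labels)
    total = len(labels)
    return {label: counts[label] / total for label in sorted(counts)}

# Made-up label counts for three classes, for illustration only.
labels = [8] * 10 + [9] * 30 + [3] * 60
priors = class_priors(labels)
```

If the real priors for classes 8 and 9 turn out to be very different, that would support the distribution explanation; if they are similar, the cause is more likely in the pixel features themselves.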

amitkparekh commented 4 years ago

From the assignment:

Record, compare, and analyse the classifier's accuracy on different classes (as given by the Weka Summary and the confusion matrix)

~~What exactly does the Weka Summary include? We need to make sure we are including everything for analyses.~~

amitkparekh commented 4 years ago

@trz-maier Given that everything for this specific task is done and you've included the summary and confusion matrix above, is this task done?

trz-maier commented 4 years ago

it will be done when documented. let the owner decide on when to close a task
