Project Update 2 - Githubissues

Week Summary

This week, Jasmijn redid the pre-processing to better fit our needs. We now shuffle the data to randomize its order and split it during pre-processing, instead of separately in the machine learning notebook.
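
For illustration, the shuffle-then-split step could look roughly like the sketch below. It uses only the Python standard library. The function name `shuffle_and_split`, the 80/20 split ratio, the fixed seed, and the toy `reviews` list are all assumptions made for this example, not the actual code in our pre-processing notebook.

```python
import random


def shuffle_and_split(records, test_fraction=0.2, seed=42):
    """Shuffle a copy of `records` and split it into (train, test)."""
    rng = random.Random(seed)      # fixed seed keeps the split reproducible
    shuffled = list(records)       # copy so the caller's list is untouched
    rng.shuffle(shuffled)
    n_test = int(len(shuffled) * test_fraction)
    return shuffled[n_test:], shuffled[:n_test]


# Hypothetical toy data: (review text, class label) pairs with five classes.
reviews = [(f"review {i}", i % 5 + 1) for i in range(100)]
train, test = shuffle_and_split(reviews)
print(len(train), len(test))
```

Doing the split once in pre-processing means every later notebook works from the same train and test sets, which should make results easier to compare.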

Floor improved the machine learning notebook. With the Naive Bayes classifier, she reached an accuracy of 0.659. Compared with the accuracy expected from random guessing over five classes (1/5 = 0.2), this is reasonable, but we aim to improve it.
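
The 0.2 figure assumes uniform guessing over five classes. If the classes turn out to be unbalanced, always predicting the most frequent class gives a stricter baseline. A minimal sketch of both baselines follows. The `labels` list is hypothetical toy data and the helper names are ours; only the 0.659 accuracy comes from the notebook.

```python
from collections import Counter


def chance_baseline(labels):
    """Expected accuracy of uniform random guessing over the observed classes."""
    return 1 / len(set(labels))


def majority_baseline(labels):
    """Accuracy obtained by always predicting the most frequent class."""
    counts = Counter(labels)
    return max(counts.values()) / len(labels)


# Hypothetical, deliberately unbalanced label distribution over five classes.
labels = [1, 2, 3, 4, 5, 5, 5, 5, 5, 5]
reported_accuracy = 0.659  # Naive Bayes accuracy from the notebook

print("chance baseline:  ", chance_baseline(labels))
print("majority baseline:", majority_baseline(labels))
print("reported accuracy:", reported_accuracy)
```

On our real labels the majority baseline may be close to 0.2 or well above it, so it is worth computing before judging the classifier.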

Lanie added a PLMI measure for the co-occurrences of each class, as well as measures of average positivity and negativity per class. The positivity and negativity scores of each individual review were also calculated. We are going to plot those scores to see whether they are distinguishable per class and therefore usable for machine learning.
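
As a rough sketch of the PLMI idea, the code below assumes the common definition of positive local mutual information: the observed word-class count times the base-2 log of observed over expected count, with negative values clipped to zero. This definition, the function name `plmi_scores`, and the toy `docs` are assumptions for illustration and may differ from the actual implementation.

```python
import math
from collections import Counter


def plmi_scores(docs):
    """Return {(word, label): PLMI} for an iterable of (tokens, label) pairs."""
    joint = Counter()         # f(word, label)
    word_totals = Counter()   # f(word)
    class_totals = Counter()  # number of tokens seen in each class
    for tokens, label in docs:
        for tok in tokens:
            joint[(tok, label)] += 1
            word_totals[tok] += 1
            class_totals[label] += 1
    n = sum(joint.values())
    scores = {}
    for (word, label), observed in joint.items():
        # Expected count if word and class were independent.
        expected = word_totals[word] * class_totals[label] / n
        lmi = observed * math.log2(observed / expected)
        scores[(word, label)] = max(0.0, lmi)  # keep positive associations only
    return scores


# Hypothetical tokenized reviews with their class labels.
docs = [
    (["great", "food"], 5),
    (["great", "service"], 5),
    (["bad", "food"], 1),
]
scores = plmi_scores(docs)
for pair, value in sorted(scores.items(), key=lambda kv: -kv[1]):
    print(pair, round(value, 3))
```

Words with a high score for a class occur there more often than their overall frequency would predict, which is what makes them candidates for class-specific features.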

floorkouwenberg / TMCI_project

Project Update 2 #2