Error Analysis - Githubissues

Let's break down our data by category to examine which categories are performing poorly.

Method:
1) For each of the 13 categories: set aside a random 80% as training and the other 20% as testing.
2) Combine the 80% training sets for each category into one total training set.
3) Combine the 20% testing sets for each category into one total testing set.
4) Train on the 80% set you just built.
5) Test on the 20% set you just built.
For each category in this testing set, build a confusion matrix.
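The method above can be sketched roughly as follows. This is only a minimal illustration, not the project's actual code: the function names, the `min_entries` threshold for skipping small categories, and the one-vs-rest reading of "a confusion matrix per category" are all assumptions, and the classifier itself is left out.

```python
import random
from collections import defaultdict


def stratified_split(examples, train_frac=0.8, min_entries=5, seed=0):
    """Split (text, category) pairs 80/20 within each category, then pool the parts."""
    by_category = defaultdict(list)
    for text, category in examples:
        by_category[category].append((text, category))

    rng = random.Random(seed)
    train, test, skipped = [], [], []
    for category, items in sorted(by_category.items()):
        # min_entries is an assumed threshold; the issue does not give one.
        if len(items) < min_entries:
            skipped.append(category)
            continue
        items = items[:]
        rng.shuffle(items)
        cut = int(len(items) * train_frac)
        train.extend(items[:cut])   # steps 1 and 2: pooled training set
        test.extend(items[cut:])    # steps 1 and 3: pooled testing set
    return train, test, skipped


def per_category_confusion(gold, predicted):
    """One-vs-rest counts (tp, fp, fn, tn) for every category seen in gold."""
    matrices = {}
    for category in sorted(set(gold)):
        counts = {"tp": 0, "fp": 0, "fn": 0, "tn": 0}
        for g, p in zip(gold, predicted):
            if p == category:
                counts["tp" if g == category else "fp"] += 1
            else:
                counts["fn" if g == category else "tn"] += 1
        matrices[category] = counts
    return matrices
```

To use it, train whatever classifier is in use on `train`, predict a label for every item in `test`, and pass the gold and predicted labels to `per_category_confusion`. Categories returned in `skipped` are the ones with too few entries to split.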

Some categories don't have enough entries to train and test on, so we will have to skip those. Hopefully when the Alchemy stuff is done, we can try again :)

SvenAG / SNLP-Final-Project

Error Analysis #2