Closed: MichaelAquilina closed this issue 10 years ago
Naive Bayes (Multinomial) [[76 0] [ 2 72]] Accuracy: 0.986666666667 Precision: 1.0 Recall: 0.972972972973 F1 Measure: 0.986301369863
Naive Bayes (Bernoulli) [[69 6] [ 1 74]] Accuracy: 0.953333333333 Precision: 0.925 Recall: 0.986666666667 F1 Measure: 0.954838709677
Support Vector Machine (Linear) [[73 2] [ 1 74]] Accuracy: 0.98 Precision: 0.973684210526 Recall: 0.986666666667 F1 Measure: 0.980132450331
Support Vector Machine (Radial Basis Function) [[65 9] [ 0 76]] Accuracy: 0.94 Precision: 0.894117647059 Recall: 1.0 F1 Measure: 0.944099378882
Note that the linear kernel seems to outperform the RBF kernel, likely because we are already working in such a high-dimensional space that the classes are close to linearly separable.
Random Forest Classifier [[73 1] [ 5 71]] Accuracy: 0.96 Precision: 0.986111111111 Recall: 0.934210526316 F1 Measure: 0.959459459459
K Nearest Neighbors (K=5, metric='minkowski') [[20 57] [ 0 73]] Accuracy: 0.62 Precision: 0.561538461538 Recall: 1.0 F1 Measure: 0.71921182266
K Nearest Neighbors (K=5, metric='euclidean') [[20 55] [ 0 75]] Accuracy: 0.633333333333 Precision: 0.576923076923 Recall: 1.0 F1 Measure: 0.731707317073
Notice how KNN classifies one class perfectly (full recall) but frequently mislabels the other class as it. This could indicate decent performance with some fine-tuning, but support vector machines look like the better choice given how well they work out of the box.
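For reference, all four reported metrics can be recovered directly from each confusion matrix. A minimal sketch, assuming sklearn's convention (rows are true classes, columns are predicted classes, class 1 is the positive class):

```python
def summarize(cm):
    """Derive accuracy, precision, recall and F1 from a 2x2 confusion matrix.

    Assumes sklearn's layout: rows are true classes, columns are
    predicted classes, and class 1 is treated as the positive class.
    """
    (tn, fp), (fn, tp) = cm
    accuracy = (tp + tn) / (tp + tn + fp + fn)
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    f1 = 2 * precision * recall / (precision + recall)
    return accuracy, precision, recall, f1

# The Multinomial Naive Bayes matrix above reproduces its reported scores:
print(summarize([[76, 0], [2, 72]]))
# -> (0.9866..., 1.0, 0.9729..., 0.9863...)
```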
These techniques may prove redundant if the plans for one-class classification prove effective.
Current performance is being tested with a linear Support Vector Machine. It would be smart to evaluate the performance of several classifiers on the dataset:
These are all supported within sklearn and share the same interface as SVC, so evaluating their performance should be straightforward.
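A sketch of that shared fit/predict interface, using a synthetic dataset as a stand-in for the real document vectors (note: recent sklearn versions import `train_test_split` from `sklearn.model_selection`; older ones used `sklearn.cross_validation`):

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.metrics import f1_score
from sklearn.model_selection import train_test_split
from sklearn.naive_bayes import BernoulliNB
from sklearn.neighbors import KNeighborsClassifier
from sklearn.svm import SVC

# Synthetic stand-in for the real feature vectors.
X, y = make_classification(n_samples=600, n_features=20,
                           n_informative=10, random_state=0)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, random_state=0)

classifiers = {
    "SVM (linear)": SVC(kernel="linear"),
    "SVM (RBF)": SVC(kernel="rbf"),
    "Naive Bayes (Bernoulli)": BernoulliNB(),
    "Random Forest": RandomForestClassifier(random_state=0),
    "KNN (k=5)": KNeighborsClassifier(n_neighbors=5),
}

results = {}
for name, clf in classifiers.items():
    clf.fit(X_train, y_train)  # identical interface for every estimator
    results[name] = f1_score(y_test, clf.predict(X_test))
    print(name, results[name])
```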
The results above are for 300 Python vs 300 Science documents (simple binary classification), using one round of `train_test_split`.
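That single-round evaluation protocol can be sketched as follows. Synthetic data stands in for the 300 + 300 labelled documents, and a linear SVC is used since that is the current baseline:

```python
from sklearn.datasets import make_classification
from sklearn.metrics import (accuracy_score, confusion_matrix,
                             f1_score, precision_score, recall_score)
from sklearn.model_selection import train_test_split
from sklearn.svm import SVC

# Synthetic stand-in for the 300 Python vs 300 Science documents.
X, y = make_classification(n_samples=600, n_features=20,
                           n_informative=10, random_state=1)

# One round of train_test_split: hold out 25% of the data for testing.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, random_state=1)

clf = SVC(kernel="linear").fit(X_train, y_train)
pred = clf.predict(X_test)

print(confusion_matrix(y_test, pred))
print("Accuracy:", accuracy_score(y_test, pred))
print("Precision:", precision_score(y_test, pred))
print("Recall:", recall_score(y_test, pred))
print("F1 Measure:", f1_score(y_test, pred))
```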
Initial tests show that the two best-performing classifiers are:
These correspond to results posted in other papers, which is encouraging.