ibnesayeed closed 7 years ago
looks pretty good so far
Now we are generating a k-fold validation accuracy report that looks like this:
------------------ Stats ------------------
Run Total Correct Incorrect Accuracy
-------------------------------------------
1 500 486 14 0.97200
2 500 500 0 1.00000
3 500 499 1 0.99800
4 500 495 5 0.99000
5 500 500 0 1.00000
6 500 499 1 0.99800
7 500 498 2 0.99600
8 500 499 1 0.99800
9 500 498 2 0.99600
10 500 498 2 0.99600
-------------------------------------------
All 5000 4972 28 0.99440
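For readers unfamiliar with how a report like the one above comes together, here is a minimal sketch of k-fold cross-validation in plain Ruby. It is not the code from this PR: `k_fold_accuracy`, `summarize`, and the block-based classifier factory are hypothetical names chosen for illustration, and dropping leftover samples when the data does not divide evenly into k folds is an assumption.

```ruby
# Illustrative sketch only; the method names are hypothetical, not the PR's API.
# `samples` is an array of [label, text] pairs. The block receives the training
# pairs and must return a callable that maps a text to a predicted label.
def k_fold_accuracy(samples, k)
  fold_size = samples.size / k
  raise ArgumentError, "need at least #{k} samples" if fold_size.zero?

  # Assumption: samples that do not fit into k equal folds are dropped.
  folds = samples.first(fold_size * k).each_slice(fold_size).to_a

  folds.each_index.map do |i|
    # Train on every fold except the i-th, then test on the i-th.
    train = (folds[0...i] + folds[(i + 1)..-1]).flatten(1)
    predict = yield(train)
    correct = folds[i].count { |label, text| predict.call(text) == label }
    { total: fold_size, correct: correct, incorrect: fold_size - correct }
  end
end

# Adds per-run accuracy plus an accumulated "All" row.
def summarize(runs)
  total = runs.sum { |r| r[:total] }
  correct = runs.sum { |r| r[:correct] }
  rows = runs.each_with_index.map do |r, i|
    r.merge(run: (i + 1).to_s, accuracy: r[:correct].fdiv(r[:total]))
  end
  rows << { run: "All", total: total, correct: correct,
            incorrect: total - correct, accuracy: correct.fdiv(total) }
end

# Toy usage: a "classifier" that always predicts the majority training label.
samples = [[:ham, "a"]] * 8 + [[:spam, "b"]] * 2
runs = k_fold_accuracy(samples, 5) do |train|
  majority = train.group_by(&:first).max_by { |_, pairs| pairs.size }.first
  ->(_text) { majority }
end

summarize(runs).each do |row|
  puts format("%-4s %6d %8d %10d %9.5f",
              row[:run], row[:total], row[:correct], row[:incorrect], row[:accuracy])
end
```

Dropping the remainder would be consistent with the 3-fold run later in this thread, where three runs of 1666 samples add up to 4998, but that is an inference from the output rather than something stated here.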
Now printing the confusion matrix over the total sample set, along with the accuracy stats of the accumulated and individual runs of k-fold validation. The code can also print the confusion matrix of each individual run, but that would be overwhelming output. Many methods are defined so that they produce meaningful output for one or more conf_mat objects passed.
The code is written with multi-class analysis in mind (not just binary). That's why we are only printing the confusion matrix and not the confusion table (the one that has TP/TN/FP/FN stats), as the latter would require the classes to be binary and some way to tell which class is considered positive. Perhaps we can provide a parameter so that the user can name the positive class for binary data, and then conditionally generate more statistics. However, we can still calculate precision and recall for each class without any supplementary information (that will be my next task).
$ rake validate
------------------ Stats ------------------
Run Total Correct Incorrect Accuracy
-------------------------------------------
1 500 486 14 0.97200
2 500 499 1 0.99800
3 500 499 1 0.99800
4 500 498 2 0.99600
5 500 496 4 0.99200
6 500 499 1 0.99800
7 500 500 0 1.00000
8 500 499 1 0.99800
9 500 497 3 0.99400
10 500 499 1 0.99800
-------------------------------------------
All 5000 4972 28 0.99440
---------------- Confusion Matrix -----------------
Predicted ->          Ham        Spam       Total
---------------------------------------------------
Ham                  4307          20        4327
Spam                    8         665         673
---------------------------------------------------
Total                4315         685        5000
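As a rough illustration of the multi-class design described above, the sketch below builds a confusion matrix as a nested hash keyed by actual class and then predicted class, and merges any number of such matrices into an accumulated one. `confusion_matrix` and `merge_matrices` are hypothetical helpers, not the methods in this PR, and the nested-hash layout is only an assumption about one convenient representation.

```ruby
# Illustrative sketch only; helper names and the data layout are assumptions.
# Builds { actual => { predicted => count } } from [actual, predicted] pairs.
# Nothing here assumes two classes, so it works for any number of labels.
def confusion_matrix(pairs)
  matrix = Hash.new { |hash, actual| hash[actual] = Hash.new(0) }
  pairs.each { |actual, predicted| matrix[actual][predicted] += 1 }
  matrix
end

# Sums one or more matrices cell by cell, e.g. one matrix per validation run.
def merge_matrices(*matrices)
  merged = Hash.new { |hash, actual| hash[actual] = Hash.new(0) }
  matrices.each do |matrix|
    matrix.each do |actual, row|
      row.each { |predicted, count| merged[actual][predicted] += count }
    end
  end
  merged
end

run1 = confusion_matrix([[:ham, :ham], [:ham, :spam], [:spam, :spam]])
run2 = confusion_matrix([[:ham, :ham], [:spam, :ham], [:promo, :promo]])
merged = merge_matrices(run1, run2)

merged.each do |actual, row|
  puts "#{actual}: #{row.inspect}"
end
```

Accepting a splat of matrices keeps the single-run and accumulated cases on the same code path, which mirrors the idea of methods that work on one or more matrix objects.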
I think I have an idea: we can report stats for each class as the positive class. This will be a one-versus-all situation repeated for all classes.
I'll defer to your judgement here, as this is a bit out of my wheelhouse.
Now reporting the confusion matrix with various derived stats for each class, treating each one as the positive class in turn. The code is refactored so that it can be reused if one knows the positive class and wants to generate the report only for that class.
$ rake validate
------------------ Stats ------------------
Run Total Correct Incorrect Accuracy
-------------------------------------------
1 500 485 15 0.97000
2 500 497 3 0.99400
3 500 497 3 0.99400
4 500 497 3 0.99400
5 500 499 1 0.99800
6 500 497 3 0.99400
7 500 500 0 1.00000
8 500 499 1 0.99800
9 500 499 1 0.99800
10 500 498 2 0.99600
-------------------------------------------
All 5000 4968 32 0.99360
---------------- Confusion Matrix -----------------
Predicted ->          Ham        Spam       Total
---------------------------------------------------
Ham                  4305          22        4327
Spam                   10         663         673
---------------------------------------------------
Total                4315         685        5000
# Positive class: Ham
Total population : 5000
Condition positive : 4327
Condition negative : 673
True positive : 4305
True negative : 663
False positive : 10
False negative : 22
Prevalence : 0.8654
Specificity : 0.9851411589895989
Recall : 0.9949156459440721
Precision : 0.9976825028968713
Accuracy : 0.9936
F1 score : 0.9962971534367044
# Positive class: Spam
Total population : 5000
Condition positive : 673
Condition negative : 4327
True positive : 663
True negative : 4305
False positive : 22
False negative : 10
Prevalence : 0.1346
Specificity : 0.9949156459440721
Recall : 0.9851411589895989
Precision : 0.9678832116788321
Accuracy : 0.9936
F1 score : 0.9764359351988218
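The one-versus-all derivation behind the per-class reports above can be sketched as follows. `one_vs_rest` is a hypothetical helper written for this illustration (the PR's actual method names are not shown in this thread), and the nested hash of actual class to predicted-class counts is an assumed representation. The figures fed in are the ones from the confusion matrix above.

```ruby
# Illustrative sketch only; the helper name and data layout are assumptions.
# Treats `positive` as the positive class and every other class as negative.
# `matrix` has the shape { actual => { predicted => count } }.
def one_vs_rest(matrix, positive)
  total = matrix.values.sum { |row| row.values.sum }
  condition_positive = matrix.fetch(positive).values.sum
  tp = matrix[positive].fetch(positive, 0)
  fn = condition_positive - tp
  # Everything predicted as `positive`, minus the correctly predicted part.
  fp = matrix.values.sum { |row| row.fetch(positive, 0) } - tp
  tn = total - tp - fn - fp

  {
    total_population: total,
    condition_positive: condition_positive,
    condition_negative: total - condition_positive,
    true_positive: tp, true_negative: tn,
    false_positive: fp, false_negative: fn,
    prevalence: condition_positive.fdiv(total),
    specificity: tn.fdiv(tn + fp),
    recall: tp.fdiv(tp + fn),
    precision: tp.fdiv(tp + fp),
    accuracy: (tp + tn).fdiv(total),
    f1_score: (2 * tp).fdiv(2 * tp + fp + fn)
  }
end

matrix = {
  "Ham"  => { "Ham" => 4305, "Spam" => 22 },
  "Spam" => { "Ham" => 10,   "Spam" => 663 }
}

matrix.each_key do |klass|
  puts "# Positive class: #{klass}"
  one_vs_rest(matrix, klass).each { |name, value| puts "#{name} : #{value}" }
end
```

Because the negative side is computed as "everything else", the same method applies unchanged to datasets with more than two classes. Ratios whose denominator is zero come out as NaN with `fdiv`, which a real report would probably want to handle explicitly.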
----------------------- Confusion Matrix ----------
Predicted ->          Ham        Spam       Total
---------------------------------------------------
Ham                  4307          20        4327
Spam                    6         667         673
---------------------------------------------------
Total                4313         687        5000
The confusion matrix now also reports class-wise precision and recall in the last row and last column, respectively. Although not tested yet, all the functionality implemented so far should work equally well on multi-class datasets.
----------------------- Confusion Matrix -----------------------
Predicted ->          Ham        Spam       Total      Recall
----------------------------------------------------------------
Ham                  4307          20        4327     0.99538
Spam                    6         667         673     0.99108
----------------------------------------------------------------
Total                4313         687        5000
Precision         0.99861     0.97089
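To make the extra row and column concrete: recall for a class is its diagonal cell divided by its row total, and precision is the diagonal cell divided by its column total. The sketch below shows that calculation with two hypothetical helpers, `class_wise_recall` and `class_wise_precision`, on an assumed nested-hash matrix. It also assumes every class appears at least once as an actual class, so the row keys cover all classes.

```ruby
# Illustrative sketch only; helper names and the data layout are assumptions.
# `matrix` has the shape { actual => { predicted => count } }.

# Recall per class: diagonal cell over the row total (last column of the report).
def class_wise_recall(matrix)
  matrix.map do |actual, row|
    [actual, row.fetch(actual, 0).fdiv(row.values.sum)]
  end.to_h
end

# Precision per class: diagonal cell over the column total (last row of the report).
def class_wise_precision(matrix)
  matrix.keys.map do |klass|
    predicted_total = matrix.values.sum { |row| row.fetch(klass, 0) }
    [klass, matrix[klass].fetch(klass, 0).fdiv(predicted_total)]
  end.to_h
end

matrix = {
  "Ham"  => { "Ham" => 4307, "Spam" => 20 },
  "Spam" => { "Ham" => 6,    "Spam" => 667 }
}

recall = class_wise_recall(matrix)
precision = class_wise_precision(matrix)

matrix.each_key do |klass|
  puts format("%-5s recall=%.5f precision=%.5f", klass, recall[klass], precision[klass])
end
```

Neither helper needs to know which class is "positive", which is why these two figures can be reported for every class without any extra input from the user.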
This is what a typical validation task result now looks like.
$ rake validate
/usr/local/bin/ruby -w -I"lib:lib" -I"/usr/local/bundle/gems/rake-12.0.0/lib" "/usr/local/bundle/gems/rake-12.0.0/lib/rake/rake_test_loader.rb" "test/validators/classifier_validation.rb"
# ClassifierValidation
===================== lsi_classifier_5_fold_cross_validate =====================
TODO: LSI is not validatable until all of the [:train, :classify, :categories] methods are implemented!
--------------------------------------------------------------------------------
================ bayes_classifier_10_fold_cross_validate_memory ================
------------------ Stats ------------------
Run Total Correct Incorrect Accuracy
-------------------------------------------
1 500 484 16 0.96800
2 500 489 11 0.97800
3 500 490 10 0.98000
4 500 484 16 0.96800
5 500 487 13 0.97400
6 500 489 11 0.97800
7 500 489 11 0.97800
8 500 488 12 0.97600
9 500 488 12 0.97600
10 500 491 9 0.98200
-------------------------------------------
All 5000 4879 121 0.97580
----------------------- Confusion Matrix -----------------------
Predicted ->          Ham        Spam       Total      Recall
----------------------------------------------------------------
Ham                  4230          97        4327     0.97758
Spam                   24         649         673     0.96434
----------------------------------------------------------------
Total                4254         746        5000
Precision         0.99436     0.86997
# Positive class: Ham
Total population : 5000
Condition positive : 4327
Condition negative : 673
True positive : 4230
True negative : 649
False positive : 24
False negative : 97
Prevalence : 0.8654
Specificity : 0.9643387815750372
Recall : 0.9775826207534088
Precision : 0.9943582510578279
Accuracy : 0.9758
F1 score : 0.9858990793613798
# Positive class: Spam
Total population : 5000
Condition positive : 673
Condition negative : 4327
True positive : 649
True negative : 4230
False positive : 97
False negative : 24
Prevalence : 0.1346
Specificity : 0.9775826207534088
Recall : 0.9643387815750372
Precision : 0.8699731903485255
Accuracy : 0.9758
F1 score : 0.9147286821705426
--------------------------------------------------------------------------------
================= bayes_classifier_3_fold_cross_validate_redis =================
------------------ Stats ------------------
Run Total Correct Incorrect Accuracy
-------------------------------------------
1 1666 1630 36 0.97839
2 1666 1622 44 0.97359
3 1666 1611 55 0.96699
-------------------------------------------
All 4998 4863 135 0.97299
----------------------- Confusion Matrix -----------------------
Predicted ->          Ham        Spam       Total      Recall
----------------------------------------------------------------
Ham                  4212         113        4325     0.97387
Spam                   22         651         673     0.96731
----------------------------------------------------------------
Total                4234         764        4998
Precision          0.9948     0.85209
# Positive class: Ham
Total population : 4998
Condition positive : 4325
Condition negative : 673
True positive : 4212
True negative : 651
False positive : 22
False negative : 113
Prevalence : 0.8653461384553821
Specificity : 0.9673105497771174
Recall : 0.9738728323699422
Precision : 0.9948039678790742
Accuracy : 0.9729891956782714
F1 score : 0.9842271293375394
# Positive class: Spam
Total population : 4998
Condition positive : 673
Condition negative : 4325
True positive : 651
True negative : 4212
False positive : 113
False negative : 22
Prevalence : 0.13465386154461784
Specificity : 0.9738728323699422
Recall : 0.9673105497771174
Precision : 0.8520942408376964
Accuracy : 0.9729891956782714
F1 score : 0.906054279749478
--------------------------------------------------------------------------------
Finished in 24.51805s
I feel it is quite full-featured now. We still need unit tests for the individual methods of the module, RDoc, and user documentation, but those can be handled in a separate PR. @Ch4s3 please feel free to merge it.
@marciovicente, could you please have a look at the reports in the last message and see if anything important is missing or wrong?
@Ch4s3 I consider this one done from my side. I have added exhaustive user documentation (#145), so RDoc is less important for this one, though we can add it in a separate PR. Unit tests will also be added separately, as this has already become a big pile of commits and file changes.
@ibnesayeed It's a nice report! Seems like a Weka output 👏 Looks awesome to me! ✅
It is still a work in progress as per #71...