evanpurkhiser / CS-Karat-Sleuth

A simplistic spam heuristics tool written in the Ruby programming language – Fall 2013 AI
MIT License

Re-do final evaluation of results given even amounts of spam/ham #6

Closed hmm34 closed 10 years ago

hmm34 commented 10 years ago

Test the classifier using

Store results here, given the data sets and number of emails used.

hmm34 commented 10 years ago

Running the 50-50 spam/ham data set with 5K spam and 5K ham:

Emails Classified: 10000/10000

     E-mail Confusion Matrix

                    Ham               Spam
              ---------------   ---------------
     Ham  |      4996 (99.9%)          4 (0.1%)
     Spam |      1988 (39.8%)      3012 (60.2%)
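
The percentages in the matrix are each count divided by its row total. A minimal sketch of that arithmetic follows, assuming rows are the actual class and columns are the predicted class (the thread does not state the orientation). `row_percentages` is a hypothetical helper for illustration, not a function from CS-Karat-Sleuth.

```ruby
# Hypothetical helper, not taken from the CS-Karat-Sleuth source.
# Assumption: outer keys are the actual class, inner keys the predicted class.
def row_percentages(matrix)
  matrix.transform_values do |row|
    total = row.values.sum
    # Each cell becomes its share of the row total, rounded to one decimal.
    row.transform_values { |count| (100.0 * count / total).round(1) }
  end
end

# Counts copied from the 50-50 run above.
matrix = {
  ham:  { ham: 4996, spam: 4 },
  spam: { ham: 1988, spam: 3012 }
}

row_percentages(matrix).each do |actual, row|
  puts format('%-4s | ham %5.1f%% | spam %5.1f%%', actual, row[:ham], row[:spam])
end
```
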
hmm34 commented 10 years ago

Running the 80-20 spam/ham data set with 5,333 spam and 1,333 ham:

Emails Classified: 6666/6666

     E-mail Confusion Matrix

                    Ham               Spam
              ---------------   ---------------
     Ham  |      1332 (99.9%)          1 (0.1%)
     Spam |      2489 (46.7%)      2844 (53.3%)
hmm34 commented 10 years ago

Both of the runs above used unknown/spam and easy/ham.

hmm34 commented 10 years ago

Running the 90-10-hard data set with 5,333 spam and 468 hard/ham.

Emails Classified: 5801/5801

     E-mail Confusion Matrix

                    Ham               Spam
              ---------------   ---------------
     Ham  |       370 (79.1%)        98 (20.9%)
     Spam |        95 (1.8%)       5238 (98.2%)
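
To compare the three runs on a single number, overall accuracy can be derived from each matrix as correct classifications over total emails. The sketch below assumes rows are the actual class and columns the predicted class. The `Run` struct and its field names are hypothetical and exist only for this illustration.

```ruby
# Hypothetical summary helper; the counts are copied from the three matrices
# in this thread. Assumption: rows are actual classes, columns are predictions.
Run = Struct.new(:name, :ham_as_ham, :ham_as_spam, :spam_as_ham, :spam_as_spam) do
  # Total number of emails classified in the run.
  def total
    ham_as_ham + ham_as_spam + spam_as_ham + spam_as_spam
  end

  # Fraction of all emails that were classified correctly.
  def accuracy
    (ham_as_ham + spam_as_spam).fdiv(total)
  end

  # Fraction of actual spam that was labeled as spam.
  def spam_recall
    spam_as_spam.fdiv(spam_as_ham + spam_as_spam)
  end
end

RUNS = [
  Run.new('50-50', 4996, 4, 1988, 3012),
  Run.new('80-20', 1332, 1, 2489, 2844),
  Run.new('90-10-hard', 370, 98, 95, 5238)
].freeze

RUNS.each do |run|
  puts format('%-10s total %5d  accuracy %.1f%%  spam recall %.1f%%',
              run.name, run.total, run.accuracy * 100, run.spam_recall * 100)
end
```

Under these assumptions, overall accuracy works out to roughly 80.1% for the 50-50 run, 62.6% for the 80-20 run, and 96.7% for the 90-10-hard run.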
hmm34 commented 10 years ago

Classified, and closing.