Re-do final evaluation of results given even amounts of spam/ham

evanpurkhiser / CS-Karat-Sleuth

A simplistic spam heuristics tool written in the Ruby programming language – Fall 2013 AI

MIT License

Re-do final evaluation of results given even amounts of spam/ham #6

Closed: hmm34 closed this issue 10 years ago

hmm34 commented 10 years ago

Test the classifier on two mixes of training and testing data:

50% spam and 50% ham, as suggested
80% spam and 20% ham, to reflect the likely ratio in practice

Record the results here, along with the data sets and the number of emails used.
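One way to assemble the two mixes might look like the sketch below. It assumes the emails are available as plain arrays of file paths; `build_mix`, its parameters, and the placeholder paths are hypothetical and are not taken from the project's code.

```ruby
# Hypothetical helper: draw a sample with a fixed spam/ham ratio.
# spam, ham:  arrays of email paths (assumed layout, not the project's API)
# total:      number of emails wanted in the mix
# spam_share: fraction of the mix that should be spam (0.5 or 0.8 here)
def build_mix(spam, ham, total:, spam_share:, seed: 42)
  spam_count = (total * spam_share).round
  ham_count  = total - spam_count

  if spam_count > spam.size || ham_count > ham.size
    raise ArgumentError, 'not enough emails for the requested mix'
  end

  rng = Random.new(seed) # fixed seed so a run can be repeated
  {
    spam: spam.sample(spam_count, random: rng),
    ham:  ham.sample(ham_count, random: rng)
  }
end

# Placeholder paths, for illustration only.
spam_paths = (1..100).map { |i| "spam/#{i}.eml" }
ham_paths  = (1..100).map { |i| "ham/#{i}.eml" }

even   = build_mix(spam_paths, ham_paths, total: 100, spam_share: 0.5)
skewed = build_mix(spam_paths, ham_paths, total: 100, spam_share: 0.8)

puts "50/50 mix: #{even[:spam].size} spam, #{even[:ham].size} ham"
puts "80/20 mix: #{skewed[:spam].size} spam, #{skewed[:ham].size} ham"
```

Sampling without replacement and with a fixed seed keeps each mix free of duplicate emails and makes a run repeatable.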

hmm34 commented 10 years ago

Running the 50-50 spam/ham data set with 5,000 spam and 5,000 ham:

Emails Classified: 10000/10000

     E-mail Confusion Matrix

                                    Ham                                Spam    
         .                    ---------------                  ---------------
     Ham  |                   4996 (99.9%)                        4 (0.1%)
     Spam |                   1988 (39.8%)                    3012 (60.2%)
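The percentages in the matrix can be re-derived from the raw counts, which also gives an overall accuracy figure. The sketch below reads rows as the actual class and columns as the predicted class; that reading is an assumption, based on each row summing to 5,000. The `row_percentages` helper is hypothetical and not part of the tool.

```ruby
# Counts copied from the 50-50 run above.
# Rows are read as the actual class, columns as the predicted class;
# that reading is an assumption based on each row summing to 5,000.
matrix = {
  ham:  { ham: 4996, spam: 4 },
  spam: { ham: 1988, spam: 3012 }
}

# Share of each actual class assigned to each predicted class, in percent.
def row_percentages(row)
  total = row.values.sum
  row.transform_values { |count| (100.0 * count / total).round(1) }
end

total    = matrix.values.sum { |row| row.values.sum }
correct  = matrix[:ham][:ham] + matrix[:spam][:spam]
accuracy = (100.0 * correct / total).round(1)

matrix.each { |actual, row| puts "#{actual}: #{row_percentages(row)}" }
puts "overall accuracy: #{accuracy}%"
```

With these counts, 8,008 of the 10,000 emails land on the diagonal, so overall accuracy is about 80.1%, and nearly all of the errors are spam classified as ham.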

hmm34 commented 10 years ago

Running the 80-20 spam/ham data set with 5,333 spam and 1,333 ham:

Emails Classified: 6666/6666

     E-mail Confusion Matrix

                                  Ham                             Spam    
          .                ---------------                 ---------------
     Ham  |                   1332 (99.9%)                        1 (0.1%)
     Spam |                   2489 (46.7%)                    2844 (53.3%)

hmm34 commented 10 years ago

Both of the runs above used unknown/spam and easy/ham.

hmm34 commented 10 years ago

Running the 90-10-hard data set with 5,333 spam and 468 hard/ham:

Emails Classified: 5801/5801

     E-mail Confusion Matrix

                                  Ham                             Spam    
          .                ---------------                 ---------------
     Ham  |                    370 (79.1%)                      98 (20.9%)
     Spam |                      95 (1.8%)                    5238 (98.2%)
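To compare the three runs on one footing, spam precision and recall can be derived from the counts reported in this thread. This is only a sketch: it treats spam as the positive class and rows as the actual class, both of which are assumptions, and the `Run` struct is hypothetical rather than part of the tool.

```ruby
# Counts copied from the three confusion matrices in this thread, treating
# spam as the positive class and rows as the actual class (an assumption).
Run = Struct.new(:name, :spam_as_spam, :spam_as_ham, :ham_as_spam) do
  # Of everything flagged as spam, the percentage that really was spam.
  def precision
    (100.0 * spam_as_spam / (spam_as_spam + ham_as_spam)).round(1)
  end

  # Of all actual spam, the percentage that was caught.
  def recall
    (100.0 * spam_as_spam / (spam_as_spam + spam_as_ham)).round(1)
  end
end

runs = [
  Run.new('50-50',      3012, 1988, 4),
  Run.new('80-20',      2844, 2489, 1),
  Run.new('90-10-hard', 5238,   95, 98)
]

runs.each do |run|
  puts format('%-11s precision %5.1f%%  recall %5.1f%%',
              run.name, run.precision, run.recall)
end
```

Read this way, the two easy/ham runs almost never flag ham as spam (0.1%) but miss a large share of the spam, while the hard/ham run catches 98.2% of the spam at the cost of flagging 20.9% of the ham.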

hmm34 commented 10 years ago

Classified, and closing.