MichaelAquilina / SpamFilter

Classification of emails using machine learning and natural language processing techniques in Java

Re-add Stopword Removal #40

Closed MichaelAquilina closed 10 years ago

MichaelAquilina commented 10 years ago

From analysing the data, I am noticing that a lot of known stopwords are included among the selected features. If time permits, we should re-introduce stopword removal during the text processing phase.
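For reference, a minimal sketch of what a stopword removal step in the text processing phase could look like. This is not the code that was previously in the project: the class name `StopwordFilter`, the method `removeStopwords`, and the tiny stopword list are all hypothetical and only there for illustration.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Locale;
import java.util.Set;

// Hypothetical helper, not part of the SpamFilter codebase.
class StopwordFilter {
    // Illustrative list only; a real stopword list would be much longer.
    private static final Set<String> STOPWORDS = new HashSet<>(Arrays.asList(
            "a", "an", "and", "the", "is", "are", "of", "to", "in", "for"));

    // Returns a new list containing only the tokens that are not stopwords.
    // Matching is case-insensitive; the original token casing is preserved.
    static List<String> removeStopwords(List<String> tokens) {
        List<String> kept = new ArrayList<>();
        for (String token : tokens) {
            if (!STOPWORDS.contains(token.toLowerCase(Locale.ROOT))) {
                kept.add(token);
            }
        }
        return kept;
    }

    public static void main(String[] args) {
        List<String> tokens = Arrays.asList("The", "offer", "is", "free", "for", "you");
        System.out.println(removeStopwords(tokens)); // prints [offer, free, you]
    }
}
```

Running this before feature selection would keep stopwords out of the candidate features, at the cost of changing the token counts that any downstream classifier sees.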

xhochy commented 10 years ago

NO NO NO

This will mess up all my Naive Bayes feature selection

MichaelAquilina commented 10 years ago

Lol dw, I understand! I only put this issue up to make myself aware that the problem exists. Wasn't going to implement it given the time constraints, dw :)