itdxer / naive-bayes

Naive Bayes Text Classifier
MIT License

Text classifier based on Naive Bayes.

Installation

$ pip install naive-bayes

Usage example

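from naivebayes import NaiveBayesTextClassifier

# categories_list, stopwords_list, train_docs, train_classes and test_docs
# are placeholders: define them from your own data before running this.
classifier = NaiveBayesTextClassifier(
    categories=categories_list,
    stop_words=stopwords_list
)
# Fit on the training documents and their class labels.
classifier.train(train_docs, train_classes)
# Predict a class for each test document.
predicted_classes = classifier.classify(test_docs)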

NaiveBayesTextClassifier is a simple wrapper around scikit-learn's CountVectorizer class. You can pass any argument that CountVectorizer supports. For more information, see the official scikit-learn documentation.
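
To give a feel for what happens behind such a wrapper, here is a minimal standard-library sketch of the general technique: count words per class (the bag-of-words step that CountVectorizer performs in scikit-learn), then pick the class with the highest Naive Bayes score. This is not the package's actual code. The names `tokenize`, `train_nb` and `classify_nb`, the whitespace tokenizer, the Laplace smoothing and the toy data are all assumptions made for illustration.

```python
import math
from collections import Counter, defaultdict


def tokenize(text, stop_words=frozenset()):
    """Lowercase, split on whitespace, and drop stop words (hypothetical helper)."""
    return [w for w in text.lower().split() if w not in stop_words]


def train_nb(docs, labels, stop_words=frozenset()):
    """Count words per class and documents per class (bag-of-words)."""
    word_counts = defaultdict(Counter)  # class -> word -> count
    class_counts = Counter(labels)      # class -> number of documents
    for doc, label in zip(docs, labels):
        word_counts[label].update(tokenize(doc, stop_words))
    vocab = {w for counts in word_counts.values() for w in counts}
    return word_counts, class_counts, vocab


def classify_nb(model, doc, stop_words=frozenset()):
    """Return the class with the highest log posterior, using Laplace smoothing."""
    word_counts, class_counts, vocab = model
    total_docs = sum(class_counts.values())
    best_label, best_score = None, -math.inf
    for label, n_docs in class_counts.items():
        score = math.log(n_docs / total_docs)  # log prior
        denom = sum(word_counts[label].values()) + len(vocab)
        for word in tokenize(doc, stop_words):
            if word in vocab:  # ignore words never seen in training
                score += math.log((word_counts[label][word] + 1) / denom)
        if score > best_score:
            best_label, best_score = label, score
    return best_label


# Toy data, invented for this sketch only.
train_docs = ["great fun movie", "boring slow movie", "fun and great acting"]
train_classes = ["pos", "neg", "pos"]
model = train_nb(train_docs, train_classes, stop_words={"and"})
prediction = classify_nb(model, "great acting", stop_words={"and"})
```

The sketch works in log space to avoid underflow when many small probabilities are multiplied, and it skips words that never appeared in training. A real pipeline would normally rely on scikit-learn's tokenization and vectorization rather than `str.split`.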

More examples

See the examples folder for more examples. Before running them, install the requirements listed in that folder.

Clone the repository from GitHub

$ git clone git@github.com:itdxer/naive-bayes.git
$ cd naive-bayes/examples
$ pip install -r requirements.txt

Then run one of the examples

Usenet 20 Newsgroups

$ python 20newsgroup

Kaggle IMDB reviews competition

$ python imdb_reviews