This repository contains a Naive Bayes classifier for document classification, completed for CSCI 8360, Data Science Practicum, at the University of Georgia in Spring 2018.
The total number of distinct words used across the whole training data, regardless of label or document. It is best to wrap this value with sc.broadcast() once so that it does not have to be passed in every time.
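As a minimal sketch of how this value might be computed and broadcast, assuming the training documents are plain strings tokenized by lowercasing and splitting on whitespace: the function names `distinct_word_count` and `broadcast_vocab_size` are hypothetical and not taken from this repository, and `sc` is assumed to be an existing PySpark SparkContext.

```python
def distinct_word_count(documents):
    """Count distinct words across all documents, ignoring labels.

    Assumption: words are obtained by lowercasing and whitespace splitting.
    """
    vocabulary = set()
    for text in documents:
        vocabulary.update(text.lower().split())
    return len(vocabulary)


def broadcast_vocab_size(sc, documents_rdd):
    """Compute the same count on an RDD of strings and broadcast it.

    Assumption: `sc` is a pyspark SparkContext and `documents_rdd` is an
    RDD whose elements are document texts.
    """
    vocab_size = (
        documents_rdd
        .flatMap(lambda text: text.lower().split())  # one element per word
        .distinct()                                  # drop repeated words
        .count()                                     # number of distinct words
    )
    # Workers can read the value later through the returned handle's .value
    return sc.broadcast(vocab_size)


if __name__ == "__main__":
    sample = ["the cat sat", "The dog sat"]
    print(distinct_word_count(sample))  # 4
```

With the broadcast handle in hand, later functions could read `handle.value` inside their transformations instead of taking the count as an extra argument.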