jbrukh / bayesian

Naive Bayesian Classification for Golang.

Prior probability includes word frequencies? #18

Open muety opened 7 years ago

muety commented 7 years ago

This is more of a question than an actual issue, but anyway.

First, did I get it right that the prior probability P(C_j) of a class is the number of documents in that class divided by the total number of documents?

And if so, why does the getPriors() function set the prior probability of a class C to the number of words in the documents of that class (classData.Total) divided by the total number of words? I'd expect words not to play any role in the prior yet.
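To make the difference concrete, here is a minimal sketch of the two ways of computing the prior. The type `classStats` and the functions `priorsByDocs` and `priorsByWords` are made-up names for illustration only; this is not the library's actual code, just my reading of the two definitions.

```go
package main

import "fmt"

// classStats is a hypothetical stand-in for per-class training data.
// It is NOT the library's actual type.
type classStats struct {
	Docs  int // number of training documents in the class
	Words int // total number of word occurrences in the class
}

// priorsByDocs computes P(C_j) = documents in class j / all documents.
func priorsByDocs(classes []classStats) []float64 {
	total := 0
	for _, c := range classes {
		total += c.Docs
	}
	priors := make([]float64, len(classes))
	if total == 0 {
		return priors // nothing learned yet: all priors stay 0
	}
	for i, c := range classes {
		priors[i] = float64(c.Docs) / float64(total)
	}
	return priors
}

// priorsByWords computes P(C_j) = words in class j / all words,
// which is the behaviour described in the question above.
func priorsByWords(classes []classStats) []float64 {
	total := 0
	for _, c := range classes {
		total += c.Words
	}
	priors := make([]float64, len(classes))
	if total == 0 {
		return priors
	}
	for i, c := range classes {
		priors[i] = float64(c.Words) / float64(total)
	}
	return priors
}

func main() {
	classes := []classStats{
		{Docs: 2, Words: 300}, // few, long documents
		{Docs: 8, Words: 100}, // many, short documents
	}
	fmt.Println(priorsByDocs(classes))  // [0.2 0.8]
	fmt.Println(priorsByWords(classes)) // [0.75 0.25]
}
```

With these made-up counts the two definitions even disagree about which class is more likely a priori, so the choice can matter whenever document lengths differ a lot between classes.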

I'm probably misunderstanding something, so please enlighten me.