github-linguist / linguist

Language Savant. If your repository's language is being reported incorrectly, send us a pull request!
MIT License
12.11k stars 4.19k forks

Bayesian classifier #439

Closed (ruv closed 11 years ago)

ruv commented 11 years ago

As it stands, the classifier is not really Bayesian. (Where did you find Math.log in a Bayesian classifier? ;)

What's more, it seems to be incorrect:

The tokens_probability function is documented as "Returns Float between 0.0 and 1.0", but it returns negative numbers. Why?

The implementation of the language_probability function seems fundamentally incorrect.

See also issue #437

frankshearar commented 11 years ago

The classifier works in log probabilities.
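
To illustrate why the returned values are negative, here is a minimal Ruby sketch. It is not Linguist's actual code: the `TOKEN_PROBS` table and both helper methods are hypothetical, made up for this example. It assumes a naive Bayes style score, where per-token probabilities are multiplied together. Summing their logarithms gives the same ranking, but the result is always at or below 0.0 rather than between 0.0 and 1.0, and it avoids floating-point underflow on long inputs.

```ruby
# Hypothetical per-token probabilities P(token | language).
# Illustrative values only, not real Linguist training data.
TOKEN_PROBS = {
  "def"  => 0.05,
  "end"  => 0.08,
  "puts" => 0.01
}.freeze

# Plain product of probabilities: stays within 0.0..1.0,
# but underflows to 0.0 once enough small factors are multiplied.
def product_probability(tokens, probs)
  tokens.reduce(1.0) { |acc, token| acc * probs.fetch(token) }
end

# Sum of log probabilities: each term is <= 0.0, so the total is too.
# Math.exp of the total recovers the plain product (when it is representable).
def log_probability(tokens, probs)
  tokens.sum(0.0) { |token| Math.log(probs.fetch(token)) }
end

short_input = %w[def puts end]
puts product_probability(short_input, TOKEN_PROBS)        # small positive number
puts log_probability(short_input, TOKEN_PROBS)            # negative number
puts Math.exp(log_probability(short_input, TOKEN_PROBS))  # matches the product

long_input = %w[def] * 1000
puts product_probability(long_input, TOKEN_PROBS)  # 0.0 (underflow)
puts log_probability(long_input, TOKEN_PROBS)      # still a finite negative number
```

Because the logarithm is monotonic, comparing log scores across languages picks the same winner as comparing the raw products would.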

ruv commented 11 years ago

Thank you. Sorry for my ignorance =) I think we should correct the doc comment for the tokens_probability function.