wikimedia / revscoring

A generic, machine learning-based revision scoring system for MediaWiki
https://revscoring.readthedocs.io
MIT License

Test for multinomial Naive Bayes fails because of negative features #65

Closed: halfak closed this issue 9 years ago

halfak commented 9 years ago

Write a test for multinomial Naive Bayes that does not include negative feature values.
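
A minimal sketch of what such test data could look like, using only the standard library. The helper names `make_non_negative_training_set` and `has_negative_feature` are hypothetical and are not part of revscoring; the label rule is arbitrary and exists only to give each row a label.

```python
import random


def make_non_negative_training_set(n_rows, n_features, seed=0):
    """Build (feature_values, label) pairs whose features are all counts >= 0.

    Hypothetical helper for illustration; not part of revscoring.
    """
    rng = random.Random(seed)
    rows = []
    for _ in range(n_rows):
        # randint(0, 20) can never return a negative value
        values = [rng.randint(0, 20) for _ in range(n_features)]
        # Arbitrary label rule, only so that every row carries a label
        label = sum(values) > 10 * n_features
        rows.append((values, label))
    return rows


def has_negative_feature(training_set):
    """Return True if any feature value in the set is below zero."""
    return any(v < 0 for values, _ in training_set for v in values)


train_set = make_non_negative_training_set(n_rows=10, n_features=3)
assert not has_negative_feature(train_set)
```

A set built this way could be given to the multinomial model in place of shared test data that contains negative values, since multinomial Naive Bayes expects non-negative inputs such as counts.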

he7d3r commented 9 years ago

This looks related to the following error:

(3.4) helder@std:~/projects/revscoring
$ nosetests
.....................................................................E.F......
======================================================================
ERROR: revscoring.scorers.tests.test_nb.test_multinomial_nb
----------------------------------------------------------------------
Traceback (most recent call last):
  File "/home/helder/env/3.4/lib/python3.4/site-packages/nose/case.py", line 198, in runTest
    self.test(*self.arg)
  File "/home/helder/projects/revscoring/revscoring/scorers/tests/test_nb.py", line 11, in test_multinomial_nb
    train_score(model)
  File "/home/helder/projects/revscoring/revscoring/scorers/tests/util.py", line 39, in train_score
    model.train(train_set)
  File "/home/helder/projects/revscoring/revscoring/scorers/scorer.py", line 237, in train
    self.classifier_model.fit(values, labels)
  File "/home/helder/env/3.4/lib/python3.4/site-packages/sklearn/naive_bayes.py", line 324, in fit
    self._count(X, Y)
  File "/home/helder/env/3.4/lib/python3.4/site-packages/sklearn/naive_bayes.py", line 426, in _count
    raise ValueError("Input X must be non-negative")
ValueError: Input X must be non-negative

======================================================================
FAIL: revscoring.scorers.tests.test_scorer.test_scorer
----------------------------------------------------------------------
Traceback (most recent call last):
  File "/home/helder/env/3.4/lib/python3.4/site-packages/nose/case.py", line 198, in runTest
    self.test(*self.arg)
  File "/home/helder/projects/revscoring/revscoring/scorers/tests/test_scorer.py", line 27, in test_scorer
    eq_(score_doc['divide'], 3/5)
AssertionError: 1.6666666666666667 != 0.6

----------------------------------------------------------------------
Ran 78 tests in 2.797s

FAILED (errors=1, failures=1)
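
As a side note on the second failure in the log: the reported value 1.6666666666666667 equals 5/3, the reciprocal of the expected 3/5. One possible reading, which is an assumption and not something the log confirms, is that the two operands reach the division in the opposite order. The arithmetic can be checked with a short sketch (the variable names are illustrative only):

```python
import math

numerator, denominator = 3, 5

expected = numerator / denominator   # what the test asserts: 3/5
swapped = denominator / numerator    # same operands in the opposite order: 5/3

assert math.isclose(expected, 0.6)
assert math.isclose(swapped, 1.6666666666666667)
```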