**fritzo** opened this issue 9 years ago
The `lp` version clearly fails where the `dbg` version passes, using the scripts

```
$ python derivations/vector_gof.py plot-cdf
$ python derivations/vector_gof.py scatter
```

See the plots below.
Disabling the `SkipTest`, we see

```
$ nosetests -v distributions/tests/test_models.py:test_sample_value
distributions.tests.test_models.test_sample_value('dbg.models.bb',) ... ok
distributions.tests.test_models.test_sample_value('dbg.models.dpd',) ... ok
distributions.tests.test_models.test_sample_value('dbg.models.bnb',) ... ok
distributions.tests.test_models.test_sample_value('lp.models.dpd',) ... ok
distributions.tests.test_models.test_sample_value('lp.models.bb',) ... ok
distributions.tests.test_models.test_sample_value('lp.models.bnb',) ... ok
distributions.tests.test_models.test_sample_value('dbg.models.nich',) ... ok
distributions.tests.test_models.test_sample_value('hp.models.nich',) ... ok
distributions.tests.test_models.test_sample_value('dbg.models.dd',) ... ok
distributions.tests.test_models.test_sample_value('dbg.models.gp',) ... ok
distributions.tests.test_models.test_sample_value('hp.models.gp',) ... ok
distributions.tests.test_models.test_sample_value('hp.models.dd',) ... ok
distributions.tests.test_models.test_sample_value('dbg.models.niw',) ... ok
distributions.tests.test_models.test_sample_value('lp.models.nich',) ... ok
distributions.tests.test_models.test_sample_value('lp.models.gp',) ... ok
distributions.tests.test_models.test_sample_value('lp.models.dd',) ... ok
distributions.tests.test_models.test_sample_value('lp.models.niw',) ... FAIL

======================================================================
FAIL: distributions.tests.test_models.test_sample_value('lp.models.niw',)
----------------------------------------------------------------------
Traceback (most recent call last):
  File "/home/fritz/.virtualenvs/posterior/local/lib/python2.7/site-packages/nose/case.py", line 197, in runTest
    self.test(*self.arg)
  File "/home/fritz/posterior/distributions/distributions/tests/test_models.py", line 104, in test_one_model
    test_fun(module, EXAMPLE)
  File "/home/fritz/posterior/distributions/distributions/tests/test_models.py", line 402, in test_sample_value
    assert_greater(gof, MIN_GOODNESS_OF_FIT)
AssertionError: 0.00018875054387771975 not greater than 0.001
-------------------- begin captured stdout ---------------------
example 1/4
 Prob Count
0.048   543 ------------------------------------------------------------
0.048   539 ------------------------------------------------------------
0.048   527 ----------------------------------------------------------
0.048   511 --------------------------------------------------------
0.048   504 --------------------------------------------------------
0.048   495 -------------------------------------------------------
0.048   487 ------------------------------------------------------
0.048   485 ------------------------------------------------------
0.048   472 ----------------------------------------------------
0.048   470 ----------------------------------------------------
0.048   468 ----------------------------------------------------
0.048   465 ---------------------------------------------------
0.048   462 ---------------------------------------------------
0.048   462 ---------------------------------------------------
0.048   459 ---------------------------------------------------
0.048   455 --------------------------------------------------
0.048   454 --------------------------------------------------
0.048   451 --------------------------------------------------
0.048   449 --------------------------------------------------
0.048   424 -----------------------------------------------
0.048   418 ----------------------------------------------
distributions.lp.models.niw gof = 0.000189
--------------------- end captured stdout ----------------------
----------------------------------------------------------------------
Ran 17 tests in 31.970s

FAILED (failures=1)
```
@stephentu FYI
```
$ python derivations/vector_gof.py plot-cdf
```

```
$ python derivations/vector_gof.py scatter
```

(reds and blues should be uniformly distributed)
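The uniformity claim above can be illustrated with a probability-integral-transform check (this is a minimal sketch of the idea, not the actual `vector_gof.py` code): pushing each sample through its own CDF should yield values uniform on [0, 1], and a Kolmogorov-Smirnov distance quantifies any deviation.

```python
import math
import random

def normal_cdf(x, mu=0.0, sigma=1.0):
    """CDF of a normal distribution, via the error function."""
    return 0.5 * (1.0 + math.erf((x - mu) / (sigma * math.sqrt(2.0))))

def pit_values(sampler, cdf, n=2000, seed=0):
    """Push each sample through its own CDF; a correct sampler
    yields values that are uniform on [0, 1]."""
    rng = random.Random(seed)
    return sorted(cdf(sampler(rng)) for _ in range(n))

def ks_statistic(sorted_u):
    """Kolmogorov-Smirnov distance between the empirical CDF of
    sorted_u and the uniform CDF on [0, 1]."""
    n = len(sorted_u)
    return max(
        max((i + 1) / n - u, u - i / n)
        for i, u in enumerate(sorted_u)
    )

# Sanity check against a known-correct sampler: the PIT of
# Gaussian draws should look uniform, so the KS distance is small.
u = pit_values(lambda rng: rng.gauss(0.0, 1.0), normal_cdf)
d = ks_statistic(u)
```

A buggy sampler (e.g. one with the wrong covariance scaling) would push `d` well above the ~`1.63 / sqrt(n)` critical value.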
The new, stronger statistical tests indicate a bug in the C++ version of the Normal-Inverse-Wishart sampler. I assume the bug is in the sampling code in `random.hpp`: the Python sampler and scorer agree (per `test_models.py:test_sample_value`), and the Python and C++ scorers agree (per `test_model_flavors.py:test_group`), pointing to C++ sampling as the culprit. This test is now disabled in the unit tests.
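For intuition, a sampler-vs-scorer goodness-of-fit check in the spirit of `test_sample_value` can be sketched as follows (an illustrative toy on a discrete distribution, not the repository's actual test): draw many samples, count them, and compare empirical counts against the scorer's probabilities with a Pearson chi-square statistic. A sampler that disagrees with its scorer, as `lp.models.niw` does here, blows this statistic up.

```python
import random

def chi_square_gof(probs, counts):
    """Pearson chi-square statistic comparing observed counts
    to expected counts under probs."""
    total = sum(counts)
    return sum(
        (c - total * p) ** 2 / (total * p)
        for p, c in zip(probs, counts)
    )

# "Scorer": a known categorical distribution.
probs = [0.1, 0.2, 0.3, 0.4]

# "Sampler": inverse-CDF draws from the same distribution.
rng = random.Random(0)
n = 10000
counts = [0] * len(probs)
for _ in range(n):
    r = rng.random()
    acc = 0.0
    for i, p in enumerate(probs):
        acc += p
        if r < acc:
            counts[i] += 1
            break

# If sampler and scorer agree, this is ~chi-square with 3 dof,
# so values far above ~16 (the 99.9% quantile) indicate a bug.
stat = chi_square_gof(probs, counts)
```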