mahmoud / lithoxyl

Application instrumentation and logging, with a geological bent.
Test lithoxyl/tests/test_stats.py::test_momentacc_norm is dependent on the random seed #11

Open sturmianseq opened 3 years ago

sturmianseq commented 3 years ago

If you add random.seed(0) to the beginning of the test lithoxyl/tests/test_stats.py::test_momentacc_norm and run pytest lithoxyl/tests/test_stats.py::test_momentacc_norm, you will see this error message:

____________________________________________________________________ test_momentacc_norm ____________________________________________________________________

    def test_momentacc_norm():
        random.seed(0)
        ma = MomentAccumulator()
        for v in [random.gauss(10, 4) for i in range(5000)]:
            ma.add(v)
        _assert_round_cmp(10, abs(ma.mean), mag=1)
        _assert_round_cmp(4, ma.std_dev, mag=1)
>       _assert_round_cmp(0, ma.skewness, mag=1)

lithoxyl/tests/test_stats.py:49: 
_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _

a = 0, b = 0.07368173837804089, mag = 1, name = None

    def _assert_round_cmp(a, b, mag=3, name=None):
        thresh = 1.0 / (10 ** mag)
        abs_diff = round(abs(a - b), mag + 1)
        tmpl = 'round-compare failed at %d digits (%f - %f = %f > %f)'
        rel_diff = (2 * abs_diff) / (a + b)
        err_msg = tmpl % (mag, a, b, abs_diff, thresh)
        if name:
            err_msg = '%r %s' % (name, err_msg)
>       assert rel_diff < thresh, err_msg
E       AssertionError: round-compare failed at 1 digits (0.000000 - 0.073682 = 0.070000 > 0.100000)
E       assert 1.9000637482478797 < 0.1

IMHO, the current way of asserting that the skewness is close to 0 is not reasonable. With a = 0 and mag = 1, the check reduces to 2*round(abs(skewness), 2) / skewness < 0.1. For a positive skewness this holds only when round(skewness, 2) = 0, i.e. roughly when skewness < 0.005; a negative skewness passes trivially because the ratio is negative. However, the skewness of 5000 random samples can often be greater than 0.005.
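To make the point concrete, here is a minimal sketch that reproduces the relative difference computed by _assert_round_cmp for the value in the traceback, next to one possible alternative. The helper names rel_diff_as_in_test and assert_close are hypothetical and not part of lithoxyl; assert_close uses the standard library's math.isclose with an absolute tolerance, which is an assumption about what an acceptable fix could look like, not the maintainer's intended behavior.

```python
import math


def rel_diff_as_in_test(a, b, mag):
    """Reproduce the relative difference computed by _assert_round_cmp."""
    abs_diff = round(abs(a - b), mag + 1)
    return (2 * abs_diff) / (a + b)


def assert_close(expected, actual, mag=3):
    """Hypothetical replacement: tolerance of 10**-mag, applied as an
    absolute bound near zero and as a relative bound otherwise."""
    thresh = 1.0 / (10 ** mag)
    assert math.isclose(expected, actual, rel_tol=thresh, abs_tol=thresh), (
        'expected %r to be within %r of %r' % (actual, thresh, expected))


skew = 0.07368173837804089  # value reported in the failure above

print(rel_diff_as_in_test(0, skew, 1))   # about 1.9, so the original check fails
print(rel_diff_as_in_test(0, -skew, 1))  # negative, so the original check passes
assert_close(0, skew, mag=1)             # passes: abs(0 - skew) <= 0.1
```

Note that math.isclose scales the relative tolerance by the larger of the two magnitudes rather than by their mean, so this is slightly looser than the original formula for non-zero expected values.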

sturmianseq commented 3 years ago

Test lithoxyl/tests/test_stats.py::test_acc_random is also dependent on the random seed. It passes when the whole test module is run (i.e., pytest lithoxyl/tests/test_stats.py), but fails when run in isolation (i.e., pytest lithoxyl/tests/test_stats.py::test_acc_random), with the following failure message:

============================================================================== FAILURES ===============================================================================
___________________________________________________________________________ test_acc_random ___________________________________________________________________________

    def test_acc_random():
        data = test_sets['random.random 0.0-1.0']

        qa = ReservoirAccumulator(data)
        capqa = ReservoirAccumulator(data, cap=True)
        p2qa = P2Accumulator(data)
        for acc in (qa, capqa, p2qa):
            for qp, v in acc.get_quantiles():
                if qp > 0:
>                   assert 0.95 < (v / qp) < 1.05
E                   assert (0.010605121375992893 / 0.01) < 1.05

lithoxyl/tests/test_stats.py:106: AssertionError
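One way to remove the dependence on the seed and on test order is sketched below, assuming the 'random.random 0.0-1.0' data set is generated with the global random module (the issue does not show how test_sets is built). The names make_uniform_data and relative_quantile_stderr, the sample size 10000 and the seed 0 are placeholders for illustration, not values taken from lithoxyl.

```python
import math
import random


def make_uniform_data(n, seed):
    """Build test data from a private, seeded generator so the values do not
    depend on the global random state or on which tests ran earlier."""
    rng = random.Random(seed)
    return [rng.random() for _ in range(n)]


def relative_quantile_stderr(q, n):
    """Approximate standard error of the sample q-quantile of U(0, 1),
    expressed relative to q: sqrt(q * (1 - q) / n) / q."""
    return math.sqrt(q * (1 - q) / n) / q


# n = 10000 and seed = 0 are placeholders, not values taken from lithoxyl.
data_a = make_uniform_data(10000, seed=0)
data_b = make_uniform_data(10000, seed=0)
assert data_a == data_b  # same seed, same data, regardless of test order

# At q = 0.01 the relative standard error is close to 10%, so a 5% band
# around 1.0 for v / qp can easily be missed when the data is not fixed.
print(relative_quantile_stderr(0.01, 10000))
```

With a fixed data set the 0.95 < v / qp < 1.05 check becomes deterministic, though the band may still need widening for the smallest quantiles if the chosen seed happens to fall outside it.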