cta-observatory / ctapipe

Low-level data processing pipeline software for CTAO or similar arrays of Imaging Atmospheric Cherenkov Telescopes
https://ctapipe.readthedocs.org
BSD 3-Clause "New" or "Revised" License

test_toy/test_intensity[None] fails on macOS #1809

Closed kosack closed 2 years ago

kosack commented 2 years ago

The following line seems to occasionally lead to a random test failure:

https://github.com/cta-observatory/ctapipe/blob/22655d189d6fe1330e01891ff12296b621313bdf/ctapipe/image/tests/test_toy.py#L44-L45

E       assert 177.0 <= 168
E        +  where 177.0 = <bound method rv_frozen.ppf of <scipy.stats._distn_infrastructure.rv_frozen object at 0x7fdb48888490>>(0.05)
E        +    where <bound method rv_frozen.ppf of <scipy.stats._distn_infrastructure.rv_frozen object at 0x7fdb48888490>> = <scipy.stats._distn_infrastructure.rv_frozen object at 0x7fdb48888490>.ppf
E        +      where <scipy.stats._distn_infrastructure.rv_frozen object at 0x7fdb48888490> = poisson(200)
E        +  and   168 = <built-in method sum of numpy.ndarray object at 0x7fdb203970f0>()
E        +    where <built-in method sum of numpy.ndarray object at 0x7fdb203970f0> = array([0, 0, 0, ..., 0, 0, 0]).sum

ctapipe/image/tests/test_toy.py:45: AssertionError
kosack commented 2 years ago

Actually, it's not intermittent: it always fails on my machine, so it is probably related to the random seed.

kosack commented 2 years ago

Looking deeper: when running only the toy tests, this test passes (at least over a few trials).

pytest ctapipe/image -k toy

platform darwin -- Python 3.8.8, pytest-6.2.2, py-1.10.0, pluggy-0.13.1 -- /Users/kkosack/miniconda3/envs/cta-0.10.5/bin/python
cachedir: .pytest_cache
rootdir: /Users/kkosack/Projects/CTA/Working/ctapipe, configfile: setup.cfg
plugins: cov-2.11.1, xdist-2.2.1, forked-1.3.0
collected 102 items / 95 deselected / 7 selected

ctapipe/image/tests/test_hillas.py::test_with_toy PASSED                                                    [ 14%]
ctapipe/image/tests/test_toy.py::test_intensity[None] PASSED                                                [ 28%]
ctapipe/image/tests/test_toy.py::test_intensity[0] PASSED                                                   [ 42%]
ctapipe/image/tests/test_toy.py::test_skewed PASSED                                                         [ 57%]
ctapipe/image/tests/test_toy.py::test_compare PASSED                                                        [ 71%]
ctapipe/image/tests/test_toy.py::test_obtain_time_image PASSED                                              [ 85%]
ctapipe/image/tests/test_toy.py::test_waveform_model PASSED                                                 [100%]

When running all tests, it fails every time with the same value (as if the same random seed is being used with the "None" option, but the sequence is affected by a previous use of the RandomState).

From the NumPy Docs:

If seed is None, then RandomState will try to read data from /dev/urandom (or the Windows analogue) if available or seed from the clock otherwise.

Perhaps /dev/urandom always gives the same sequence?
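It shouldn't: a minimal sketch using only NumPy shows the two behaviors under discussion here. Identical explicit seeds reproduce identical streams regardless of any other random-number use in between, while seed=None pulls fresh OS entropy (from /dev/urandom or equivalent) each time:

```python
import numpy as np

# Identical explicit seeds reproduce identical streams, regardless of any
# other random-number use in between.
rng_a = np.random.default_rng(0)
np.random.poisson(5, size=100)  # unrelated use of the legacy global state
rng_b = np.random.default_rng(0)
assert np.array_equal(rng_a.poisson(200, size=10), rng_b.poisson(200, size=10))

# With seed=None, the generator is seeded from OS entropy, so two
# generators almost surely produce different streams.
x = np.random.default_rng(None).integers(0, 2**62, size=4)
y = np.random.default_rng(None).integers(0, 2**62, size=4)
print(np.array_equal(x, y))  # False, with overwhelming probability
```

So if the same failing value shows up on every run, the seed cannot really be coming from fresh OS entropy on each call; something in the test setup must be fixing or sharing the generator state.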

pytest --exitfirst

...
ctapipe/image/tests/test_timing_parameters.py::test_psi_0 PASSED                                            [ 31%]
ctapipe/image/tests/test_timing_parameters.py::test_psi_20 PASSED                                           [ 32%]
ctapipe/image/tests/test_timing_parameters.py::test_ignore_negative PASSED                                  [ 32%]
ctapipe/image/tests/test_toy.py::test_intensity[None] FAILED                                                [ 32%]

==================================================== FAILURES =====================================================
______________________________________________ test_intensity[None] _______________________________________________

seed = None

    @pytest.mark.parametrize("seed", [None, 0])
    def test_intensity(seed):
        from ctapipe.image.toymodel import Gaussian

        geom = CameraGeometry.from_name("LSTCam")

        x, y = u.Quantity([0.2, 0.3], u.m)
        width = 0.05 * u.m
        length = 0.15 * u.m
        intensity = 200
        psi = "30d"

        # make a toymodel shower model
        model = Gaussian(x=x, y=y, width=width, length=length, psi=psi)

        if seed is None:
            _, signal, _ = model.generate_image(geom, intensity=intensity, nsb_level_pe=5)
        else:
            rng = np.random.default_rng(seed)
            _, signal, _ = model.generate_image(
                geom, intensity=intensity, nsb_level_pe=5, rng=rng
            )

        # test if signal reproduces given cog values
        assert np.average(geom.pix_x.to_value(u.m), weights=signal) == approx(0.2, rel=0.15)
        assert np.average(geom.pix_y.to_value(u.m), weights=signal) == approx(0.3, rel=0.15)

        # test if signal reproduces given width/length values
        cov = np.cov(geom.pix_x.value, geom.pix_y.value, aweights=signal)
        eigvals, _ = np.linalg.eigh(cov)

        assert np.sqrt(eigvals[0]) == approx(width.to_value(u.m), rel=0.15)
        assert np.sqrt(eigvals[1]) == approx(length.to_value(u.m), rel=0.15)

        # test if total intensity is inside in 99 percent confidence interval
>       assert poisson(intensity).ppf(0.05) <= signal.sum() <= poisson(intensity).ppf(0.95)
E       assert 177.0 <= 168
E        +  where 177.0 = <bound method rv_frozen.ppf of <scipy.stats._distn_infrastructure.rv_frozen object at 0x7fd382358070>>(0.05)
E        +    where <bound method rv_frozen.ppf of <scipy.stats._distn_infrastructure.rv_frozen object at 0x7fd382358070>> = <scipy.stats._distn_infrastructure.rv_frozen object at 0x7fd382358070>.ppf
E        +      where <scipy.stats._distn_infrastructure.rv_frozen object at 0x7fd382358070> = poisson(200)
E        +  and   168 = <built-in method sum of numpy.ndarray object at 0x7fd391cad3f0>()
E        +    where <built-in method sum of numpy.ndarray object at 0x7fd391cad3f0> = array([0, 0, 0, ..., 0, 0, 0]).sum

ctapipe/image/tests/test_toy.py:45: AssertionError

Possible solution:

Just remove "None" as a seed from the test and give two explicit seeds (e.g. 0 and 1).
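A sketch of that fix, with plain NumPy standing in for the real Gaussian model (the Poisson image below is purely illustrative, not the actual ctapipe generate_image output): parametrize over explicit seeds only and pass a freshly seeded Generator each time.

```python
import numpy as np
import pytest

@pytest.mark.parametrize("seed", [0, 1])  # explicit seeds replace [None, 0]
def test_intensity(seed):
    rng = np.random.default_rng(seed)
    # Stand-in for model.generate_image(geom, ..., rng=rng): a Poisson image
    # over 1855 pixels (the LSTCam pixel count).
    signal = rng.poisson(0.1, size=1855)
    # With an explicit seed, the image is exactly reproducible:
    reference = np.random.default_rng(seed).poisson(0.1, size=1855)
    assert np.array_equal(signal, reference)
```

With every parametrization seeded, a failure can be reproduced exactly, and the statistical assertions can use seeds that are known to satisfy them.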

Question: is the same seed guaranteed to produce the same random sequence on different architectures/OSes?

maxnoe commented 2 years ago

@kosack /dev/urandom is the device for cryptographically secure random numbers, so no, this should not be reproducible.

The numpy random generators should behave the same on different systems.

I would either remove the test variant that doesn't set a random seed, or set the seed of toymodel.TOY_RNG before the test.
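The second option could look like the sketch below. Note that `toymodel` here is a stand-in class, not the real ctapipe.image.toymodel module, and it assumes TOY_RNG is a module-level Generator used when no explicit seed is passed:

```python
import numpy as np

class toymodel:
    """Stand-in for ctapipe.image.toymodel (assumption: it exposes a
    module-level Generator named TOY_RNG used when no seed is passed)."""
    TOY_RNG = np.random.default_rng()

def generate_image_total():
    # Stand-in for the part of generate_image that draws from TOY_RNG.
    return toymodel.TOY_RNG.poisson(200, size=5)

# Re-seed the shared generator immediately before the test body, so the
# "no explicit seed" code path becomes deterministic and independent of
# whatever tests ran earlier in the session.
toymodel.TOY_RNG = np.random.default_rng(1337)
first = generate_image_total()
toymodel.TOY_RNG = np.random.default_rng(1337)
second = generate_image_total()
assert np.array_equal(first, second)
```

In a pytest suite, the re-seeding would naturally live in a fixture (e.g. via monkeypatch) so the original generator is restored after the test.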