next-exp / IC


Spurious test failures #551

Open jacg opened 5 years ago

jacg commented 5 years ago

Every now and then our Travis builds (or our local tests) fail spuriously. This sometimes wastes a lot of time and causes confusion. It's difficult to understand why these spurious failures occur, because they happen rarely, across different tests, and usually while we are in a hurry to do something else.

In order to gather information about these failures, so that we might be able to eliminate them, please add a comment to this issue every time you encounter such a spurious test failure.

Please mention the name of the test that failed, and information about exactly how it failed. If it happened in a Travis build, then provide a link to the build. If it happened on your local machine, then copy-paste a relevant portion of the traceback.

Also, every time you add a failure, please edit this list, so that we keep a summary of the failures we have seen.

jacg commented 5 years ago

test_mean_for_pmts_fee_is_unbiased

https://travis-ci.org/nextic/IC/jobs/443937686

jacg commented 5 years ago

Hmm, restarting the job makes the original build, which contains the failure, disappear. Does anybody know whether Travis keeps the original build somewhere?

If not, we'll need a copy-paste rather than a link.

jacg commented 5 years ago

=================================== FAILURES ===================================
____ test_inverse_cdf_hypothesis_generated ____
[gw3] linux -- Python 3.7.0 /home/travis/miniconda/envs/IC-3.7-2018-08-29/bin/python

@given(valid_distributions(),
       floats(min_value = 0.1, max_value = 0.9))
def test_inverse_cdf_hypothesis_generated(distribution, percentile):

invisible_cities/core/random_sampling_test.py:140:

../../../miniconda/envs/IC-3.7-2018-08-29/lib/python3.7/site-packages/hypothesis/core.py:587: in execute
    result = self.test_runner(data, run)
../../../miniconda/envs/IC-3.7-2018-08-29/lib/python3.7/site-packages/hypothesis/executors.py:58: in default_new_style_executor
    return function(data)
../../../miniconda/envs/IC-3.7-2018-08-29/lib/python3.7/site-packages/hypothesis/core.py:578: in run
    return test(*args, **kwargs)
invisible_cities/core/random_sampling_test.py:140: in test_inverse_cdf_hypothesis_generated
    floats(min_value = 0.1, max_value = 0.9))
../../../miniconda/envs/IC-3.7-2018-08-29/lib/python3.7/site-packages/hypothesis/core.py:525: in test
    result = self.test(*args, **kwargs)

distribution = ([-2.220446e-14, -2.220446e-14, -2.220446e-14, -2.220446e-14, -2.220446e-14, -2.220446e-14, ...], array([1.4552137e-07... 1.4552137e-07, 1.4552137e-07, 1.4552137e-07, 1.4552137e-07, 1.4552137e-07, 1.4552137e-07], dtype=float32))
percentile = 0.8999999761581423

@given(valid_distributions(),
       floats(min_value = 0.1, max_value = 0.9))
def test_inverse_cdf_hypothesis_generated(distribution, percentile):
    domain, freq = distribution
    cdf = cdf_from_freq(freq)
    for i, (d, cp) in enumerate(zip(domain, cdf)):
        if cp >= percentile:
            true_value = d
            break
    icdf = inverse_cdf(domain, cdf, percentile)
    assert true_value == icdf

E assert 0.0 == -2.220446e-14

invisible_cities/core/random_sampling_test.py:149: AssertionError
---------------------------------- Hypothesis ----------------------------------
Falsifying example: test_inverse_cdf_hypothesis_generated(distribution=([-2.220446e-14, -2.220446e-14, -2.220446e-14, -2.220446e-14, -2.220446e-14, -2.220446e-14, -2.220446e-14, -2.220446e-14, -2.220446e-14, 0.0], array([1.4552137e-07, 1.4552137e-07, 1.4552137e-07, 1.4552137e-07, 1.4552137e-07, 1.4552137e-07, 1.4552137e-07, 1.4552137e-07, 1.4552137e-07, 1.4552137e-07], dtype=float32)), percentile=0.8999999761581423)

jjgomezcadenas commented 5 years ago

@gonzaponte why the assert?? It should be a numpy_equal or similar, with a tolerance. Notice the failure: 0 is not equal to 10**-14!!! This should be easy to fix

gonzaponte commented 5 years ago

> why the assert??

I guess you mean why the ==

> It should be a numpy_equal or similar, with a tolerance. Notice the failure: 0 is not equal to 10**-14!!!

This is an unfortunate copy&paste from a similar test, I will fix it.

> This should be easy to fix

It is!
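For reference, the difference between the exact comparison and a tolerance-based one can be sketched with the numbers from the failing run. This is only an illustration: it uses `math.isclose` from the standard library rather than whatever helper the test ends up using, and the tolerance values are assumptions, not the project's settings.

```python
import math

true_value = 0.0            # value picked by the loop in the test
icdf       = -2.220446e-14  # value returned by inverse_cdf in the failing run

# Exact comparison, as in the failing assert.
exact_match = true_value == icdf

# Tolerance-based comparison. rel_tol and abs_tol are assumed values chosen
# for this sketch. abs_tol matters here because one of the values is 0.0,
# so a purely relative tolerance would never be satisfied.
close_match = math.isclose(icdf, true_value, rel_tol=1e-6, abs_tol=1e-12)

print(exact_match, close_match)
```

With these numbers the exact comparison fails and the tolerance-based one passes.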

mmkekic commented 5 years ago

=================================== FAILURES ===================================

____ test_inverse_cdf_hypothesis_generated ____
[gw3] linux -- Python 3.7.0 /home/travis/miniconda/envs/IC-3.7-2018-10-20/bin/python

@given(valid_distributions(),
       floats(min_value = 0.1, max_value = 0.9))
def test_inverse_cdf_hypothesis_generated(distribution, percentile):

invisible_cities/core/random_sampling_test.py:141:


../../../miniconda/envs/IC-3.7-2018-10-20/lib/python3.7/site-packages/hypothesis/core.py:610: in execute
    result = self.test_runner(data, run)
../../../miniconda/envs/IC-3.7-2018-10-20/lib/python3.7/site-packages/hypothesis/executors.py:58: in default_new_style_executor
    return function(data)
../../../miniconda/envs/IC-3.7-2018-10-20/lib/python3.7/site-packages/hypothesis/core.py:601: in run
    return test(*args, **kwargs)
invisible_cities/core/random_sampling_test.py:141: in test_inverse_cdf_hypothesis_generated
    floats(min_value = 0.1, max_value = 0.9))
../../../miniconda/envs/IC-3.7-2018-10-20/lib/python3.7/site-packages/hypothesis/core.py:548: in test
    result = self.test(*args, **kwargs)


distribution = ([-1.0214052e-12, -1.0214052e-12, -1.0214052e-12, -1.0214052e-12, -1.0214052e-12, -1.0214052e-12, ...], array([1.46089...-07, 1.460898e-07, 1.460898e-07, 1.460898e-07, 1.460898e-07, 1.460898e-07, 1.460898e-07], dtype=float32))
percentile = 0.8999999761581423

@given(valid_distributions(),
       floats(min_value = 0.1, max_value = 0.9))
def test_inverse_cdf_hypothesis_generated(distribution, percentile):
    domain, freq = distribution
    cdf = cdf_from_freq(freq)
    for i, (d, cp) in enumerate(zip(domain, cdf)):
        if cp >= percentile:
            true_value = d
            break
    icdf = inverse_cdf(domain, cdf, percentile)
    assert icdf == approx(true_value)

E assert -1.0214052e-12 == 0.0 ± 1.0e-12
E + where 0.0 ± 1.0e-12 = approx(0.0)

invisible_cities/core/random_sampling_test.py:150: AssertionError
---------------------------------- Hypothesis ----------------------------------
Falsifying example: test_inverse_cdf_hypothesis_generated(distribution=([-1.0214052e-12, -1.0214052e-12, -1.0214052e-12, -1.0214052e-12, -1.0214052e-12, -1.0214052e-12, -1.0214052e-12, -1.0214052e-12, -1.0214052e-12, 0.0], array([1.460898e-07, 1.460898e-07, 1.460898e-07, 1.460898e-07, 1.460898e-07, 1.460898e-07, 1.460898e-07, 1.460898e-07, 1.460898e-07, 1.460898e-07], dtype=float32)), percentile=0.8999999761581423)
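The `0.0 ± 1.0e-12` in this log is the default absolute tolerance of `approx`. When the expected value is exactly 0.0 the relative tolerance contributes nothing, so only that absolute tolerance applies. A minimal sketch of the rule, using a hypothetical helper `within` (not part of pytest or IC) that mimics the default behaviour:

```python
def within(actual, expected, rel=1e-6, abs_tol=1e-12):
    """Hypothetical stand-in for approx: the allowed deviation is the
    larger of the relative and the absolute tolerance."""
    tolerance = max(rel * abs(expected), abs_tol)
    return abs(actual - expected) <= tolerance

expected = 0.0              # true_value in the failing run
actual   = -1.0214052e-12   # icdf in the failing run

default_ok = within(actual, expected)                # defaults, as in the log
wider_ok   = within(actual, expected, abs_tol=1e-9)  # assumed wider tolerance

print(default_ok, wider_ok)
```

Note that in the falsifying example the two values are neighbouring entries of the domain, which suggests that the loop in the test and `inverse_cdf` disagree about which entry crosses the percentile. A wider tolerance would hide that disagreement rather than resolve it.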

mmkekic commented 5 years ago

____ test_pmap_event_id_selection ____
[gw1] linux -- Python 3.7.0 /home/travis/miniconda/envs/IC-3.7-2018-10-20/bin/python

@given(dictionaries(keys=integers(min_value=-1e5, max_value=1e5), values=pmaps(), max_size=5),
       lists(integers(min_value=-1e5, max_value=1e5)))
def test_pmap_event_id_selection(pmaps, events):
E hypothesis.errors.FailedHealthCheck: Data generation is extremely slow: Only produced 4 valid examples in 1.17 seconds (2 invalid ones and 0 exceeded maximum size). Try decreasing size of the data you're generating (with e.g.max_size or max_leaves parameters).
E See https://hypothesis.readthedocs.io/en/latest/healthchecks.html for more information about this. If you want to disable just this health check, add HealthCheck.too_slow to the suppress_health_check settings for this test.

invisible_cities/reco/pmaps_functions_test.py:83: FailedHealthCheck
---------------------------------- Hypothesis ----------------------------------
You can add @seed(320050419280151807443131340231084161863) to this test or run pytest with --hypothesis-seed=320050419280151807443131340231084161863 to reproduce this failure.

gonzaponte commented 5 years ago

[gw2] linux -- Python 3.7.0 /home/travis/miniconda/envs/IC-3.7-2018-11-07/bin/python

dark_spectrum_local = ((array([-5.        , -4.8989899 , -4.7979798 , -4.6969697 , -4.5959596 ,
       -4.49494949, -4.39393939, -4.29292929....56286665e+01, 3.25217578e+01, 2.30167373e+01, 1.62137485e+01,
       1.14155084e+01, 8.08971455e+00, 5.83604597e+00]))

    def test_scaled_dark_pedestal_pedestal(dark_spectrum_local):
        (bins, nsamples, scale, poisson_mean,
         pedestal_mean, pedestal_sigma,
         gain, gain_sigma, min_integral), expected_spectrum = dark_spectrum_local
    
        xs       = shift_to_bin_centers(bins)
        pedestal = spe.binned_gaussian_spectrum(pedestal_mean, pedestal_sigma, nsamples, bins)
        f        = spe.scaled_dark_pedestal(pedestal,
                                            pedestal_mean, pedestal_sigma,
                                            min_integral)
        actual_spectrum = f(xs, scale, poisson_mean,
                            gain, gain_sigma)
    
        x0, s0    = pedestal_mean, pedestal_sigma
        selection = in_range(shift_to_bin_centers(bins),   x0 - 5 * s0,    x0 + 5 * s0)
        pull      = expected_spectrum[selection]   -  actual_spectrum[selection]
        pull     /= expected_spectrum[selection]**0.5
>       assert np.all(in_range(pull, -2.5, 2.5))
E       assert False
E        +  where False = <function all at 0x7f8f21448620>(array([ True,  True,  True,  True,  True, False,  True,  True,  True,\n        True,  True,  True,  True,  True,  True,... True,  True,  True,  True,  True,  True,  True,  True,\n        True,  True,  True,  True,  True,  True,  True,  True]))
E        +    where <function all at 0x7f8f21448620> = np.all
E        +    and   array([ True,  True,  True,  True,  True, False,  True,  True,  True,\n        True,  True,  True,  True,  True,  True,... True,  True,  True,  True,  True,  True,  True,  True,\n        True,  True,  True,  True,  True,  True,  True,  True]) = in_range(array([ 0.24724279, -0.63518977, -0.34419627,  0.51071258, -0.29183231,\n       -2.54954483, -0.18808316, -0.19375065, ...245982,  0.26678097, -0.7749531 ,  0.41749303, -0.18162616,\n       -0.73069211, -0.18234846, -1.00615231, -0.18408912]), -2.5, 2.5)

gonzaponte commented 5 years ago

____________________________ test_fill_kdst_var_1d _____________________________
[gw2] linux -- Python 3.7.0 /home/travis/miniconda/envs/IC-3.7-2018-11-07/bin/python

    @given(data_frames(columns=columns(kdst_variables, elements=floats(allow_nan=False))))
>   @settings(deadline=None)
    def test_fill_kdst_var_1d(kdst):
E   hypothesis.errors.FailedHealthCheck: Data generation is extremely slow: Only produced 7 valid examples in 1.05 seconds (0 invalid ones and 0 exceeded maximum size). Try decreasing size of the data you're generating (with e.g.max_size or max_leaves parameters).
E   See https://hypothesis.readthedocs.io/en/latest/healthchecks.html for more information about this. If you want to disable just this health check, add HealthCheck.too_slow to the suppress_health_check settings for this test.

invisible_cities/reco/monitor_functions_test.py:420: FailedHealthCheck