caporaso-lab / genome-sampler

https://caporasolab.us/genome-sampler/
BSD 3-Clause "New" or "Revised" License

Stochastic test failure in TestSubsampleNeighbors.test_sample_cluster_missing_locales #80

Closed by thermokarst 4 years ago

thermokarst commented 4 years ago
=================================== FAILURES ===================================
__________ TestSubsampleNeighbors.test_sample_cluster_missing_locales __________

self = <genome_sampler.tests.test_sample_neighbors.TestSubsampleNeighbors testMethod=test_sample_cluster_missing_locales>

    def test_sample_cluster_missing_locales(self):
        columns = ['context_id', 'n_mismatches', 'locale']
        cluster = pd.DataFrame([['c4', 5, 'abc'],
                                ['c2', 0, float('nan')],
                                ['c99', 1, float('nan')],
                                ['c42', 2, 'abc']],
                               columns=columns)

        count_obs_c4 = 0
        count_obs_c2 = 0
        count_obs_c99 = 0
        count_obs_c42 = 0

        for _ in range(self._N_TEST_ITERATIONS):
            obs = _sample_cluster(cluster, 3, np.random.RandomState())
            self.assertEqual(len(obs), 3)
            if 'c4' in obs:
                count_obs_c4 += 1
            if 'c2' in obs:
                count_obs_c2 += 1
            if 'c99' in obs:
                count_obs_c99 += 1
            if 'c42' in obs:
                count_obs_c42 += 1

        # c4 and c42 both have locale "abc" and c99 and c2 have unknown locale,
        # so we expect to see c99 and c2 more frequently
        self.assertTrue(count_obs_c99 > count_obs_c4)
        self.assertTrue(count_obs_c99 > count_obs_c42)
>       self.assertTrue(count_obs_c2 > count_obs_c4)
E       AssertionError: False is not true
thermokarst commented 4 years ago

I ran into this again.

thermokarst commented 4 years ago

This has actually happened 5 or 6 times today during development, so I think this test is more fragile than I had initially realized.
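
One way to put a number on "fragile" is to repeat the whole test body many times and count how often any of the strict comparisons fails. The sketch below is only an illustration: toy_sample is a hypothetical stand-in for _sample_cluster, the weights are assumed rather than taken from genome-sampler, and the standard library's random.Random is used in place of np.random.RandomState to keep it dependency-free.

```python
import random


def toy_sample(ids, weights, k, rng):
    # Hypothetical stand-in for _sample_cluster: draw k distinct ids,
    # weighted, without replacement.
    pool = list(zip(ids, weights))
    chosen = []
    for _ in range(k):
        pool_weights = [weight for _, weight in pool]
        index = rng.choices(range(len(pool)), weights=pool_weights)[0]
        chosen.append(pool.pop(index)[0])
    return chosen


def trial_passes(n_iterations, rng):
    ids = ['c4', 'c2', 'c99', 'c42']
    weights = [1.0, 1.5, 1.5, 1.0]  # assumed weights, not genome-sampler's
    counts = dict.fromkeys(ids, 0)
    for _ in range(n_iterations):
        for context_id in toy_sample(ids, weights, 3, rng):
            counts[context_id] += 1
    # Mirrors the three comparisons visible in the traceback.
    return (counts['c99'] > counts['c4']
            and counts['c99'] > counts['c42']
            and counts['c2'] > counts['c4'])


def estimate_flake_rate(n_trials, n_iterations, seed=0):
    """Fraction of simulated test runs in which some comparison fails."""
    rng = random.Random(seed)
    failures = sum(not trial_passes(n_iterations, rng)
                   for _ in range(n_trials))
    return failures / n_trials


print(estimate_flake_rate(n_trials=200, n_iterations=50))
```

The printed rate depends entirely on the assumed weights and iteration count, so it says nothing about the real test beyond showing how such an estimate could be made.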

thermokarst commented 4 years ago

And again, but this time a different assertion failed:

self.assertTrue(count_obs_c99 > count_obs_c4)
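
Because the test builds np.random.RandomState() without a seed, each run draws a different sequence, so a given failure cannot be replayed. A minimal sketch of the seeded alternative follows; it uses the standard library's random.Random and a uniform rng.sample as a hypothetical stand-in for _sample_cluster (np.random.RandomState also accepts a seed argument):

```python
import random


def draw_counts(seed, n_iterations=20):
    # A fixed seed gives the same sequence of draws on every run.
    rng = random.Random(seed)
    ids = ['c4', 'c2', 'c99', 'c42']
    counts = dict.fromkeys(ids, 0)
    for _ in range(n_iterations):
        # Uniform stand-in for the real sampler: 3 distinct ids per draw.
        for context_id in rng.sample(ids, 3):
            counts[context_id] += 1
    return counts


# Two runs with the same seed produce identical counts.
print(draw_counts(seed=7) == draw_counts(seed=7))  # True
```

Seeding makes a failing run reproducible, but it does not by itself show that the sampler's bias is correct; it only pins down which draws the assertions see.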