SepShr / MLCSHE

This repo houses the ML-Component Systemic Hazard Envelope project, or MLCSHE (pronounced /'mɪlʃ/).

Flaky test #28

Closed · donghwan-shin closed this issue 2 years ago

donghwan-shin commented 2 years ago

I found a flaky test, test_update_archive_diverse_best_random_testInput1, that sometimes passes and sometimes fails at random.

Below you can find the failed execution log:

/Users/donghwan.shin/Repositories/MLCSHE/venv/bin/python /Applications/PyCharm.app/Contents/plugins/python/helpers/pycharm/_jb_pytest_runner.py --target test_update_archive_diverse_best_random.py::TestUpdateArchiveDiverseBestRandom.test_update_archive_diverse_best_random_testInput1
Testing started at 3:24 PM ...
Launching pytest with arguments test_update_archive_diverse_best_random.py::TestUpdateArchiveDiverseBestRandom::test_update_archive_diverse_best_random_testInput1 --no-header --no-summary -q in /Users/donghwan.shin/Repositories/MLCSHE/tests

============================= test session starts ==============================
collecting ... collected 1 item

test_update_archive_diverse_best_random.py::TestUpdateArchiveDiverseBestRandom::test_update_archive_diverse_best_random_testInput1 FAILED [100%]
test_update_archive_diverse_best_random.py:8 (TestUpdateArchiveDiverseBestRandom.test_update_archive_diverse_best_random_testInput1)
self = <tests.test_update_archive_diverse_best_random.TestUpdateArchiveDiverseBestRandom testMethod=test_update_archive_diverse_best_random_testInput1>

    def test_update_archive_diverse_best_random_testInput1(self):
        # make a solver instance
        solver = ICCEA(
            creator=problem.creator,
            toolbox=problem.toolbox,
            enumLimits=problem.enumLimits
        )

        # prepare test input
        scen1 = creator.Scenario([1, False, 5.0])
        scen1.fitness.values = (10.0,)
        mlco1 = creator.OutputMLC([[8, 'a'], [2, 'b']])
        mlco1.fitness.values = (10.0,)
        scen2 = creator.Scenario([4, True, -7.8])
        scen2.fitness.values = (8.5,)
        scen3 = creator.Scenario([-2, False, 4.87])
        scen3.fitness.values = (random.randint(-10, 8),)
        mlco2 = creator.OutputMLC([[1, 'a'], [21, 'd']])
        mlco2.fitness.values = (8.5,)
        mlco3 = creator.OutputMLC([[-2, 'e'], [10, 'f']])
        mlco3.fitness.values = (random.randint(-10, 8),)
        scen4 = creator.Scenario([2, True, 0.24])
        scen4.fitness.values = (random.randint(-10, 8),)
        mlco4 = creator.OutputMLC([[4, 'g'], [-1, 'h']])
        mlco4.fitness.values = (random.randint(-10, 8),)
        pScen = [scen1, scen2, scen3, scen4]
        pMLCO = [mlco1, mlco2, mlco3, mlco4]

        max_archive_size = 1
        min_distance = 0.5

        output_archive_pScen_1 = solver.update_archive_diverse_best_random(
            pScen, max_archive_size, min_distance
        )

        output_archive_pMLCO_1 = solver.update_archive_diverse_best_random(
            pMLCO, max_archive_size, min_distance
        )

        self.assertEqual(output_archive_pScen_1, [scen1])
        self.assertEqual(len(output_archive_pScen_1), 1)

        self.assertEqual(output_archive_pMLCO_1, [mlco1])
        self.assertEqual(len(output_archive_pMLCO_1), 1)

        max_archive_size = 2

        output_archive_pScen_2 = solver.update_archive_diverse_best_random(
            pScen, max_archive_size, min_distance
        )

        output_archive_pMLCO_2 = solver.update_archive_diverse_best_random(
            pMLCO, max_archive_size, min_distance
        )

>       self.assertIn(output_archive_pScen_2[1], [scen2, scen3, scen4])
E       IndexError: list index out of range

test_update_archive_diverse_best_random.py:64: IndexError

============================== 1 failed in 0.15s ===============================

Process finished with exit code 1

I think this is mainly because, even though max_archive_size = 2, the archive returned by your code can contain fewer than 2 elements.

Also, it would be nice if you could provide a comment for each test assertion so that anyone can better understand the objective of the test.
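
For example, the final assertions could be written along these lines (just a sketch reusing the names from the test above; the exact expectations are only a suggestion, not the intended specification):

        # The archive must never exceed the requested maximum size,
        # but (as noted above) it may be smaller than max_archive_size.
        self.assertLessEqual(len(output_archive_pScen_2), max_archive_size)

        # The best individual (scen1, with the highest fitness) should always
        # be the first archive member.
        self.assertEqual(output_archive_pScen_2[0], scen1)

        # Any additional member, if present, must come from the rest of the population.
        for member in output_archive_pScen_2[1:]:
            self.assertIn(member, [scen2, scen3, scen4])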

donghwan-shin commented 2 years ago

The same thing happened for test_update_archive_best_random_testInput1:

/Users/donghwan.shin/Repositories/MLCSHE/venv/bin/python /Applications/PyCharm.app/Contents/plugins/python/helpers/pycharm/_jb_pytest_runner.py --path /Users/donghwan.shin/Repositories/MLCSHE/tests/test_update_archive_best_random.py
Testing started at 4:50 PM ...
Launching pytest with arguments /Users/donghwan.shin/Repositories/MLCSHE/tests/test_update_archive_best_random.py --no-header --no-summary -q in /Users/donghwan.shin/Repositories/MLCSHE/tests

============================= test session starts ==============================
collecting ... collected 1 item

test_update_archive_best_random.py::TestUpdateArchiveBestRandom::test_update_archive_best_random_testInput1 FAILED [100%]
test_update_archive_best_random.py:8 (TestUpdateArchiveBestRandom.test_update_archive_best_random_testInput1)
self = <tests.test_update_archive_best_random.TestUpdateArchiveBestRandom testMethod=test_update_archive_best_random_testInput1>

    def test_update_archive_best_random_testInput1(self):
        # make a solver instance
        solver = ICCEA(
            creator=problem.creator,
            toolbox=problem.toolbox,
            enumLimits=problem.enumLimits
        )

        # prepare test input
        scen1 = creator.Scenario([1, False, 5.0])
        scen1.fitness.values = (10.0,)
        mlco1 = creator.OutputMLC([[8, 'a'], [2, 'b']])
        mlco1.fitness.values = (10.0,)
        scen2 = creator.Scenario([4, True, -7.8])
        scen2.fitness.values = (8.5,)
        scen3 = creator.Scenario([-2, False, 4.87])
        scen3.fitness.values = (random.randint(-10, 8),)
        mlco2 = creator.OutputMLC([[1, 'a'], [21, 'd']])
        mlco2.fitness.values = (8.5,)
        mlco3 = creator.OutputMLC([[-2, 'e'], [10, 'f']])
        mlco3.fitness.values = (random.randint(-10, 8),)
        scen4 = creator.Scenario([2, True, 0.24])
        scen4.fitness.values = (random.randint(-10, 8),)
        mlco4 = creator.OutputMLC([[4, 'g'], [-1, 'h']])
        mlco4.fitness.values = (random.randint(-10, 8),)
        pScen = [scen1, scen2, scen3, scen4]
        pMLCO = [mlco1, mlco2, mlco3, mlco4]

        max_archive_size = 1

        output_archive_pScen_1 = solver.update_archive_best_random(
            pScen, max_archive_size
        )

        output_archive_pMLCO_1 = solver.update_archive_best_random(
            pMLCO, max_archive_size
        )

        self.assertEqual(output_archive_pScen_1, [scen1])
        self.assertEqual(len(output_archive_pScen_1), 1)

        self.assertEqual(output_archive_pMLCO_1, [mlco1])
        self.assertEqual(len(output_archive_pMLCO_1), 1)

        max_archive_size = 2

        output_archive_pScen_2 = solver.update_archive_best_random(
            pScen, max_archive_size
        )

        output_archive_pMLCO_2 = solver.update_archive_best_random(
            pMLCO, max_archive_size
        )

>       self.assertIn(output_archive_pScen_2[1], [scen2, scen3, scen4])
E       AssertionError: [1, False, 5.0] not found in [[4, True, -7.8], [-2, False, 4.87], [2, True, 0.24]]

test_update_archive_best_random.py:63: AssertionError

============================== 1 failed in 0.15s ===============================

Process finished with exit code 1

In this case, the problem is that the following code (from update_archive_best_random()) can return an archive_p that contains the same candidate twice (i.e., once drawn from population_copy_sorted and once from population_copy):

        archive_p = []

        for i in range(archive_size):
            if len(archive_p) > 0:
                candidate = population_copy.pop(
                    random.randint(0, len(population_copy)-1))

            else:
                # Select the best individual as the first candidate
                candidate = population_copy_sorted.pop(-1)

            archive_p.append(candidate)

        return archive_p
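
One possible way to avoid the duplicate (just a sketch using the variable names from the snippet above, not necessarily what the actual fix should look like) would be to take the best individual out of population_copy before drawing the random candidates:

        archive_p = []

        # Select the best individual as the first archive member.
        best = population_copy_sorted.pop(-1)
        archive_p.append(best)

        # Drop the same object from the random pool so it cannot be drawn again.
        population_copy = [ind for ind in population_copy if ind is not best]

        # Fill the remaining slots with distinct, randomly chosen individuals.
        while len(archive_p) < archive_size and len(population_copy) > 0:
            candidate = population_copy.pop(
                random.randint(0, len(population_copy) - 1))
            archive_p.append(candidate)

        return archive_p
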
SepShr commented 2 years ago

I have addressed the raised issues in the following commit: https://github.com/SepShr/MLCSHE/commit/ad25cb3174aff3a53a6914ef05fff641f6cdd937

Could you please let me know whether the changes fix the issue from your point of view, so that I can close it?

Thanks!

donghwan-shin commented 2 years ago

Looks good, I do not see the flakiness in the tests anymore :-)