Improve reliability of test_discard by resetting the pre-state

DanielSank / observed

Observer pattern in python

MIT License


Improve reliability of test_discard by resetting the pre-state #24

Closed sturmianseq closed 3 years ago

sturmianseq commented 3 years ago

This PR aims to improve the reliability of the test TestBasics.test_discard by resetting the pre-state.

The test can fail if self.buf is in a polluted state before running the test:

>       assert a.buf == ['abar']
E       AssertionError: assert ['abar', 'abar'] == ['abar']
E         Left contains one more item: 'abar'
E         Use -v to get the full diff

observed_test.py:291: AssertionError

DanielSank commented 3 years ago

How did you get into a situation where self.buf wasn't empty? The setup_class method should be setting it to [] before every test.

sturmianseq commented 3 years ago

Actually, setup_class runs only once before the test class (rather than before every test function). This test has two potential issues:

(1) test_discard can fail on the second run when run twice, e.g., with pip install pytest-repeat; pytest --count=2 observed_test.py::TestBasics::test_discard. So the test pollutes itself, and adding self.buf = [] can fix this issue.

(2) test_callbacks can fail when run after test_discard, e.g., with pip install pytest-random-order; pytest --random-order observed_test.py::TestBasics -k "discard or callbacks".

These issues can also be fixed by changing setup_class into setup_method, so that the setup runs before every test function. Let me know if you want me to update the PR with this fix. Thanks!
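For readers following along, here is a minimal sketch of the difference between the two hooks. The class names, the test body, and the run_twice helper are hypothetical and are not taken from observed_test.py; run_twice only imitates, in simplified form, the order in which pytest calls the hooks when one test is run twice.

```python
class SharedBufTests:
    """Hypothetical test class: the buffer is created once per class."""

    @classmethod
    def setup_class(cls):
        cls.buf = []  # shared by every test run in this class

    def test_append(self):
        self.buf.append("abar")
        assert self.buf == ["abar"]


class FreshBufTests:
    """Same test, but the buffer is rebuilt before each test."""

    def setup_method(self, method):
        self.buf = []  # per-test state, so earlier runs cannot leak in

    def test_append(self):
        self.buf.append("abar")
        assert self.buf == ["abar"]


def run_twice(test_cls):
    """Simplified imitation of pytest's hook order for two runs of one test."""
    if hasattr(test_cls, "setup_class"):
        test_cls.setup_class()  # called once for the whole class
    outcomes = []
    for _ in range(2):
        instance = test_cls()  # pytest uses a fresh instance per test
        if hasattr(instance, "setup_method"):
            instance.setup_method(instance.test_append)
        try:
            instance.test_append()
            outcomes.append("PASS")
        except AssertionError:
            outcomes.append("FAIL")
    return outcomes


shared_outcomes = run_twice(SharedBufTests)
fresh_outcomes = run_twice(FreshBufTests)
```

Under these assumptions, the class that relies on setup_class passes once and then fails, while the setup_method variant passes both times.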

DanielSank commented 3 years ago

Ah yes, please either use setup_method or just get rid of this whole idea of using self.buf and make it a local variable in each test instead. The only reason I used self. was to save some typing, which is kinda silly.
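The local-variable alternative could look roughly like the sketch below. The make_recorder helper and the test name are hypothetical stand-ins, not code from observed_test.py; the point is only that a buffer created inside the test cannot carry state over from an earlier run.

```python
def make_recorder(buf):
    """Hypothetical callback factory: the callback appends to the given list."""
    def callback(value):
        buf.append(value)
    return callback


def test_discard_with_local_buffer():
    buf = []  # created inside the test, so every run starts empty
    record = make_recorder(buf)
    record("abar")
    assert buf == ["abar"]
    return buf


# Running the test twice in a row succeeds because nothing is shared.
first_run = test_discard_with_local_buffer()
second_run = test_discard_with_local_buffer()
```

This trades a little repetition in each test for the guarantee that no setup hook is needed to keep tests independent.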

sturmianseq commented 3 years ago

Thanks for your comments! I have updated the PR. The two issues mentioned above are now fixed with setup_method.

DanielSank commented 3 years ago

Thanks. This looks fine but I'm still puzzled about what brought us here. I just cloned the repo and ran pytest observed_test.py and found all tests passing. Can you provide an example of the tests failing?

sturmianseq commented 3 years ago

To be honest, I am conducting a research project on software testing. More specifically, we ran the tests twice to see whether a test would fail on the second run. A non-idempotent test (namely PASS, FAIL when run twice) can indicate that there is state pollution in your test suite, which may further lead to test order dependence. Order-dependent tests are a main category of flaky tests, an emergent topic in research.

pip install pytest-repeat; pytest --count=2 observed_test.py::TestBasics::test_discard is the command to reproduce the failure that we have found.

pip install pytest-random-order; pytest --random-order observed_test.py::TestBasics -k "discard or callbacks" can fail when running multiple times, which indicates that test_callbacks and test_discard are order-dependent tests.
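As an illustration of what order dependence means here, the sketch below runs two tests in every order and compares the outcomes. The class, the test bodies, and the outcomes_by_order helper are hypothetical; they only model a shared buffer that one test leaves polluted, and they do not reproduce the actual tests in observed_test.py.

```python
from itertools import permutations


class PollutedTests:
    """Hypothetical stand-in for a test class that shares one buffer."""

    buf = []  # class-level state that no hook resets between tests

    def test_discard(self):
        self.buf.append("abar")
        assert self.buf == ["abar"]  # leaves "abar" behind afterwards

    def test_callbacks(self):
        self.buf.append("cb")
        assert self.buf == ["cb"]  # only holds if the buffer started empty
        self.buf.clear()  # this test happens to clean up after itself


def outcomes_by_order(test_cls, names):
    """Run the named tests in every order, each order from a clean state."""
    results = {}
    for order in permutations(names):
        test_cls.buf = []  # simulate a fresh process for this ordering
        outcome = []
        for name in order:
            try:
                getattr(test_cls(), name)()
                outcome.append("PASS")
            except AssertionError:
                outcome.append("FAIL")
        results[order] = outcome
    return results


results = outcomes_by_order(PollutedTests, ["test_discard", "test_callbacks"])
```

If any test's outcome changes between orderings, as test_callbacks does in this model, the tests are order-dependent.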

DanielSank commented 3 years ago

Very nice. Thanks for the explanation.