testing-cabal / testtools

testtools - tasteful testing for Python
https://testtools.readthedocs.io/en/latest/

Allow tests to be run more than once #165

Closed jml closed 8 years ago

jml commented 8 years ago

Our setUp and tearDown protection prevents a single test from being run more than once. This makes a RunTest-based solution to #1515933 (Support for retrying intermittently failing tests) rather difficult.

This patch makes it possible to re-run tests, failed or otherwise. AFAICT, it doesn't change any of our public APIs.
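
A minimal sketch of what this enables (Flaky is a hypothetical case; before this change, the second run() would trip the setUp/tearDown protection instead of re-running cleanly):

import unittest
from testtools import TestCase


class Flaky(TestCase):          # hypothetical intermittently-failing test
    def test_sometimes(self):
        pass


case = Flaky('test_sometimes')
result = unittest.TestResult()
case.run(result)
case.run(result)                # re-running the same case instance now works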


thomir commented 8 years ago

This looks OK to me, but I wonder if it will introduce problems in other, related projects? My instinct is that subunit will be fine, but I don't know about testr & friends?

rbtcollins commented 8 years ago

Review status: 0 of 3 files reviewed at latest revision, 3 unresolved discussions.


testtools/testcase.py, line 643 [r1] (raw file): Do we want this to work even in the case of tearDown failing? If so, it should be behind a finally. I'm fine as-is, just asking ;)


testtools/tests/test_runtest.py, line 86 [r1] (raw file): I'm a little concerned about this change. This test is testing _run_prepared_result, a private API. By changing this to check tearDown, it's making more assumptions about _run_teardown. I see why you need to change it; I suggest overriding _run_teardown in the test class instead, since that is the contract RunTest interacts with. It doesn't itself know about tearDown.
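
Something like this, perhaps (a sketch only; the exact _run_teardown signature is an assumption about the private hook, not documented API):

from testtools import TestCase


class _TeardownObserved(TestCase):
    # hypothetical test double: record that RunTest drove the teardown hook,
    # without assuming anything about tearDown itself
    teardown_ran = False

    def _run_teardown(self, result):
        self.__class__.teardown_ran = True
        return super(_TeardownObserved, self)._run_teardown(result)

    def test_nothing(self):
        pass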


testtools/tests/test_testcase.py, line 1269 [r1] (raw file): I think you need a scenario in here: tests should be able to be run twice even if the following things happen, no?

{stage} x {final state}

where stage in {setup, body, teardown, cleanups} and final state in {pass, fail, error, skip, xfail}

(Obviously user code could completely break things like e.g. an external DB or something, but our framework should handle these cases).
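
For concreteness, a sketch of that product with testscenarios (multiply_scenarios is the real API; the scenario contents here are just illustrative):

from testscenarios import multiply_scenarios

stages = [(s, {'stage': s})
          for s in ('setup', 'body', 'teardown', 'cleanups')]
states = [(s, {'state': s})
          for s in ('pass', 'fail', 'error', 'skip', 'xfail')]

# 4 x 5 = 20 scenarios, one per (stage, final state) combination
scenarios = multiply_scenarios(stages, states)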


Comments from the review on Reviewable.io

jml commented 8 years ago

Review status: 0 of 3 files reviewed at latest revision, 3 unresolved discussions.


_testtools/testcase.py, line 643 [r1] (raw file):_ Yes we do. This is in support of retrying intermittently failing tests. Such tests are just as likely to fail in teardown due to complicated fixtures as they are during the tests themselves.

I've fixed this. Writing the test for tearDown requires more care than the regular test.


_testtools/tests/test_runtest.py, line 86 [r1] (raw file):_ Good point! Fixed. (How would you feel about having explicit interfaces/abcs in testtools? I think it's worth giving these contracts names.)
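
For example, something as small as this would do (the name RunTestHooks and the exact method set are made up here, not an existing testtools interface):

from abc import ABCMeta, abstractmethod


class RunTestHooks(metaclass=ABCMeta):
    """Hypothetical ABC naming the hooks RunTest drives on a case."""

    @abstractmethod
    def _run_setup(self, result):
        """Prepare the case before the test body runs."""

    @abstractmethod
    def _run_teardown(self, result):
        """Undo whatever _run_setup did, reporting failures to result."""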


_testtools/tests/test_testcase.py, line 1269 [r1] (raw file):_ Do we already have something for that?

I've been thinking for a while that we do need something like this for a lot of purposes (perhaps powered by Hypothesis). It could even go hand-in-hand with some kind of explicit documentation of interfaces.

Anyway, if we already have something, I'm happy to make this change. If not, I'd rather do it in a follow-up patch, but maybe can be persuaded to cobble something together. (I actually wrote a couple of extra tests for different paths but none of them triggered failures so I dropped them).


_Comments from the review on Reviewable.io_

jml commented 8 years ago

Thanks for the reviews!

rbtcollins commented 8 years ago

Reviewed 3 of 3 files at r1. Review status: all files reviewed at latest revision, 3 unresolved discussions.


_testtools/tests/test_testcase.py, line 1269 [r1] (raw file):_ I think testscenarios is a good fit here, since it's a simple product of cases.


Comments from the review on Reviewable.io

jml commented 8 years ago

Review status: all files reviewed at latest revision, 3 unresolved discussions.


_testtools/tests/test_testcase.py, line 1269 [r1] (raw file):_ OK, that means we probably don't already have something, since we don't currently depend on testscenarios or use it in any way. Leaving this as future work.


Comments from the review on Reviewable.io

rbtcollins commented 8 years ago

Review status: all files reviewed at latest revision, 2 unresolved discussions.


_testtools/tests/test_runtest.py, line 86 [r1] (raw file):_ If they're lightweight I'm happy with it. Also PEP-484 annotations.
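
E.g. the same hypothetical contract with PEP-484 annotations, as a sketch (again, not the actual testtools API):

from unittest import TestResult


class RunTestHooks:
    def _run_setup(self, result: TestResult) -> None: ...
    def _run_teardown(self, result: TestResult) -> None: ...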


_testtools/tests/test_testcase.py, line 1269 [r1] (raw file):_ Well, we depend on fixtures for testing; I see no reason we shouldn't depend on testscenarios too.


Comments from the review on Reviewable.io

jml commented 8 years ago

Review status: all files reviewed at latest revision, 2 unresolved discussions.


_testtools/tests/test_runtest.py, line 86 [r1] (raw file):_ Great!


_testtools/tests/test_testcase.py, line 1269 [r1] (raw file):_ Happy w/ testscenarios—in a later patch.

My issue hasn't been about which technology we should use for doing this. It's that I don't want to predicate this patch on introducing a whole new class of testing strategy into the code-base, even though I think it'd be a great idea.


Comments from the review on Reviewable.io

rbtcollins commented 8 years ago

Review status: all files reviewed at latest revision, 2 unresolved discussions.


_testtools/tests/test_testcase.py, line 1269 [r1] (raw file):_ Up to you on how you do it. It's going to be a nasty for loop with manual diagnostics, or 5 lines of extra diff (which I can write up for you if you like).


Comments from the review on Reviewable.io

rbtcollins commented 8 years ago

Review status: all files reviewed at latest revision, 2 unresolved discussions.


_testtools/tests/test_testcase.py, line 1269 [r1] (raw file):_ https://github.com/testing-cabal/testtools/pull/166


Comments from the review on Reviewable.io

jml commented 8 years ago

Review status: all files reviewed at latest revision, 2 unresolved discussions.


_testtools/tests/test_testcase.py, line 1269 [r1] (raw file):_ Thanks. I'll go ahead and write those tests at some point in the future.

Why do you think, out of all the code we have that would benefit from scenarios (I think there's lots), that this patch needs to be blocked on it?


Comments from the review on Reviewable.io

rbtcollins commented 8 years ago

Review status: all files reviewed at latest revision, 2 unresolved discussions.


_testtools/tests/test_testcase.py, line 1269 [r1] (raw file):_ I don't think it has to be blocked on testscenarios. I think it has to be blocked on appropriate coverage for your declared goal, or we'll be fighting little brushfires as folk find holes in the next few releases.

I think doing that coverage via scenarios is less total effort than doing it by hand, with clearer results, so I was suggesting you take what I perceive to be the path of least resistance in achieving your declared goal.


Comments from the review on Reviewable.io

jml commented 8 years ago

Review status: all files reviewed at latest revision, 2 unresolved discussions.


_testtools/tests/test_testcase.py, line 1269 [r1] (raw file):_ Ah OK. We disagree about appropriate coverage.


Comments from the review on Reviewable.io

rbtcollins commented 8 years ago

Review status: all files reviewed at latest revision, 2 unresolved discussions.


_testtools/tests/test_testcase.py, line 1269 [r1] (raw file):_ Ah. So here's my reasoning. I found a genuine bug by code inspection, thus I know the tests you have were insufficient. I'm not confident there are no other bugs lurking...


Comments from the review on Reviewable.io

jml commented 8 years ago

Review status: 0 of 3 files reviewed at latest revision, 2 unresolved discussions, some commit checks failed.


_testtools/tests/test_testcase.py, line 1269 [r1] (raw file):_ Oops, forgot to push.

Before I even submitted this patch, I actually wrote two more tests almost exactly like test_runTwice, including one written specifically for the case of erroring in tearDown. Neither failed, so I deleted them.

A naive parametrized version of test_runTwice would not have found that bug.

Anyway, I'll write the tests.

These tests might drive a design change though. getDetails() accumulates over the course of many runs, and for truly repeatable runs it has to be cleared out. For compatibility with other tools, it would be good if that was done as part of the run() method.
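
A small sketch of the accumulation I mean (DetailedCase is hypothetical; with re-running enabled but no clearing, details from the first run linger):

import unittest
from testtools import TestCase
from testtools.content import text_content


class DetailedCase(TestCase):
    attempts = 0

    def test_it(self):
        self.__class__.attempts += 1
        self.addDetail('attempt-%d' % self.attempts,
                       text_content('log output for this attempt'))


case = DetailedCase('test_it')
result = unittest.TestResult()
case.run(result)
case.run(result)
# without clearing between runs, 'attempt-1' is still present here
print(sorted(case.getDetails()))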


Comments from the review on Reviewable.io

rbtcollins commented 8 years ago

Review status: 0 of 3 files reviewed at latest revision, 3 unresolved discussions, some commit checks failed.


testtools/testcase.py, line 644 [r2] (raw file):

from testtools import TestCase


class BreakSetup(TestCase):
    # Simulates a test that stops upcalling once some global state is borked:
    # on the first run it never upcalls tearDown; on the second run it never
    # upcalls setUp.
    called = False

    def setUp(self):
        if self.called:
            return
        super(BreakSetup, self).setUp()

    def test_nil(self):
        pass

    def tearDown(self):
        if self.called:
            super(BreakSetup, self).tearDown()
        self.__class__.called = True

This simulates a test that fails to upcall in setUp if some global state is borked, and fails to call tearDown at all. I believe that, if run twice with your patch, the first time it will trigger the ValueError, and the second time, when it does not upcall in setUp, it won't trigger an error even though it's now buggy. I've seen this sort of thing dealing with e.g. locked databases, so I think it's important to be correct here. I suspect it's in the pathological space where Hypothesis would be a better fit than entire-space checking, but this sort of stateful interaction is probably something we need coverage on too.

tl;dr: the finally needs to guard only the __setup_called and __teardown_called resetting. Or move it to the start of run() or something.
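
For illustration, the start-of-run() variant might look roughly like this inside the case class (a sketch only, not the code in this patch):

import unittest


class TestCase(unittest.TestCase):

    def run(self, result=None):
        # reset the run-once guards up front, so a previous run that never
        # reached tearDown cannot wedge the next run
        self.__setup_called = False
        self.__teardown_called = False
        return super(TestCase, self).run(result)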


Comments from the review on Reviewable.io

jml commented 8 years ago

Review status: 0 of 4 files reviewed at latest revision, 3 unresolved discussions.


testtools/testcase.py, line 644 [r2] (raw file):


_testtools/tests/test_testcase.py, line 1269 [r1] (raw file):_ getDetails() clearing now happens in the reset.

Parametrized details added.


Comments from the review on Reviewable.io

jml commented 8 years ago

Build failure is unrelated to this branch and is fixed by #170. Please take a look at this branch.


Review status: 0 of 4 files reviewed at latest revision, 3 unresolved discussions, some commit checks broke.


Comments from the review on Reviewable.io

rbtcollins commented 8 years ago

I think this is a big improvement. I would really love to see a matrix on the deterministic tests - something as simple as:

from testtools import TestCase
from testscenarios import multiply_scenarios

stages = [
    ('setup', {'stage': 'setup'}),
    ('body', {'stage': 'body'}),
    ('teardown', {'stage': 'teardown'}),
    ('cleanups', {'stage': 'cleanups'}),
]
states = [
    ('pass', {'state': 'pass'}),
    ('fail', {'state': 'fail'}),
    ('error', {'state': 'error'}),
    ('skip', {'state': 'skip'}),
    ('xfail', {'state': 'xfail'}),
]


class Exhaustive(TestCase):

    class Harness(TestCase):

        def __init__(self, testmethodName, stage, state):
            self.stage = stage
            self.state = state
            super(Exhaustive.Harness, self).__init__(testmethodName)

        def outcome(self):
            # Produce the desired final state from whichever stage calls this.
            if self.state == 'pass':
                return
            if self.state == 'fail':
                self.fail('failed')
            if self.state == 'error':
                1 / 0
            if self.state == 'skip':
                self.skipTest('skipped')
            if self.state == 'xfail':
                # the predicate must fail for this to be an expected failure
                self.expectFailure("expected", self.assertEqual, 1, 0)

        def test_case(self):
            if self.stage != 'body':
                return
            self.outcome()

        def setUp(self):
            super(Exhaustive.Harness, self).setUp()
            if self.stage == 'cleanups':
                self.addCleanup(self.outcome)
            if self.stage != 'setup':
                return
            self.outcome()

        def tearDown(self):
            super(Exhaustive.Harness, self).tearDown()
            if self.stage != 'teardown':
                return
            self.outcome()

    scenarios = multiply_scenarios(stages, states)

    def test_scenarios(self):
        case = self.Harness('test_case', self.stage, self.state)
        ...  # run it and cross check events
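
Expanding those scenarios into a concrete suite is then just (sketch):

import unittest
from testscenarios import generate_scenarios

loader = unittest.TestLoader()
suite = unittest.TestSuite(
    generate_scenarios(loader.loadTestsFromTestCase(Exhaustive)))
unittest.TextTestRunner().run(suite)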

Review status: 0 of 4 files reviewed at latest revision, 5 unresolved discussions, some commit checks broke.


testtools/testcase.py, line 205 [r3] (raw file): These three things (2 above, one below) should go in _reset too, I think. The id_gens one is already there, redundantly.


testtools/tests/samplecases.py, line 45 [r3] (raw file): "fails to call ``tearDown`` when the global state breaks but calls it after that."


Comments from the review on Reviewable.io

jml commented 8 years ago

Ah, OK, you want the full monty.

I've used your code as a starting point (thanks so much!) and have gone for a more functional approach.

Also (unless I'm misreading), the approach outlined here only ever has stuff going on in one stage per test. The patch I'm about to push (I think) does all combinations of all stages. This more than doubles the number of tests in our suite!


Review status: 0 of 4 files reviewed at latest revision, 5 unresolved discussions, some commit checks broke.


testtools/testcase.py, line 205 [r3] (raw file): Moved _unique_id_gen and _traceback_id_gens. It's definitely cleaner. FWIW I held off before because I couldn't find a test to drive the change.

Moving self._cleanups = [] to _reset causes TestAddCleanup to fail. Currently we allow users to call addCleanup on a constructed TestCase before running and for those cleanups to be applied in the run. Moving it to _reset causes them to be cleared before the run.

I'm mildly averse to this change, because it breaks compatibility.
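
For reference, the usage that works today (SomeCase and release_resource are just placeholders):

import unittest
from testtools import TestCase


class SomeCase(TestCase):
    def test_it(self):
        pass


def release_resource():
    pass


case = SomeCase('test_it')
case.addCleanup(release_resource)   # cleanup registered before run()
result = unittest.TestResult()
case.run(result)                    # the cleanup runs as part of this run;
                                    # clearing _cleanups in _reset would drop it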


_testtools/tests/samplecases.py, line 45 [r3] (raw file):_ Done.


Comments from the review on Reviewable.io

rbtcollins commented 8 years ago

Review status: 0 of 4 files reviewed at latest revision, 5 unresolved discussions.


testtools/testcase.py, line 204 [r4] (raw file): Ok so - I see your argument that we used to allow injecting cleanups before run()... but we never intended to support that, did we? It's clearly a skew from the contract of 'run runs the test', and tests are hermetic (modulo setUpClass (OMG) and setUpModule (crying into my wine)). I think making this more hermetic is strictly good, and no less a break than our requiring that setUp be called.

e.g. our tests were using a naughty shortcut, naughty tests.


testtools/tests/samplecases.py, line 50 [r4] (raw file): All of the stages $what?


testtools/tests/samplecases.py, line 84 [r4] (raw file): Perhaps this should just be _do_nothing = _success ?


testtools/tests/samplecases.py, line 104 [r4] (raw file): Or alternatively change this to use _do_nothing - either way this is missing ", case" near the right-hand end of the line... (and if tests are passing, there is something wrong :)


Comments from the review on Reviewable.io

jml commented 8 years ago

Review status: 0 of 4 files reviewed at latest revision, 5 unresolved discussions.


_testtools/testcase.py, line 204 [r4] (raw file):_ It's not just our tests. Nevertheless, changed. Significant updates to TestAddCleanup and TestPatchSupport.


_testtools/tests/samplecases.py, line 50 [r4] (raw file):_ I've updated the docstring, and added one to the constructor as well.


testtools/tests/samplecases.py, line 84 [r4] (raw file): Changed to _success = _do_nothing.


testtools/tests/samplecases.py, line 104 [r4] (raw file): Kept _success because I think that more clearly communicates intent.

Added case parameter—good catch.

The tests were passing, but that's not necessarily an indication that something was wrong. A test that makes an invalid call to expectFailure is treated as erroring. By getting _unexpected_success wrong, we were effectively exercising the "error" case twice.


Comments from the review on Reviewable.io

jml commented 8 years ago

AFAICT, this only needs a rubber stamp to confirm that the last round of changes satisfy the last round of review.

rbtcollins commented 8 years ago

Two really minor nits; I think it's good to go, with or without addressing them.