Add support for using twisted clock in tests

sveinse commented 4 years ago

When testing twisted based functions, it is often desirable to be able to test with deterministic and controllable passage of time. Twisted's task.Clock provides this. If the function under test is using callLater or similar methods to create a delay, time can be fast-forwarded when the reactor is idle.

I'd like to propose adding functionality to pytest-twisted that allows for using Clock to accelerate passage of time during testing. Coincidentally it will be comparable to pytest-trio's autojump_clock, see https://pytest-trio.readthedocs.io/en/latest/quickstart.html#trio-s-magic-autojump-clock

I am currently working on some concepts and will discuss them here.

sveinse commented 4 years ago

Option 1 - timetravel decorator

In pytest_twisted.py adding the following:

@decorator.decorator
def timetravel(fun, clock=None, *args, **kw):
    if not clock:
        raise AttributeError("Must provide clock to timetravel")
    async def runner(*args, **kw):
        d = defer.ensureDeferred(fun(*args, **kw))

        while 1:
            calls = clock.getDelayedCalls()
            if d.called:
                return await d
            if not calls:
                raise RuntimeError("twisted reactor idle")

            amount = calls[0].time - clock.seconds()
            clock.advance(amount)

    return runner(*args, **kw)

This is use in a test like this:

import pytest_twisted as pt
from twisted.internet import task

reactor = task.Clock()

@pt.ensureDeferred
@pt.timetravel(clock=reactor)
async def test_one():
    clock = reactor
    # Do something...
    await task.deferLater(clock, 1)
    # Do something else...

This will run the test using the clock reactor specified in @pt.timetravel(clock=reactor). It runs the test until the test is completed, progressing time automatically. It will raise an exception if the reactor becomes idle, preventing a lockup of the test execution.

The downside to this proposal is that the clock must be provided as an argument to the decorator. The same clock must be used by the function under test, so it must also be available inside the test function. The example above does this using global variable reactor.

sveinse commented 4 years ago

Option 2 - fixture decorator

The test itself, test_one, can be written as a regular twisted test, specifying fixtures as required. The clock fixture uses the decorator @pt.auto_clock(). This is an indicator that fixture provides a twisted clock, and that it shall execute the test with automatics time advancement.

@pt.auto_clock()
async def clock():
    return task.Clock()

@pt.ensureDeferred
async def test_one(clock):
    # Do something...
    await task.deferLater(clock, 1)
    # Do something else...

The pytest_twisted.py is too big to post here, but I've created a dummy PR for it: https://github.com/sveinse/pytest-twisted/pull/1

PRO

Lets the user name the fixture of own choosing
No global vars are involved
The full freedom of implementing the fixture

CON

Is the fixture decorator a little subtle for its effect on the test?
Currently the decorator cannot be combined with other pytest_twisted decorators

altendky commented 4 years ago

Still doesn't seem like a big deal after looking at it for a first time. It's good to have a couple options to think through though. I'll try to take another pass tomorrow, I'm cleaning up #91 now to hopefully clear out the queue and bring myself back up to speed on how this stuff works.

One question, can we keep an eye out for non-time related tasks being scheduled in the clock? Or maybe it does already? I'm thinking that (maybe...) it would be useful to at least warn people if their clock tests include non-deterministic activity. Actual network requests or such. Or, maybe I'm totally off-base here and this doesn't make sense. It is a bit late...

sveinse commented 4 years ago

Clock doesn't have methods for other reactor operations than time. E.g. it is not possible to do a clock.listenTCP() with it, so any IO isn't possible. (Technically Clock only implements IReactorTime, while an ordinary reactor implements all interfaces IReactorTime, IReactorTCP and so on.)

There are two conditions that will terminate a clock based test: Either the test function is run to completion, or the clock becomes idle. Idle will raise an exception to avoid the testing getting stuck.

If a user has some IO in its test, it will have to be using the "real" reactor which pytest_twisted is running and not clock. Thus clock is oblivious to it. Use cases like this will most likely end up with the clock reactor ending up in idle and then an exception.

altendky commented 4 years ago

In case you have thoughts on this already, are there use cases for not-automatic time progression? Or should the tests create events at the times they are curious about instead of manually ticking time to a point then checking. Would there be conflicts between auto-ticking and manual ticking beyond the normal 'if you set the time past all events they all happen at once and late'.

sveinse commented 4 years ago

Using auto progression of clock is a test design decision and it should be something the test opts in. There is nothing here preventing a user from having manual clock advancement. This is the methodology basically all twisted testing. For the proposed option 2 above, the only change would be to remove the @pt.auto_clock() decorator and the test can advance its clock manually.

Somewhat unrelated to the clock progression, but please note that the usage of await in tests can always stall a test if the awaited object never returns. In a traditional Deferred() type approach, the function run by the deferred might still not return, but it doesn't affect the progression of the test execution. Auto time-progression aids in the usage of awaits in testing by "handling" the progress of time (given that the awaited function will return eventually).

sveinse commented 4 years ago

To show why this functionality makes sense, I would like to tell the story of how this proposal came to be.

Consider the following blackbox function that we would like to test. It is a time-based function and we would like to test it deterministically as running it in real-life would. The helper Order is a class to record the order of the execution.

class Order:
    def __init__(self, clock):
        self.clock = clock
        self.startat = clock.seconds()
        self.d = []
    def mark(self, name, text=None):
        print(f"{self.clock.seconds()-self.startat:0.3f}: {name}")
        self.d.append(name)

async def subtask1(clock, o):
    o.mark('S1 IN')
    await task.deferLater(clock, 4)
    o.mark('S1 OUT')

async def subtask2(clock, o):
    o.mark('S2 IN')
    await task.deferLater(clock, 1)
    o.mark('S2 OUT')

async def blackbox(clock, o):
    o.mark('A')
    await task.deferLater(clock, 1)
    o.mark('B')
    d1 = defer.ensureDeferred(subtask1(clock, o))
    o.mark('C')
    await task.deferLater(clock, 2)
    o.mark('D')
    d2 = defer.ensureDeferred(subtask2(clock, o))
    o.mark('E')
    await d2
    o.mark('F')
    await d1
    o.mark('DONE')

The test run in real-life with a normal reactor takes 5 seconds to complete blackbox.

correct_order = ['A', 'B', 'S1 IN', 'C', 'D', 'S2 IN', 'E', 'S2 OUT', 'F', 'S1 OUT', 'DONE']

@pt.ensureDeferred
async def test_irl():
    clock = reactor
    o = Order(clock)

    await blackbox(reactor, o)
    assert o.d == correct_order

The first attempt is how to use clock to replace the reactor to test blackbox. The await makes the test halt, because it will be stuck on an internal await inside blackbox. Thus it is not possible to advance the clock manually with this setup. await is unusable.

@pt.ensureDeferred
async def test_stuck():
    clock = task.Clock()
    o = Order(clock)

    await blackbox(clock, o)
    assert o.d == correct_order

The second attempt at clock time progression. It is known that the blackbox takes 5 time units to run, so let the test start blackbox, advance time by 5 and check its status. -- It fails. By time 5 is has not completed and if the steps are inspected it will show that is has run A, B, S1 IN, C. The root cause is that blackbox has hidden chained awaits that depends on certain increase of time.

@pt.ensureDeferred
async def test_advance5():
    clock = task.Clock()
    o = Order(clock)

    d = defer.ensureDeferred(blackbox(clock, o))
    assert not d.called
    clock.advance(5)
    assert d.called
    assert o.d == correct_order

If a naive approach is taken to advancing the time, the time can be advanced as many times as the function under test needs. Empirically learned, it turns out that blackbox requires 3 advances to return. However, it turns out that it has executed the steps in a different order than in real-life and the test fails.

@pt.ensureDeferred
async def test_manyadvances():
    clock = task.Clock()
    o = Order(clock)

    d = defer.ensureDeferred(blackbox(clock, o))
    clock.advance(5)
    clock.advance(5)
    clock.advance(5)
    assert d.called
    assert o.d == correct_order

In order to execute blackbox in the proper order, the test must advance the time in the same order as the design of it requires. Hence, the test must have internal knowledge of blackbox, which is not a good test implementation. For blackbox, advancing time by 1, 2, 1, 1 will produce the correct result.

@pt.ensureDeferred
async def test_manual():
    clock = task.Clock()
    o = Order(clock)

    d = defer.ensureDeferred(blackbox(clock, o))
    clock.advance(1)
    clock.advance(2)
    clock.advance(1)
    clock.advance(1)
    assert d.called
    assert o.d == correct_order

Given a hidden test, how is it possible for the test to know what time advances it needs? One method could be to inspect the queue of the waiting calls in the reactor and advance time as needed. When blackbox is done, the result and expected time can be asserted. -- Incidentally this is the implementation of the auto_clock mechanism.

@pt.ensureDeferred
async def test_manual_auto():
    clock = task.Clock()
    o = Order(clock)

    d = defer.ensureDeferred(blackbox(clock, o))
    while 1:
        calls = clock.getDelayedCalls()
        if d.called:
            assert o.d == correct_order
            assert clock.seconds() == 5
            return

        clock.advance(calls[0].time - clock.seconds())

And this is where we arrive at the auto_clock feature. The following test is functionally equivalent of the above test. Note that it allows for the reintroduction of using await in the test.

@pt.auto_clock()
async def clock():
    return task.Clock()

@pt.ensureDeferred
async def test_auto_clock(clock):
    b = clock.seconds()
    o = Order(clock)

    await blackbox(clock, o)
    assert o.d == correct_order
    assert clock.seconds() == 5

sveinse commented 4 years ago

Await in testing

The usage of await will set certain conditions of the function being tested as await has the potential to lock up the test. All functions that are awaited for, must return deterministically, e.g. by the usage of timeouts. This is a inherent property of using await, not about the nature of the reactor or clock or the auto_clock feature. pytest_twisted might need some overall timeout mechanisms for ensuring that a test doesn't lock up indefinitely.

The auto_clock feature partially avoids this problem because if a time-based function because idle it will raise an exception. Idle in a time-based function is when it waits on something non-time based, such as an regular Deferred.

The usage of deferreds (instead of awaits) with manual advancement of time has a great advantage in that it is easy to set test boundaries of time. E.g. if a test hasn't returned by time 10, it can be classified as a failure.

In auto_test handling stuck function isn't as easy. Lets say a function is stuck in a while True: sleep() loop, the auto_clock will not catch it and iterate through infinity. Ergo all tests using halting await must have timeouts. And all tests using auto_clock must have timeouts built in for the same reason.

altendky commented 4 years ago

If you literally mean a sync sleep then sure. This is not unique to await or async even. If you mean an async sleep then shouldn't it be easy enough to have 'the framework' toss an event in part way and when the auto clock gets there 'the framework' can stop and result in an error or failure? This is admittedly not addressing any implementation details but it seems like just another piece to add. general timeout abilities would be a nice addition as well.

sveinse commented 4 years ago

I'm sorry, I didn't mean literal sleep, I meant while True: await task.deferLater(clock, 1). In testing the following code will address the issue, but it's a little bit less elegant than just calling await function_under_test().

    d = defer.ensureDeferred(function_under_test())
    d.addTimeout(timeout, reactor)
    await d

pytest_trio discusses this topic here https://pytest-trio.readthedocs.io/en/latest/reference.html#trio-mode. They have a method named with trio.move_on_after(timeout) which provides similar features.

altendky commented 4 years ago

I was imagining a parameter to the test decorator to timeout the whole thing but there could also be more granular timeouts inside the test. I think...

altendky commented 4 years ago

Let me start with some gut feelings about a normal usage interface. The test needs the clock inside itself and fixtures are the normal means of dependency injection in pytest. So far pytest-twisted only provides twisted_greenlet. I would think that for most cases a standard clock from Twisted could be provided by a new fixture in pytest-twisted. This also would mean that the clock to be used is already 'in our system'. I agree with your concern about simply adding a fixture causing such a drastic change to how your test is run as to run it in the clock. I can see a few options. Explicit clock decorators instead of existing @pytest_twisted.ensureDeferred etc, extra stackable decorators, an option added to the existing decorators.

So sure, I don't offhand know when someone would want another clock but it would be nice to leave the option open. The fixture half of it is simple. The developer is specifying the fixture they want to use for a test so if they specify their own they get their own. Anything pytest-twisted provides is just a handy 'default'. From the decorator side, whichever approach we might use, we get into the idea of marking the fixture somehow, just as you mentioned.

From the perspective of actually running the test... hmm, we've already got a couple ways and we're adding another. Perhaps it is time to take the present hard-coded runners and make them more modular and 'pluggable'? I was thinking about adding timeouts and one of the first things was to have a way to specify the timeout. At present the marks are just a strings. They could as well be anything so I was considering switching them to class instances and the classes could have whatever attributes were needed, such as the timeout. So rather than @pytest_twisted.ensureDeferred just adding an 'async_test' string it could add a class instance with the timeout. And on the present topic here... also a runner function? This moves the running logic out of the core code and makes it something that someone can provide a custom function for. Well, if we make the mark interface public. Then you could have implemented this all nicely without even talking to me. :] But sure, I still expect to try to see this proposal through to inclusion. Anyways, let's leave making this a public interface for later. We can implement a mechanism and see how it works out before making any such promises.

Alrighty, so I think that #103 plays into this a bit. The adding of timeouts for tests, though I guess I haven't thought about how much an async test needs a different timeout mechanism than a sync one. Well, existing async tests. A clock test would want a non-wall-time based timeout in at least some cases. Then a new activity to break out the means by which tests are actually run. I don't mean to make a big list of things you have to wait for, rather that I think I'm breaking out pieces that will make the implementation tidy. Hopefully... or, just mash it all in internally for now and break out later. It's not like the present runner code is complicated and given that it is not public we can change up that design any time we want. So yeah, probably go with that but with these thoughts of enhanced flexibility in mind.

I do apologize for not being more responsive. I appreciate your efforts and am trying to keep moving at least on pieces that I think lead towards getting this 'done'. Thanks. :]

sveinse commented 4 years ago

For timeouts there are two aspects to consider:

Any Deferred().addTimeout(timeout, clock) will work identically when using clock as when using a real reactor: When the time hits timeout, it will fire TimeoutError. In this respect, it is equal if a test is timing out on either clock or reactor. What you don't know up front is how long wall time the test will take until it hits the timeout.

The other aspect, is absolute test timeout. Any async tests might lock up, e.g. by await something that doesn't happen. In this case time advancement with clock will stall too and any future timeout installed on clock will never fire. A test should probably have support for setting a timeout which is added to the test function deferred running on the real reactor and not the clock.

edit Be advised that any Deferred().addTimeout() can be intercepted by the task running receiving the exception. So it is not an guaranteed way to ensure that the tests will progress.

altendky commented 4 years ago

How would an await block a timeout?

altendky commented 4 years ago

As in if pytest-twisted tosses it's own activity into the reactor/clock with a time to run, it seems it would happen unless there's a sync blocking activity.

sveinse commented 4 years ago

To repeat some of the things discussed in #python yesterday:

By locking up, I meant halting of the progress of the test and indirectly halting the progress of running the next tests. Async code will never technically lock -- as opposed to what a busy-loop or sync call might.

@pt.ensureDeferred()
async def test_lock():
    d = Deferred()
    await d

This will stall execution of the test, but it is recoverable. If using the ordinary reactor, the test execution will halt indefinitely unless a timeout is installed. If clock is used, the execution will be returned to the clock runner when await d is called. Since the runner is oblivious to the deferred, it attempts to advance time. If no timeout is installed, it will have nothing more to do and it will raise an exception. If a timeout is installed, it will advance time and fire the timeout.

@pt.ensureDeferred()
async def test_lock():
    while True:
        try:
            d = Deferred()
            await d
        except CancelledError:
            pass

In the inner working of pytest_twisted one must be careful where timeouts are placed. Lets say we have the above test function and somewhere in pytest_twisted the test and timeout is setup like this:

d = ensureDeferred(test_lock)
d.addTimeout(timeout, reactor)

This code will fail to timeout and/or cancel the test task, because the way errback are managed in Twisted. -- Anyways, I agree that timeout is an important task of testing and its coupled to the clock/reactor scheme being used, but I think perhaps this belongs in a separate issue?

pytest-dev / pytest-twisted