
with statements are not ensuring that __exit__ is called if __enter__ succeeds #74174

Closed njsmith closed 2 years ago

njsmith commented 7 years ago
BPO 29988
Nosy @rhettinger, @gpshead, @ncoghlan, @nirs, @njsmith, @markshannon, @embray, @serhiy-storchaka, @jdemeyer, @1st1, @ryanhiebert, @xgid, @jack1142
PRs
  • python/cpython#1799
  • python/cpython#18334
Dependencies
  • bpo-31344: f_trace_opcodes frame attribute to switch to per-opcode tracing
Note: these values reflect the state of the issue at the time it was migrated and might not reflect the current state.


    njsmith commented 7 years ago

    You might hope the interpreter would enforce the invariant that for 'with' and 'async with' blocks, either `__(a)enter__` and `__(a)exit__` are both called, or else neither of them is called. But it turns out that this is not true once KeyboardInterrupt gets involved – even if we write our (async) context manager in C, or otherwise guarantee that the actual `__(a)enter__` and `__(a)exit__` methods are immune to KeyboardInterrupt.

    The invariant is *almost* preserved for 'with' blocks: there's one instruction (SETUP_WITH) that atomically (wrt signals) calls `__enter__` and then enters the implicit 'try' block, and there's one instruction (WITH_CLEANUP_START) that atomically enters the implicit 'finally' block and then calls `__exit__`. But there's a gap between exiting the 'try' block and WITH_CLEANUP_START where a signal can arrive and cause us to exit without running the 'finally' block at all. In this disassembly, the POP_BLOCK at offset 7 is the end of the 'try' block; if a KeyboardInterrupt is raised between POP_BLOCK and WITH_CLEANUP_START, then it will propagate out without `__exit__` being run:

    In [2]: def f():
       ...:     with a:
       ...:         pass
       ...:

    In [3]: dis.dis(f)
      2           0 LOAD_GLOBAL              0 (a)
                  3 SETUP_WITH               5 (to 11)
                  6 POP_TOP

      3           7 POP_BLOCK
                  8 LOAD_CONST               0 (None)
            >>   11 WITH_CLEANUP_START
                 12 WITH_CLEANUP_FINISH
                 13 END_FINALLY
                 14 LOAD_CONST               0 (None)
                 17 RETURN_VALUE

    For async context managers, the race condition is substantially worse, because the 'await' dance is inlined into the bytecode:

    In [4]: async def f():
       ...:     async with a:
       ...:         pass
       ...:

    In [5]: dis.dis(f)
      2           0 LOAD_GLOBAL              0 (a)
                  3 BEFORE_ASYNC_WITH
                  4 GET_AWAITABLE
                  5 LOAD_CONST               0 (None)
                  8 YIELD_FROM
                  9 SETUP_ASYNC_WITH         5 (to 17)
                 12 POP_TOP

      3          13 POP_BLOCK
                 14 LOAD_CONST               0 (None)
            >>   17 WITH_CLEANUP_START
                 18 GET_AWAITABLE
                 19 LOAD_CONST               0 (None)
                 22 YIELD_FROM
                 23 WITH_CLEANUP_FINISH
                 24 END_FINALLY
                 25 LOAD_CONST               0 (None)
                 28 RETURN_VALUE

    Really the sequence from 3 BEFORE_ASYNC_WITH to 9 SETUP_ASYNC_WITH should be atomic wrt signal delivery, and from 13 POP_BLOCK to 22 YIELD_FROM likewise.

    This probably isn't the highest priority bug in practice, but I feel like it'd be nice if this kind of basic language invariant could be 100% guaranteed, not just 99% guaranteed :-). And the 'async with' race condition is plausible to hit in practice, because if I have an '__aenter__' that's otherwise protected from KeyboardInterrupt, then it can run for some time, and any control-C during that time will get noticed just before the WITH_CLEANUP_START, so e.g. 'async with lock: ...' might complete while still holding the lock.

    The traditional solution would be to define single "super-instructions" that do all of the work we want to be atomic. This would be pretty tricky here though, because WITH_CLEANUP_START is a jump target (so naively we'd need to jump into the "middle" of a hypothetical new super-instruction), and because the implementation of YIELD_FROM kind of assumes that it's a standalone instruction exposed directly in the bytecode. Probably there is some solution to these issues but some cleverness would be required.

    An alternative approach would be to keep the current bytecode, but somehow mark certain stretches of bytecode as bad places to run signal handlers. The eval loop's "check for signal handlers" code is run rarely, so we could afford to do relatively expensive things like check a lookaside table that says "no signal handlers when 13 < f_lasti <= 22". Or we could steal a bit in the opcode encoding or something.
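    A minimal Python sketch of what that lookaside check might look like (co_nosignal_ranges is a hypothetical table the compiler would have to emit; f_code and f_lasti are real frame attributes, and the ranges use the corrected half-open form from a later comment):

    def ok_to_run_signal_handlers(frame):
        for start, end in getattr(frame.f_code, "co_nosignal_ranges", ()):
            if start <= frame.f_lasti < end:
                return False  # inside a protected stretch: defer the handler
        return True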

    ncoghlan commented 7 years ago

    My first thought would be to inject a wordcode instruction that's essentially an eval loop directive: "ATOMIC_UNTIL <offset>"

    ATOMIC_UNTIL would have at least the following properties:

    It may also have the following defensive coding property:

    The last bit would likely have undesirable performance implications, so it would probably be reasonable to only enable the check for debug builds, rather than always enforcing it.
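    A toy, runnable model of the basic idea (purely illustrative; nothing here is CPython's actual eval loop, and opcodes are spelled as strings):

    def run(program, pending):
        # program: list of (opcode, arg) pairs; pending: pending-call queue
        defer_until = None
        pc = 0
        while pc < len(program):
            # top-of-loop pending-call check, as in ceval.c
            if defer_until is not None and pc >= defer_until:
                defer_until = None  # target offset reached: protection ends
            if defer_until is None and pending:
                pending.pop(0)()  # e.g. raise KeyboardInterrupt
            op, arg = program[pc]
            if op == "ATOMIC_UNTIL":
                defer_until = arg  # defer pending calls until offset arg
            elif op == "SIGNAL":  # simulate a signal arriving right here
                pending.append(lambda: print("KeyboardInterrupt!"))
            elif op == "PRINT":
                print(arg)
            pc += 1

    # The "signal" arriving inside the protected region is only processed
    # at offset 4, after the simulated __exit__ has already run:
    run([("ATOMIC_UNTIL", 4), ("SIGNAL", None), ("PRINT", "__enter__/body"),
         ("PRINT", "__exit__"), ("PRINT", "after")], [])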

    4d51690f-22f4-4fda-a9f5-b3c6ff59552a commented 7 years ago

    This is either a "won't fix" or an "impossible to fix" depending on your point of view.

    PATIENT: It hurts whenever I press ctrl-C
    DOCTOR: Then don't press ctrl-C

    The problem is that ctrl-C can provoke an interrupt at any point in the program, and thus break presumed invariants. try-finally has the same problem.

    From the observable side effects it is indistinguishable whether an interrupt occurs after the last instruction in an `__enter__` function or after the first (side-effect-less) instruction after the `__enter__`. Likewise the last pre-`__exit__` and first in-`__exit__` instructions are effectively the same from the point of view of observable side-effects, but are different from the point of view of the handling of exceptions.

    njsmith commented 7 years ago

    Right, fixing this bug alone can't make programs control-C safe in general. But there exist techniques to make `__enter__`/`__exit__` methods safe WRT control-C, and if we fix this bug *and* apply those techniques then you can get some meaningful guarantees.

    For example, threading.Lock's `__enter__` and `__exit__` methods are written in C, so they are guaranteed to happen atomically WRT control-C. Right now, it's possible for a 'with lock: ...' to raise KeyboardInterrupt with the lock still held, which is surprising and not what we would like. If this bug were fixed, then it would be guaranteed that the lock was always released.

    (And I originally discovered this while implementing control-C handling in trio, which is pure-Python but also has a way to protect `__(a)enter__`/`__(a)exit__` methods against control-C. ...though that won't be 100% reliable until bpo-12857 or similar is fixed. See https://vorpus.org/blog/control-c-handling-in-python-and-trio/ for exhaustive details.)

    markshannon commented 7 years ago

    If all you need is that

    with foo:
       pass

    guarantees that either both or neither of `__enter__` and `__exit__` are called, for C context managers, and only C context managers, then the fix is trivial.

    To protect Python code would need a custom context manager wrapper

    with ProtectsAgainstInterrupt(user_ctx_mngr()):
        do_stuff()

    ProtectsAgainstInterrupt would need to be implemented in C and install a custom signal handler.
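    For illustration, a rough pure-Python sketch of what such a wrapper could do (hypothetical code, not the proposed C implementation; signal.signal only works from the main thread, and this version is coarser than necessary in that it defers Ctrl-C for the whole block, not just __enter__/__exit__):

    import signal

    class ProtectsAgainstInterrupt:
        def __init__(self, cm):
            self._cm = cm
            self._pending = False

        def _handler(self, signum, frame):
            self._pending = True  # remember the Ctrl-C, deliver it after cleanup

        def __enter__(self):
            self._prev = signal.signal(signal.SIGINT, self._handler)
            return self._cm.__enter__()

        def __exit__(self, *exc):
            try:
                return self._cm.__exit__(*exc)
            finally:
                signal.signal(signal.SIGINT, self._prev)
                if self._pending:
                    raise KeyboardInterrupt

    As a pure-Python sketch it still has the races discussed above (e.g. a signal landing between installing the handler and calling the wrapped __enter__), which is exactly why a robust version would need C.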

    markshannon commented 7 years ago

    Nathaniel,

    Do you have any way to reliably test for this failure mode?

    njsmith commented 7 years ago

    > If all you need is that with foo: pass guarantees that either both or neither of `__enter__` and `__exit__` are called, for C context managers, and only C context managers, then the fix is trivial.

    It would be nice to have it for 'async with foo: pass' as well, which is a little less trivial because the 'await' dance is inlined into the bytecode (see the initial post in this bug), but basically yes.

    > Do you have any way to reliably test for this failure mode?

    Unfortunately no, I haven't implemented one. Let's see, though...

    The test I wrote for bpo-30039 demonstrates one way to trigger a signal on a precise bytecode, by writing some code in C that calls raise(SIGNALNUMBER), and then calling it immediately before the bytecode where we want the signal to be raised (simulating the situation where the signal happens to arrive while the C function is running -- note that raise() is convenient because unlike kill() it works cross-platform, even on Windows).

    This might be sufficient for testing the 'async with' version; it looks like an __await__ method or iterator implemented in C and calling raise() would deliver a signal at a point that should be protected but isn't.
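    Something along these lines, perhaps (a pure-Python approximation; the real test helper would be written in C, since in Python the signal handler runs between bytecodes rather than during the raise() call itself):

    import os, signal

    class SigintFromAwait:
        # hypothetical stand-in for a C __await__ that calls raise(SIGINT)
        def __await__(self):
            os.kill(os.getpid(), signal.SIGINT)  # POSIX equivalent of raise()
            return
            yield  # unreachable, but makes __await__ a generator-iterator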

    The tricky case is plain 'with'. We can write something like:

    with lock:
        raise_signal()

    and this gives bytecode like:

      1           0 LOAD_NAME                0 (lock)
                  2 SETUP_WITH              12 (to 16)
                  4 POP_TOP

      2           6 LOAD_NAME                1 (raise_signal)
                  8 CALL_FUNCTION            0
                 10 POP_TOP
                 12 POP_BLOCK
                 14 LOAD_CONST               0 (None)
            >>   16 WITH_CLEANUP_START

    So the problem is that offset 8 is where we can run arbitrary code, but the race condition is if a signal arrives between offsets 12 and 16.

    One possibility would be to set up a chain of Py_AddPendingCall handlers, something like:

    int chain1(void* _) {
        Py_AddPendingCall(chain2, 0);
        return 0;
    }
    int chain2(void* _) {
        Py_AddPendingCall(chain3, 0);
        return 0;
    }
    int chain3(void* _) {
        raise(SIGINT);
        return 0;
    }

    (or to reduce brittleness, maybe use the void* to hold an int controlling the length of the chain, which would make it easy to run tests with chains of length 1, 2, 3, ...)

    ...except consulting ceval.c I see that currently this won't work, because it looks like if you call Py_AddPendingCall from inside a pending call callback, then Py_MakePendingCalls will execute the newly added callback immediately after the first one returns. It could be made to work by having Py_MakePendingCalls do a first pass to check the current length of the queue, and then use that as the bound on how many calls it makes.
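    In Python terms, the proposed fix is just to bound the drain loop up front (a sketch of the idea, not the C implementation):

    import collections

    pending = collections.deque()

    def make_pending_calls():
        # Snapshot the queue length first, so callbacks that schedule new
        # pending calls (like the chain above) wait for the next pass.
        for _ in range(len(pending)):
            callback, arg = pending.popleft()
            callback(arg)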

    njsmith commented 7 years ago

    On further thought, I think the way I'd write a test for this is:

    (1) add a testing primitive that waits for N instructions and then injects a SIGINT. Probably this would require tweaking the definition of Py_MakePendingCalls like I described in my previous comment, and then some code like:

    int sigint_timebomb(void* count_ptr) {
        intptr_t count = (intptr_t) count_ptr;
        if (count == 0)
            raise(SIGINT);
        else
            Py_AddPendingCall(sigint_timebomb, (void*) (count - 1));
        return 0;
    }

    (2) write a test like:

    reached_crashpad = False
    lock = threading.Lock()
    i = 0
    while not reached_crashpad:
        i += 1
        try:
            sigint_timebomb(i)
            with lock:
                1 + 1
            # this loop keeps executing instructions until any still-armed
            # timebombs go off
            while True:
                reached_crashpad = True
        except KeyboardInterrupt:
            # key invariant: signal can't leave the lock held
            assert not lock.locked()
        else:
            assert False  # can't happen

    The idea here is that this sets off SIGINTs at every possible place throughout the execution of the 'with lock', while checking that the lock is always released, and without needing to hard code any information at all about what bytecode the compiler generated.

    jdemeyer commented 7 years ago

    Nice analysis. I always assumed that with was safe from such race conditions, but it isn't.

    ncoghlan commented 7 years ago

    One testing possibility worth considering: perhaps it would make sense to have "start_try" and "start_finally" trace events, such that a trace function could be installed that injected signals at the worst possible time in the opcode evaluation?

    I haven't fully thought through the possible implications of that approach, but it seems like it would be less fragile than counting code offsets.

    ncoghlan commented 7 years ago

    Sorry, I misread Nathaniel's suggestion - the exhaustive search of all possible offsets already avoids the need to hardcode any code generation dependent details in the tests. So the trace function approach would just avoid the need for that search.

    jdemeyer commented 7 years ago

    > Or we could steal a bit in the opcode encoding or something.

    That seems like a very reasonable and easy-to-implement solution. It would generalize this check: https://github.com/python/cpython/blob/e82cf8675bacd7a03de508ed11865fc2701dcef5/Python/ceval.c#L1067-L1071

    (@Nathaniel Smith: nice blog post!)

    embray commented 7 years ago

    > > Or we could steal a bit in the opcode encoding or something.
    >
    > That seems like a very reasonable and easy-to-implement solution. It would generalize this check: https://github.com/python/cpython/blob/e82cf8675bacd7a03de508ed11865fc2701dcef5/Python/ceval.c#L1067-L1071

    Interesting; I didn't notice that bit before. It seems like that does at least try to guarantee that a signal can't interrupt between:

    lock.acquire()
    try:
        ...

    which previously I assumed we couldn't make any guarantees about. But CPython at least does, or tries to. It would be good to generalize that.

    jdemeyer commented 7 years ago

    > It seems like that does at least try to guarantee that a signal can't interrupt between:
    >
    >     lock.acquire()
    >     try:
    >         ...

    Actually, I think it's between the end of the try and the beginning of the finally (which is precisely the same place that breaks for a with statement).

    embray commented 7 years ago

    Actually I just finished reading njs's blog post, and as he points out that special case for SETUP_FINALLY is basically broken. There are other cases where it doesn't really work either. For example if you have:

    if ...:
        do_something()
    else:
        do_something_else()
    try:
        ...
    finally:
        ...

    then (ignoring the issue about POP_TOP for a moment) the last instruction in *one* of these branches is followed immediately by SETUP_FINALLY, while in the other branch there's a JUMP_FORWARD, then the SETUP_FINALLY.

    All the more reason for a more generic solution to this that doesn't depend strictly on the next op locally in the byte code.

    jdemeyer commented 7 years ago

    > Actually, I think it's between the end of the try and the beginning of the finally (which is precisely the same place that breaks for a with statement).

    Sorry, this is wrong and Erik was right. But the special case doesn't work anyway, so it doesn't really matter.

    ncoghlan commented 7 years ago

    Greg Smith & I are looking at this at the core dev sprint, and we think some variant of the "atomic until" idea should work, but there's a prerequisite change to the way "async with" works: the GET_AWAITABLE opcodes need to be avoided in this case, as they call `__await__`, and hence may run arbitrary Python code.

    We can't see any immediate barriers to moving those calls up into BEFORE_ASYNC_WITH, such that what ends up on the eval loop's stack is the already resolved iterable for use by YIELD_FROM, rather than combining GET_AWAITABLE+YIELD_FROM the way a normal await expression does.

    That would then give the preamble:

    BEFORE_ASYNC_WITH (resolves __aenter__ and __aexit__ to iterables)
    LOAD_CONST 0
    YIELD_FROM (need to skip signal processing here)
    SETUP_ASYNC_WITH

    And the postamble:

    POP_BLOCK (need to skip signal processing until YIELD_FROM)
    LOAD_CONST 0
    WITH_CLEANUP_START
    LOAD_CONST 0
    YIELD_FROM
    WITH_CLEANUP_FINISH

    We also agree that adding some kind of test injection hook (potentially limited to pydebug builds, depending on exactly how it works) is likely to be a necessary to be able to test this.

    gpshead commented 7 years ago

    Yury's in the room here and pointed out how Nick and I are wrong. :) [yay sprints!]

    async def __aexit__(self, *e):
        spamity_spam
        return Awaitable

    If we resolve the GET_AWAITABLE at BEFORE_ASYNC_WITH time, the spamity_spam within `__aexit__` is executed before the with block instead of after as happens today. :/

    See also https://www.python.org/dev/peps/pep-0492/#asynchronous-context-managers-and-async-with

    Nick is actively working on making a test.

    ncoghlan commented 7 years ago

    https://github.com/ncoghlan/cpython/pull/2/files provides a test case that reliably reproduces the problem for both synchronous and asynchronous context managers.

    It's inspired by Nathaniel's proposal above, but relies on a modified version of sys.settrace that runs the trace function after every opcode, not just every time the line number changes or we jump backwards in the bytecode.

    bpo-31344 is a separate issue to add that underlying capability so we can write this test case.

    njsmith commented 7 years ago

    This would still provide value even if you have to do a GET_AWAITABLE in the protected region: the most common case is that `__aenter__` is a coroutine function, which means that its `__await__` is implemented in C and already protected against interrupts.

    Also, to make this fully useful we need some way to protect arbitrary callables against interrupts anyway (to handle the case where the signal arrives during the actual `__enter__` or `__exit__` body), so saying that we also need to be able to protect `__await__` isn't a big deal.

    ncoghlan commented 7 years ago

    Right, if you look at the comments in the draft test case, we realised there are three things we currently need to protect:

    1. POP_BLOCK -> WITH_CLEANUP_START (synchronous CM)
    2. POP_BLOCK -> GET_AWAITABLE (asynchronous CM)
    3. GET_AWAITABLE -> YIELD_FROM (asynchronous CM)

    Now that I have a test case, I'm going to start looking at defining a new DEFER_PENDING_UNTIL opcode that skips pending call processing (and hence signals) until that particular opcode offset is reached.

    gpshead commented 7 years ago

    I do wonder if this is going to raise the urgency of https://bugs.python.org/issue16565 (Increase Py_AddPendingCall array size) or other refactoring of how pending calls are tracked.

    njsmith commented 7 years ago

    In bpo-30703 Antoine fixed signal handling so it doesn't use Py_AddPendingCall anymore. At this point the only time the interpreter itself uses Py_AddPendingCall is when there's an error writing to the signal_fd, which should never happen in normal usage.

    If there are third-party libraries making heavy use of Py_AddPendingCall then it could be an issue, but any such libraries should already be making sure they never have more than one Py_AddPendingCall pending at a time, because you already have to assume that the processing might be arbitrarily delayed.

    ncoghlan commented 7 years ago

    https://github.com/ncoghlan/cpython/pull/2/files now adds a DEFER_PENDING_UNTIL opcode that relies on the jump offset calculation machinery to decide how long pending call processing should be deferred.

    Right now, the test case is also emulating the "skip pending call" logic based on a frame attribute, but I'm going to try to replace that with a suitable call to _testcapi._pending_threadfunc, which would avoid the need to expose those details on the frame, and ensure that we're testing the logic that actually matters.

    However, the good news is that:

    1. the patch is complete enough that I think it demonstrates that this approach is potentially viable
    2. because it only suspends pending call processing in the *current* frame, it's safe to cross boundaries that may call arbitrary Python code (this turns out to be important for the async with case)

    ncoghlan commented 7 years ago

    I've updated the draft PR to actually raise the exception from a pending call hook, rather than emulating it in a trace hook.

    This was a bit trickier than I first expected, as it required moving the trace hook itself into C, as otherwise the "return" bytecode at the end of the trace function would trigger the just registered pending call (and the pending call processing isn't deferred in the trace function's frame).

    Right now, the revised tests are failing though - I'm assuming that's a sign that the previous fix wasn't actually sufficient, and it only looked sufficient due to the way the errors were being emulated.

    ncoghlan commented 7 years ago

    The updated PR fully resolves the synchronous CM case by switching to using threading.Lock as the test CM (as per Nathaniel's original suggestion). Without that, the pending exception was being processed as soon as __exit__() started running, so the test failed due to the lack of signal safety in the test CM itself.

    For the asynchronous case, Greg and I aren't seeing any way to resolve the situation without somehow making the pending call processing event loop aware - otherwise even if the current frame postpones dealing with the pending call, it will still get processed as soon as YIELD_FROM returns control to the asyncio event loop. The one thing we think may work is to provide a way for asyncio (et al) to configure a thread such that all pending calls are routed to the thread's event loop for handling, rather than being processed directly by the eval loop.

    njsmith commented 7 years ago

    For purposes of writing a test case, can you install a custom Python-level signal handler, and make some assertion about where it runs? I.e., the calling frame should be inside the __aexit__ body, not anywhere earlier?

    ncoghlan commented 7 years ago

    It isn't writing the test case that's the problem - the same technique we're using for the synchronous CM works for the asynchronous CM as well.

    The problem is that unlike the synchronous CM, the DEFER_PENDING_UNTIL opcode isn't sufficient to make the async CM test case *pass*, as "async with" yields control back to the event loop at the YIELD_FROM points, and hence it's always going to be possible for signals to interrupt the transfer of control to and from `__aenter__` or `__aexit__`, even if those methods are themselves implemented in C.

    njsmith commented 7 years ago

    To make sure I understand correctly: your concern is that the event loop is not implemented in C. So if you have this patch + an async CM that's implemented in C + the async CM never *actually* yields to the event loop, then that will be signal safe... but that last restriction is pretty silly, because what's the point of an async CM that never yields to the event loop. Right?

    I see what you mean, but I think this patch is still useful even if it doesn't protect the event loop itself. To get the full benefit, you also need some way to protect your event loop from unexpected KeyboardInterrupt, but that's something Trio already solves and asyncio potentially could. And we knew from the start that to get the *most* benefit from this patch we would also need some way to protect arbitrary chunks of Python code from KeyboardInterrupt so you could protect the __(a)exit__ method itself; needing to protect the loop too isn't a huge stretch beyond that.

    (It's possible that enough of uvloop is implemented in C to benefit from this?)

    ncoghlan commented 7 years ago

    I think you're agreeing with me - we can make synchronous context managers meaningfully more signal safe (at least for CMs implemented in C, or precompiled with Cython), but for asynchronous context managers, the only meaningful defense available is to replace the default SIGINT handler with one that routes the signal to the event loop instead of trying to handle it in the interpreter's eval loop.

    The latter approach does mean it will be more difficult to interrupt a runaway non-cooperative coroutine, but that's going to be a necessary trade-off in order to allow the event loop to reliably manage a graceful shutdown in the face of KeyboardInterrupt.

    njsmith commented 7 years ago

    FWIW trio's strategy for handling this is to install a clever signal handler that routes the signal to the event loop IF the signal arrives while the event loop is running, or while particularly sensitive code like trio.Lock.__aexit__ is running. The rest of the time it raises KeyboardInterrupt directly. So we actually can have signal safety and deal with runaway coroutines at the same time, and this patch does provide a meaningful reduction in race conditions for trio [1]. In principle there's nothing stopping asyncio or other coroutine runners from implementing a similar strategy.

    [1] actually this is currently a lie, because this patch reveals another independent race condition: there's no way for my clever signal handler to tell that it's in a sensitive function like trio.Lock.__aexit__ if it runs on the very first bytecode of the function, before the function has had a chance to mark itself as sensitive. But *that's* fixable with something like bpo-12857.
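    A schematic version of that handler (hypothetical names, not trio's actual code):

    def make_clever_handler(in_protected_code, deliver_to_loop):
        # in_protected_code/deliver_to_loop stand in for trio's machinery
        def handler(signum, frame):
            if in_protected_code(frame):
                deliver_to_loop(KeyboardInterrupt())  # loop unwinds gracefully
            else:
                raise KeyboardInterrupt  # breaks out of runaway sync code
        return handler

    # installed via signal.signal(signal.SIGINT, make_clever_handler(...))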

    njsmith commented 7 years ago

    (tl;dr: this patch is more awesome than you realize, thanks for working on it :-))

    ncoghlan commented 7 years ago

    Attempting to clarify what Greg & I think the right answer will be for the async context management case: https://docs.python.org/3/library/asyncio-eventloop.html#unix-signals

    In practice, that would look something like:

    >>> loop = asyncio.get_event_loop()
    >>> def sigint_handler():
    ...     raise KeyboardInterrupt
    ... 
    >>> loop.add_signal_handler(signal.SIGINT, sigint_handler)
    >>> loop.run_forever()
    Traceback (most recent call last):
        ...
    KeyboardInterrupt

    That way, dealing gracefully with KeyboardInterrupt is wholly under the event loop's control, rather than the event loop having to fight with the eval loop as to how Ctrl-C should be handled.

    ncoghlan commented 7 years ago

    I've retitled this issue to specifically cover the synchronous signal-safe context management case and filed bpo-31387 to cover making it easy to switch asyncio over to cooperative SIGINT handling (rather than the default pre-emptive handling).

    ncoghlan commented 7 years ago

    I've also filed bpo-31388 to cover providing APIs to defer signals and pending calls when running in the main thread.

    njsmith commented 7 years ago

    Should I open a new issue for the __aexit__ part of this issue then, or...? The reason neither asyncio nor trio use the approach you describe is that it makes it impossible to break out of infinite loops, and trio in particular has entirely solved this problem modulo interpreter limitations like this one.

    ncoghlan commented 7 years ago

    A new issue (depending on this one and potentially on bpo-31388) would be helpful, especially if you were able to use the testing trace hook from this PR to reproduce the problem.

    The reason I've taken the async with change out of this issue is because it's untestable given the current state of asyncio - the pending call fires as soon as the YIELD_FROM opcode is reached and control returns to the asyncio event loop. If there's a simple(ish) coroutine driver we can add to manage KeyboardInterrupt adequately for testing purposes, I'd be prepared to do that, but I'd still like to consider the async case separately from the synchronous one.

    ncoghlan commented 7 years ago

    Nathaniel, actually, I think bpo-31387 is the right one to comment on, as still being able to interrupt long-running loops in synchronous code is a quality of implementation concern for that RFE.

    njsmith commented 7 years ago

    On another topic: you should add a test where the 'with' block exits by raising an exception, because in this case the interpreter jumps straight to WITH_CLEANUP_START, potentially skipping the DEFER_PENDING_UNTIL entirely.

    ncoghlan commented 7 years ago

    In addition to adding a test case to ensure exceptions were handled correctly, I also added test cases for return, break, and continue, and all four new tests initially failed.

    Adding a special case for WITH_CLEANUP_START resolved those, but the tests need an adjustment to account for the fact that some of the opcode offsets are unreachable in the test case that covers the early return.

    njsmith commented 7 years ago

    > a special case for WITH_CLEANUP_START

    I guess this works for the sync case, but it seems kinda fragile, and obviously won't work for the async case. Do you think it's worth revisiting this idea from the OP?: "The eval loop's "check for signal handlers" code is run rarely, so we could afford to do relatively expensive things like check a lookaside table that says "no signal handlers when 13 < f_lasti <= 22"."

    (Of course it should be 13 <= f_lasti < 22, apparently I was writing in a mirror or something.)

    BTW, I looked at your branch, and figured out the problem with your async test. Asyncio isn't a problem (YIELD_FROM doesn't yield to the event loop, only an actual 'yield' statement does that); the problem was just that your test case's `__aexit__` is written in Python, so your exceptions were delayed until the `__aexit__` call, and then they were raised inside `__aexit__` and the test couldn't detect that properly. I have a patch against 9b8891a7ca5 that makes it pass. (It switches the pass condition to "either tracking_cm.enter_without_exit is False OR AsyncTrackingCM.__aexit__ is in the exception's traceback".) But of course it still has the problem with jumping past DEFER_PENDING_UNTIL. Do you want me to post the patch here, or in the other issue?

    ncoghlan commented 7 years ago

    Yeah, I was already thinking the lookaside table might be a better idea (similar to co_lnotab replacing the old SET_LINENO opcodes), especially since that would also provide a quick way of introspecting code objects to see if they included any try/finally constructs. The special case was just a quick check to see if the problem was what I thought it was.

    That's also a good point regarding __aexit__ (I figured out that was the problem in the sync case, but hadn't realised it also applied to the async case), so I'll restore those tests and opcode updates.

    I'm also wondering how far we might be able to get by adjusting the pending signal processing such that we always execute at least one line from a function before checking for pending signals (that should be enough to protect the test's __aexit__, and it would also be enough to get inside a with statement that guarded a function body against signals).

    ncoghlan commented 7 years ago

    I updated the PR at https://github.com/ncoghlan/cpython/pull/2/files with another attempt at reinstating the asynchronous CM test case that also tweaks the eval loop to ensure that the first few opcodes in a function are always executed before pending calls are checked.

    This version highlights why I think asyncio really needs work before the async behaviour can be tested robustly in the standard library: it *hangs* somewhere in _asyncio.TaskStepMethWrapper once pending call handling is deferred long enough for the __aexit__ method to actually finish.

    njsmith commented 7 years ago

    Here's the patch I mentioned: https://github.com/njsmith/cpython/commit/62547dc5ea323a07c25c2047636a02241f518013

    It has a crude but effective technique for handling the hanging issue :-). Alternatively one could use the World's Stupidest Coroutine Runner:

    def run(coro):
        try:
            coro.send(None)
        except StopIteration:
            pass
        else:
            raise RuntimeError("yielded")

    > I'm also wondering how far we might be able to get by adjusting the pending signal processing such that we always execute at least one line from a function before checking for pending signals

    Huh, that's a neat idea. I think it's defeated by 'while True: pass', though :-(. (You have to write it on two lines, but it compiles down to a single instruction that jumps to itself.)
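    This is easy to check with dis (on the 3.6-era interpreter under discussion the loop body disassembles to a single JUMP_ABSOLUTE whose target is its own offset; the exact opcodes vary by version):

    import dis

    def spin():
        while True:
            pass

    dis.dis(spin)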

    ncoghlan commented 7 years ago

    After chatting to Facebook's Carl Shapiro here at the core dev sprint, I'm going to start exploring a hopefully more robust white-listing based approach where we check for pending calls in the following cases:

    For now, I'm going to *skip* checking for additional pending calls when exiting a frame via exception, based on the theory that if we're already handling an exception, it's OK to wait until something catches it before checking for potential additional exception sources.

    ncoghlan commented 7 years ago

    I've pushed a variant that relies entirely on checking for instruction pointer changes, and it seems quite promising. However, a test_io failure introduced by this variant shows a new way for this version to *create* problems: because of the shift in the way signals are handled, the failing test switches from raising the exception as the last item inside the with statement to instead raising it as the first thing inside the `__exit__` method for a pure-Python context manager.

    ncoghlan commented 6 years ago

    With bpo-17611 merged (which moves stack unwinding to the compiler), I expect the exact details of this problem to have changed, but the general problem still exists: Ctrl-C may lead to `__exit__` (or `__aexit__`) not being called even after `__enter__` (or `__aenter__`) returns successfully, and this may happen even for context managers implemented with uninterruptible methods (e.g. in C in CPython without calling back into any Python code).

    serhiy-storchaka commented 6 years ago

    The issue with synchronous 'with' can be solved by bpo-32949.

    See also bpo-34066 for the problem with interruption before calling __enter__.

    ncoghlan commented 6 years ago

    While Greg Smith and I both cringed at the idea when I first raised it, I'm becoming more and more convinced that the only way we're going to be able to make resource cleanup reliable is for with statements to have the ability to disable signal handling while `__enter__` and `__exit__` methods are running.

    When a with statement switches signal handling off in a particular execution context, there'd then need to be some absolute timing deadline for switching them back on, so resource acquisition or cleanup that got stuck in an infinite loop could still be interrupted eventually.
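    Roughly, in pure-Python terms (a hypothetical sketch; a real version would live in the eval loop rather than relying on signal.signal, which is main-thread only):

    import signal, time

    class DeferredSigint:
        def __init__(self, max_seconds):
            self._deadline = time.monotonic() + max_seconds
            self._pending = False

        def _handler(self, signum, frame):
            if time.monotonic() >= self._deadline:
                raise KeyboardInterrupt  # deadline passed: interrupt anyway
            self._pending = True  # otherwise remember the signal for later

        def __enter__(self):
            self._prev = signal.signal(signal.SIGINT, self._handler)
            return self

        def __exit__(self, *exc):
            signal.signal(signal.SIGINT, self._prev)
            if self._pending:
                raise KeyboardInterrupt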

    If you combined that with the signal handling approach in https://github.com/ncoghlan/cpython/pull/2/files, then I think we'd have as solid a solution as CPython is likely to be able to provide.

    rhettinger commented 5 years ago

    Historically, in C, Python, and low-level interrupt programming it was the responsibility of a person writing a signal handler to make minimal variable edits and not disrupt anything else in the ecosystem.

    Are we changing that rule? (I'm still at the cringing stage and possibly haven't evolved my views yet).