laike9m / Cyberbrain

Python debugging, redefined.
http://bit.ly/cyberbrain-features
MIT License
2.51k stars 159 forks source link

Make loops work in Python 3.10 #139

Open laike9m opened 3 years ago

laike9m commented 3 years ago

Python 3.10 removes the jump by duplicating some instructions at the loop's start. This way, there's only 1 jump when the loop exits, instead of 2 (to head then out). And for every loop iteration, there's one less instruction to run, which is JUMP_ABSOLUTE

image

from cyberbrain import trace

@trace
def run_while():
    i = 0
    while i < 2:
        a = i
        i += 1

run_while()
laike9m commented 3 years ago

Since offset are no longer fixed, and can not represent loop start/end, as a prerequisite, we need to first switch to using lineno to represent loop start/end.

Steps:

At this point it seems we don't even need to pass offset to the frontend cause we only need lineno anyway. Removing offset deserve its own issue.

A big difference in using lineno and offset to index events is that lineno -> events are 1 to many mapping, while offset -> event is 1:1. visibleEvents should be lineno -> [events]

laike9m commented 2 years ago

It turns out that even with lineno, there's a problem with 3.10

In the following case, line 11 is supposed to be the end line of this loop. However, since the starting part is appended at the end of the bytecode of the loop, the loop ends at lineno 9.

>>> def test_while_loop(tracer, check_golden_file):
...
...
...
...
...     tracer.start()
...
...     i = 0
...     while i < 2:
...         a = i
...         i += 1
>>> from dis import dis
>>> dis(test_while_loop)
  6           0 LOAD_FAST                0 (tracer)
              2 LOAD_METHOD              0 (start)
              4 CALL_METHOD              0
              6 POP_TOP

  8           8 LOAD_CONST               1 (0)
             10 STORE_FAST               2 (i)

  9          12 LOAD_FAST                2 (i)
             14 LOAD_CONST               2 (2)
             16 COMPARE_OP               0 (<)
             18 POP_JUMP_IF_FALSE       22 (to 44)

 10     >>   20 LOAD_FAST                2 (i)
             22 STORE_FAST               3 (a)

 11          24 LOAD_FAST                2 (i)
             26 LOAD_CONST               3 (1)
             28 INPLACE_ADD
             30 STORE_FAST               2 (i)

  9          32 LOAD_FAST                2 (i)
             34 LOAD_CONST               2 (2)
             36 COMPARE_OP               0 (<)
             38 POP_JUMP_IF_TRUE        10 (to 20)
             40 LOAD_CONST               0 (None)
             42 RETURN_VALUE
        >>   44 LOAD_CONST               0 (None)
             46 RETURN_VALUE

The easiest solution is to create an increasing mapping from offset to lineno. In this case, the original self.offset_to_lineno is:

{0: 6, 8: 8, 12: 9, 20: 10, 24: 11, 32: 9, 40: 13, ...}

And self.increasing_offset_to_lineno would be

{0: 6, 8: 8, 12: 9, 20: 10, 24: 11, 32: 11, 40: 13, ...}

This is fixed in https://github.com/laike9m/Cyberbrain/commit/fad4dca044835768e91cecc8a1f60a8b36d08810

laike9m commented 2 years ago

Important: recaping the problem

TL;DR: In most case, loops in Cyberbrain still work for Python 3.10. This only issue is with while + walrus operator (see https://github.com/laike9m/Cyberbrain/issues/139#issuecomment-1003817059). Fixing the problem requires emitting JumpBackToLoopStart at the right time on Python side, which we're currently not sure how, see https://github.com/laike9m/Cyberbrain/issues/139#issuecomment-1003912564.

Details

For for loops, the old implementation still works, because bytecode doesn't change much

https://godbolt.org/z/crKTsf4b4

There is an issue with the lineno of return stmt in 3.10. But we can deal with this later (after upgrading to 3.10.1 and see if this is fixed)

For while loops, like the previous example, it also just works. Even though the bytecode for while i < 2 is put at the end of the loop, it does not emit any event. So the JumpBackToLoopStart event still follows the previous event, which is binding of i. There's no change to the event stream.

laike9m commented 2 years ago

There is a special case with walrus operator

from cyberbrain import trace

@trace
def run_while():
    i = 0
    while (i:= i+1) < 3:
        a = i

run_while()

https://godbolt.org/z/sx1d8595e

Now, the extra bytecode at the end emits a binding event for i, and the generated graphs are different:

3.9 (correct) image

3.10 (wrong) image

In the 3.10 graph, the second binding of i (1 -> 2) is not supposed to be on the first iteration.

The solution is to

  1. Move the production of JumpBackToLoopStart out of the value stack, to logger or jump detector
  2. Modify _return_jump_back_event_if_exists so that it checks lineno and not offset, so that the latter POP_JUMP_IF_TRUE instr doesn't emit a JumpBackToLoopStart (because it jumps from line 6 to line 7)
laike9m commented 2 years ago

Tried to use the decrease of lineno to determine whether there's a jump back. Turns out this is not reliable either. A counter example would be:

>>> import argparse
...
... def test_telephone_inner():
...     args = argparse.Namespace(
...         mutations=0.6, seed=2, text="The quick brown fox
... jumps over the lazy dog."
...     )
>>> from dis import dis
>>> dis(test_telephone_inner)
  4           0 LOAD_GLOBAL              0 (argparse)
              2 LOAD_ATTR                1 (Namespace)

  5           4 LOAD_CONST               1 (0.6)
              6 LOAD_CONST               2 (2)
              8 LOAD_CONST               3 ('The quick brown fox jumps over the lazy dog.')

  4          10 LOAD_CONST               4 (('mutations', 'seed', 'text'))
             12 CALL_FUNCTION_KW         3
             14 STORE_FAST               0 (args)
             16 LOAD_CONST               0 (None)
             18 RETURN_VALUE

If we only count on lineno, then we would assume there's a JumpBackToLoopStart from line 5 to 4, but in fact, this is only because arguments are evaluated first.

I expect there are more edge cases with lineno. Given that the loop issue only affects rare cases like the walrus operator, and that the loop UI can still work (but differently from previous versions), I decide to deprioritize fixing the issue. We should still add a test case for the walrus operator.