Open laike9m opened 3 years ago
Since offset are no longer fixed, and can not represent loop start/end, as a prerequisite, we need to first switch to using lineno to represent loop start/end.
Steps:
At this point it seems we don't even need to pass offset to the frontend cause we only need lineno anyway. Removing offset deserve its own issue.
A big difference in using lineno and offset to index events is that lineno -> events are 1 to many mapping, while offset -> event is 1:1. visibleEvents
should be lineno -> [events]
It turns out that even with lineno, there's a problem with 3.10
In the following case, line 11 is supposed to be the end line of this loop. However, since the starting part is appended at the end of the bytecode of the loop, the loop ends at lineno 9.
>>> def test_while_loop(tracer, check_golden_file):
...
...
...
...
... tracer.start()
...
... i = 0
... while i < 2:
... a = i
... i += 1
>>> from dis import dis
>>> dis(test_while_loop)
6 0 LOAD_FAST 0 (tracer)
2 LOAD_METHOD 0 (start)
4 CALL_METHOD 0
6 POP_TOP
8 8 LOAD_CONST 1 (0)
10 STORE_FAST 2 (i)
9 12 LOAD_FAST 2 (i)
14 LOAD_CONST 2 (2)
16 COMPARE_OP 0 (<)
18 POP_JUMP_IF_FALSE 22 (to 44)
10 >> 20 LOAD_FAST 2 (i)
22 STORE_FAST 3 (a)
11 24 LOAD_FAST 2 (i)
26 LOAD_CONST 3 (1)
28 INPLACE_ADD
30 STORE_FAST 2 (i)
9 32 LOAD_FAST 2 (i)
34 LOAD_CONST 2 (2)
36 COMPARE_OP 0 (<)
38 POP_JUMP_IF_TRUE 10 (to 20)
40 LOAD_CONST 0 (None)
42 RETURN_VALUE
>> 44 LOAD_CONST 0 (None)
46 RETURN_VALUE
The easiest solution is to create an increasing mapping from offset to lineno. In this case, the original self.offset_to_lineno
is:
{0: 6, 8: 8, 12: 9, 20: 10, 24: 11, 32: 9, 40: 13, ...}
And self.increasing_offset_to_lineno
would be
{0: 6, 8: 8, 12: 9, 20: 10, 24: 11, 32: 11, 40: 13, ...}
This is fixed in https://github.com/laike9m/Cyberbrain/commit/fad4dca044835768e91cecc8a1f60a8b36d08810
TL;DR: In most case, loops in Cyberbrain still work for Python 3.10. This only issue is with while + walrus operator (see https://github.com/laike9m/Cyberbrain/issues/139#issuecomment-1003817059). Fixing the problem requires emitting JumpBackToLoopStart
at the right time on Python side, which we're currently not sure how, see https://github.com/laike9m/Cyberbrain/issues/139#issuecomment-1003912564.
For for
loops, the old implementation still works, because bytecode doesn't change much
https://godbolt.org/z/crKTsf4b4
There is an issue with the lineno of return
stmt in 3.10. But we can deal with this later (after upgrading to 3.10.1 and see if this is fixed)
For while
loops, like the previous example, it also just works. Even though the bytecode for while i < 2
is put at the end of the loop, it does not emit any event. So the JumpBackToLoopStart
event still follows the previous event, which is binding of i
. There's no change to the event stream.
There is a special case with walrus operator
from cyberbrain import trace
@trace
def run_while():
i = 0
while (i:= i+1) < 3:
a = i
run_while()
https://godbolt.org/z/sx1d8595e
Now, the extra bytecode at the end emits a binding event for i
, and the generated graphs are different:
3.9 (correct)
3.10 (wrong)
In the 3.10 graph, the second binding of i
(1 -> 2) is not supposed to be on the first iteration.
The solution is to
JumpBackToLoopStart
out of the value stack, to logger or jump detector_return_jump_back_event_if_exists
so that it checks lineno and not offset, so that the latter POP_JUMP_IF_TRUE
instr doesn't emit a JumpBackToLoopStart
(because it jumps from line 6 to line 7)Tried to use the decrease of lineno to determine whether there's a jump back. Turns out this is not reliable either. A counter example would be:
>>> import argparse
...
... def test_telephone_inner():
... args = argparse.Namespace(
... mutations=0.6, seed=2, text="The quick brown fox
... jumps over the lazy dog."
... )
>>> from dis import dis
>>> dis(test_telephone_inner)
4 0 LOAD_GLOBAL 0 (argparse)
2 LOAD_ATTR 1 (Namespace)
5 4 LOAD_CONST 1 (0.6)
6 LOAD_CONST 2 (2)
8 LOAD_CONST 3 ('The quick brown fox jumps over the lazy dog.')
4 10 LOAD_CONST 4 (('mutations', 'seed', 'text'))
12 CALL_FUNCTION_KW 3
14 STORE_FAST 0 (args)
16 LOAD_CONST 0 (None)
18 RETURN_VALUE
If we only count on lineno, then we would assume there's a JumpBackToLoopStart
from line 5 to 4, but in fact, this is only because arguments are evaluated first.
I expect there are more edge cases with lineno. Given that the loop issue only affects rare cases like the walrus operator, and that the loop UI can still work (but differently from previous versions), I decide to deprioritize fixing the issue. We should still add a test case for the walrus operator.
Python 3.10 removes the jump by duplicating some instructions at the loop's start. This way, there's only 1 jump when the loop exits, instead of 2 (to head then out). And for every loop iteration, there's one less instruction to run, which is JUMP_ABSOLUTE