python / cpython

The Python programming language
https://www.python.org
Other
62.43k stars 29.97k forks source link

Certain sys.monitoring "not taken" branches in a `for` loop not showing correctly #123050

Open jaltmayerpizzorno opened 1 month ago

jaltmayerpizzorno commented 1 month ago

Bug report

Bug description:

With the code from PR 122564, the destinations of the "not taken" branches out of lines 2 and 3 show as going out of scope, rather than the correct line 6. Additionally, the ("taken") branch into if's block also shows as going out of scope:

def foo():
    for v in [1, 2]:
        if v > 0:
            x += v

    return x

Results:

  1          0       RESUME                   0

  2          2       LOAD_CONST               1 ((1, 2))
             4       GET_ITER
      L1:    6       FOR_ITER                18 (to L3)
            10       NOT_TAKEN
            12       STORE_FAST               0 (v)

  3         14       LOAD_FAST                0 (v)
            16       LOAD_CONST               2 (0)
            18       COMPARE_OP             148 (bool(>))
            22       POP_JUMP_IF_TRUE         2 (to L2)
            26       JUMP_BACKWARD           12 (to L1)
      L2:   30       NOT_TAKEN

  4         32       LOAD_FAST_CHECK          1 (x)
            34       LOAD_FAST                0 (v)
            36       BINARY_OP               13 (+=)
            40       STORE_FAST               1 (x)
            42       JUMP_BACKWARD           20 (to L1)

  2   L3:   46       END_FOR
            48       POP_TOP

  6         50       LOAD_FAST_CHECK          1 (x)
            52       RETURN_VALUE
'foo' branches:
    ex10.py 2:13-2:19 "[1, 2]" (foo@6) -> 2:8-2:9 "v" (foo@12) [for -> block]
    ex10.py 2:13-2:19 "[1, 2]" (foo@6) -> 2:13-2:19 "[1, 2]" (foo@46) [for -> out]
    ex10.py 3:11-3:16 "v > 0" (foo@22) -> 3:11-3:16 "v > 0" (foo@28) [if/while -> out]
    ex10.py 3:11-3:16 "v > 0" (foo@22) -> 3:11-3:16 "v > 0" (foo@30) [if/while -> out]

@markshannon @nedbat

CPython versions tested on:

CPython main branch

Operating systems tested on:

macOS

markshannon commented 1 month ago

There's a few things going on here. First: The jump in if v > 0: is miscompiled to

            22       POP_JUMP_IF_TRUE         2 (to L2)
            26       JUMP_BACKWARD           12 (to L1)
      L2:   30       NOT_TAKEN

Conceptually the this is a single instruction: POP_JUMP_IF_TRUE_BACKWARD L1. We only have one backward jump, JUMP_BACKWARD, so the assembler converts POP_JUMP_IF_TRUE_BACKWARD label to

    POP_JUMP_IF_TRUE over
    JUMP_BACKWARD label
over:

We need to account for NOT_TAKEN here (which probably means emitting NOT_TAKEN in the assembler). Even then, the location of the instruction following NOT_TAKEN is likely not the location of label, but that of POP_JUMP_IF_TRUE.

We don't want to prevent the compiler from generating efficient code, so we needs a means to handle these complex cases, so we'll need to handle this sys.monitoring somehow.

Since this isn't bug until we add the new branch events, I'm marking this as a feature request.

jaltmayerpizzorno commented 4 days ago

Updating the disassembly for the current code in pull/122564; the branches around the if are still unclear.

   1          0       RESUME                   0

   2          2       LOAD_CONST               1 (0)
              4       STORE_FAST               0 (x)

   3          6       LOAD_CONST               2 ((1, 2))
              8       GET_ITER
       L1:   10       FOR_ITER                18 (to L3)
             14       NOT_TAKEN
             16       STORE_FAST               1 (v)

   4         18       LOAD_FAST                1 (v)
             20       LOAD_CONST               1 (0)
             22       COMPARE_OP             148 (bool(>))
             26       POP_JUMP_IF_TRUE         3 (to L2)
             30       NOT_TAKEN
             32       JUMP_BACKWARD           13 (to L1)

  --   L2:   36       NOP

   5         38       LOAD_FAST_LOAD_FAST      1 (x, v)
             40       BINARY_OP               13 (+=)
             44       STORE_FAST               0 (x)
             46       JUMP_BACKWARD           20 (to L1)

   3   L3:   50       END_FOR
             52       POP_TOP

   7         54       LOAD_FAST                0 (x)
             56       RETURN_VALUE

'foo' branches:
    ex10.py 3:13-3:19 "[1, 2]" (foo@10) -> 3:8-3:9 "v" (foo@16) [for -> block]
    ex10.py 3:13-3:19 "[1, 2]" (foo@10) -> 3:13-3:19 "[1, 2]" (foo@50) [for -> out]
    ex10.py 4:11-4:16 "v > 0" (foo@26) -> 4:11-4:16 "v > 0" (foo@32) [if/while -> out]
    ex10.py 4:11-4:16 "v > 0" (foo@26) -> (start_line=None) (foo@36) [(dst.start_line=None)]