rocky / python-uncompyle6

A cross-version Python bytecode decompiler
GNU General Public License v3.0
3.74k stars 408 forks source link

`JUMP_FORWARD` decompilation error #411

Closed Berbe closed 1 year ago

Berbe commented 1 year ago

Description

# uncompyle6 version 3.9.0a1
# Python bytecode version base 2.7 (62211)
# Decompiled from: Python 3.9.2 (default, Feb 28 2021, 17:03:44)
# [GCC 10.2.1 20210110]
# Embedded file name: test.5.KO.py
# Compiled at: 2022-10-01 00:49:19

class Test:

    def test--- This code section failed: ---

 L.   3         0  SETUP_LOOP           47  'to 50'
                3  LOAD_FAST             1  'a'
                6  LOAD_ATTR             0  'values'
                9  CALL_FUNCTION_0       0  None
               12  GET_ITER
               13  FOR_ITER             33  'to 49'
               16  STORE_FAST            2  'b'

 L.   4        19  LOAD_FAST             2  'b'
               22  POP_JUMP_IF_FALSE    13  'to 13'

 L.   5        25  LOAD_FAST             2  'b'
               28  STORE_FAST            1  'a'
               31  JUMP_FORWARD          3  'to 37'

 L.   7        34  CONTINUE             13  'to 13'
             37_0  COME_FROM            31  '31'

 L.   8        37  LOAD_FAST             2  'b'
               40  POP_JUMP_IF_FALSE    13  'to 13'

 L.   9        43  CONTINUE             13  'to 13'
               46  JUMP_BACK            13  'to 13'
               49  POP_BLOCK
             50_0  COME_FROM             0  '0'

Parse error at or near `LOAD_FAST' instruction at offset 37

How to Reproduce

Code ```python class Test(): def test(self): for b in a.values(): if b: a = b else: continue if b: pass ```

Output Given

Output ```console -- Stacks of completed symbols: START ::= |- stmts . _come_froms ::= \e__come_froms . COME_FROM _ifstmts_jump ::= \e_c_stmts_opt . JUMP_FORWARD come_froms _ifstmts_jump ::= c_stmts_opt . JUMP_FORWARD come_froms _ifstmts_jump ::= c_stmts_opt JUMP_FORWARD . come_froms _stmts ::= _stmts . stmt _stmts ::= stmt . and ::= expr . JUMP_IF_FALSE_OR_POP expr COME_FROM and ::= expr . jmp_false expr \e_come_from_opt and ::= expr . jmp_false expr come_from_opt and ::= expr jmp_false . expr \e_come_from_opt and ::= expr jmp_false . expr come_from_opt and ::= expr jmp_false expr . come_from_opt and ::= expr jmp_false expr \e_come_from_opt . assert ::= assert_expr . jmp_true LOAD_ASSERT RAISE_VARARGS_1 assert2 ::= assert_expr . jmp_true LOAD_ASSERT expr CALL_FUNCTION_1 RAISE_VARARGS_1 assert_expr ::= assert_expr_and . assert_expr ::= expr . assert_expr_and ::= assert_expr . jmp_false expr assert_expr_and ::= assert_expr jmp_false . expr assert_expr_and ::= assert_expr jmp_false expr . assert_expr_or ::= assert_expr . jmp_true expr assign ::= expr . DUP_TOP designList assign ::= expr . store assign ::= expr store . assign2 ::= expr . expr ROT_TWO store store assign3 ::= expr . expr expr ROT_THREE ROT_TWO store store store attribute ::= expr . GET_ITER attribute ::= expr . LOAD_ATTR attribute ::= expr GET_ITER . attribute ::= expr LOAD_ATTR . aug_assign1 ::= expr . expr inplace_op ROT_FOUR STORE_SLICE+3 aug_assign1 ::= expr . expr inplace_op ROT_THREE STORE_SLICE+1 aug_assign1 ::= expr . expr inplace_op ROT_THREE STORE_SLICE+2 aug_assign1 ::= expr . expr inplace_op ROT_THREE STORE_SUBSCR aug_assign1 ::= expr . expr inplace_op ROT_TWO STORE_SLICE+0 aug_assign1 ::= expr . expr inplace_op store aug_assign2 ::= expr . DUP_TOP LOAD_ATTR expr inplace_op ROT_TWO STORE_ATTR bin_op ::= expr . expr binary_operator c_stmts ::= _stmts . c_stmts ::= _stmts . lastc_stmt c_stmts_opt ::= c_stmts . call ::= expr . CALL_FUNCTION_0 call ::= expr CALL_FUNCTION_0 . call_stmt ::= expr . POP_TOP classdefdeco1 ::= expr . classdefdeco1 CALL_FUNCTION_1 classdefdeco1 ::= expr . classdefdeco2 CALL_FUNCTION_1 come_froms ::= COME_FROM . come_froms ::= come_froms . COME_FROM compare_chained ::= expr . compare_chained1 ROT_TWO POP_TOP \e__come_froms compare_chained ::= expr . compare_chained1 ROT_TWO POP_TOP _come_froms compare_single ::= expr . expr COMPARE_OP continue ::= CONTINUE . continues ::= _stmts . lastl_stmt continue continues ::= continue . continues ::= lastl_stmt . continue delete ::= expr . DELETE_ATTR delete_subscript ::= expr . expr DELETE_SUBSCR else_suite ::= suite_stmts . expr ::= LOAD_FAST . expr ::= attribute . expr ::= call . expr ::= get_iter . expr_jitop ::= expr . JUMP_IF_TRUE_OR_POP expr_jt ::= expr . jmp_true for ::= SETUP_LOOP . expr for_iter store for_block POP_BLOCK \e__come_froms for ::= SETUP_LOOP . expr for_iter store for_block POP_BLOCK _come_froms for ::= SETUP_LOOP expr . for_iter store for_block POP_BLOCK \e__come_froms for ::= SETUP_LOOP expr . for_iter store for_block POP_BLOCK _come_froms for ::= SETUP_LOOP expr for_iter . store for_block POP_BLOCK \e__come_froms for ::= SETUP_LOOP expr for_iter . store for_block POP_BLOCK _come_froms for ::= SETUP_LOOP expr for_iter store . for_block POP_BLOCK \e__come_froms for ::= SETUP_LOOP expr for_iter store . for_block POP_BLOCK _come_froms for_block ::= \e_l_stmts_opt . JUMP_ABSOLUTE JUMP_BACK JUMP_BACK for_block ::= \e_l_stmts_opt . JUMP_BACK for_block ::= \e_l_stmts_opt . _come_froms JUMP_BACK for_block ::= \e_l_stmts_opt \e__come_froms . JUMP_BACK for_block ::= l_stmts_opt . JUMP_ABSOLUTE JUMP_BACK JUMP_BACK for_block ::= l_stmts_opt . JUMP_BACK for_block ::= l_stmts_opt . _come_froms JUMP_BACK for_block ::= l_stmts_opt \e__come_froms . JUMP_BACK for_iter ::= GET_ITER . COME_FROM FOR_ITER for_iter ::= GET_ITER . FOR_ITER for_iter ::= GET_ITER FOR_ITER . forelsestmt ::= SETUP_LOOP . expr for_iter store for_block POP_BLOCK else_suite \e__come_froms forelsestmt ::= SETUP_LOOP . expr for_iter store for_block POP_BLOCK else_suite _come_froms forelsestmt ::= SETUP_LOOP expr . for_iter store for_block POP_BLOCK else_suite \e__come_froms forelsestmt ::= SETUP_LOOP expr . for_iter store for_block POP_BLOCK else_suite _come_froms forelsestmt ::= SETUP_LOOP expr for_iter . store for_block POP_BLOCK else_suite \e__come_froms forelsestmt ::= SETUP_LOOP expr for_iter . store for_block POP_BLOCK else_suite _come_froms forelsestmt ::= SETUP_LOOP expr for_iter store . for_block POP_BLOCK else_suite \e__come_froms forelsestmt ::= SETUP_LOOP expr for_iter store . for_block POP_BLOCK else_suite _come_froms genexpr_func ::= LOAD_FAST . FOR_ITER store comp_iter JUMP_BACK get_iter ::= expr . GET_ITER get_iter ::= expr GET_ITER . if_exp ::= expr . jmp_false expr JUMP_ABSOLUTE expr if_exp ::= expr . jmp_false expr JUMP_FORWARD expr COME_FROM if_exp ::= expr jmp_false . expr JUMP_ABSOLUTE expr if_exp ::= expr jmp_false . expr JUMP_FORWARD expr COME_FROM if_exp ::= expr jmp_false expr . JUMP_ABSOLUTE expr if_exp ::= expr jmp_false expr . JUMP_FORWARD expr COME_FROM if_exp_lambda ::= expr . jmp_false expr return_if_lambda return_stmt_lambda LAMBDA_MARKER if_exp_lambda ::= expr jmp_false . expr return_if_lambda return_stmt_lambda LAMBDA_MARKER if_exp_lambda ::= expr jmp_false expr . return_if_lambda return_stmt_lambda LAMBDA_MARKER if_exp_not ::= expr . jmp_true expr _jump expr COME_FROM if_exp_not_lambda ::= expr . jmp_true expr return_if_lambda return_stmt_lambda LAMBDA_MARKER if_exp_true ::= expr . JUMP_FORWARD expr COME_FROM ifelsestmt ::= testexpr . c_stmts_opt JUMP_FORWARD else_suite come_froms ifelsestmt ::= testexpr \e_c_stmts_opt . JUMP_FORWARD else_suite come_froms ifelsestmt ::= testexpr c_stmts_opt . JUMP_FORWARD else_suite come_froms ifelsestmt ::= testexpr c_stmts_opt JUMP_FORWARD . else_suite come_froms ifelsestmt ::= testexpr c_stmts_opt JUMP_FORWARD else_suite . come_froms ifelsestmt ::= testexpr c_stmts_opt JUMP_FORWARD else_suite come_froms . ifelsestmtl ::= testexpr . c_stmts_opt CONTINUE else_suitel ifelsestmtl ::= testexpr . c_stmts_opt JUMP_BACK else_suitel ifelsestmtl ::= testexpr \e_c_stmts_opt . CONTINUE else_suitel ifelsestmtl ::= testexpr \e_c_stmts_opt . JUMP_BACK else_suitel ifelsestmtl ::= testexpr c_stmts_opt . CONTINUE else_suitel ifelsestmtl ::= testexpr c_stmts_opt . JUMP_BACK else_suitel ifelsestmtr ::= testexpr . return_if_stmts COME_FROM returns iflaststmtl ::= testexpr . c_stmts iflaststmtl ::= testexpr . c_stmts_opt JUMP_BACK iflaststmtl ::= testexpr \e_c_stmts_opt . JUMP_BACK iflaststmtl ::= testexpr c_stmts . iflaststmtl ::= testexpr c_stmts_opt . JUMP_BACK ifstmt ::= testexpr . _ifstmts_jump ifstmt ::= testexpr . return_if_stmts COME_FROM ifstmt ::= testexpr . return_stmts COME_FROM jmp_false ::= POP_JUMP_IF_FALSE . l_stmts ::= lastl_stmt . l_stmts_opt ::= l_stmts . lastl_stmt ::= iflaststmtl . mkfuncdeco ::= expr . mkfuncdeco CALL_FUNCTION_1 mkfuncdeco ::= expr . mkfuncdeco0 CALL_FUNCTION_1 print_items_nl_stmt ::= expr . PRINT_ITEM \e_print_items_opt PRINT_NEWLINE_CONT print_items_nl_stmt ::= expr . PRINT_ITEM print_items_opt PRINT_NEWLINE_CONT print_items_stmt ::= expr . PRINT_ITEM \e_print_items_opt print_items_stmt ::= expr . PRINT_ITEM print_items_opt print_nl_to ::= expr . PRINT_NEWLINE_TO print_to ::= expr . print_to_items POP_TOP print_to_nl ::= expr . print_to_items PRINT_NEWLINE_TO raise_stmt1 ::= expr . RAISE_VARARGS_1 raise_stmt2 ::= expr . expr RAISE_VARARGS_2 raise_stmt3 ::= expr . expr expr RAISE_VARARGS_3 ret_and ::= expr . JUMP_IF_FALSE_OR_POP return_expr_or_cond COME_FROM ret_or ::= expr . JUMP_IF_TRUE_OR_POP return_expr_or_cond COME_FROM return ::= return_expr . RETURN_VALUE return_expr ::= expr . return_expr_lambda ::= return_expr . RETURN_VALUE_LAMBDA return_expr_lambda ::= return_expr . RETURN_VALUE_LAMBDA LAMBDA_MARKER return_if_stmt ::= return_expr . RETURN_END_IF return_if_stmts ::= _stmts . return_if_stmt return_stmts ::= _stmts . return_stmt returns ::= _stmts . return slice0 ::= expr . DUP_TOP SLICE+0 slice0 ::= expr . SLICE+0 slice1 ::= expr . expr DUP_TOPX_2 SLICE+1 slice1 ::= expr . expr SLICE+1 slice2 ::= expr . expr DUP_TOPX_2 SLICE+2 slice2 ::= expr . expr SLICE+2 slice3 ::= expr . expr expr DUP_TOPX_3 SLICE+3 slice3 ::= expr . expr expr SLICE+3 stmt ::= assign . stmt ::= continue . store ::= STORE_FAST . subscript ::= expr . expr BINARY_SUBSCR subscript2 ::= expr . expr DUP_TOPX_2 BINARY_SUBSCR suite_stmts ::= _stmts . suite_stmts ::= continues . testexpr ::= testfalse . testfalse ::= expr . jmp_false testfalse ::= expr jmp_false . testtrue ::= expr . jmp_true unary_convert ::= expr . UNARY_CONVERT unary_not ::= expr . UNARY_NOT unary_op ::= expr . unary_operator while1elsestmt ::= SETUP_LOOP . l_stmts JUMP_BACK POP_BLOCK else_suitel COME_FROM while1elsestmt ::= SETUP_LOOP . l_stmts JUMP_BACK else_suitel COME_FROM while1stmt ::= SETUP_LOOP . l_stmts_opt CONTINUE COME_FROM while1stmt ::= SETUP_LOOP . l_stmts_opt JUMP_BACK COME_FROM while1stmt ::= SETUP_LOOP . l_stmts_opt JUMP_BACK POP_BLOCK COME_FROM while1stmt ::= SETUP_LOOP . returns COME_FROM while1stmt ::= SETUP_LOOP . returns pb_come_from while1stmt ::= SETUP_LOOP \e_l_stmts_opt . CONTINUE COME_FROM while1stmt ::= SETUP_LOOP \e_l_stmts_opt . JUMP_BACK COME_FROM while1stmt ::= SETUP_LOOP \e_l_stmts_opt . JUMP_BACK POP_BLOCK COME_FROM whileelsestmt ::= SETUP_LOOP . testexpr \e_l_stmts_opt JUMP_BACK POP_BLOCK else_suitel COME_FROM whileelsestmt ::= SETUP_LOOP . testexpr l_stmts_opt JUMP_BACK POP_BLOCK else_suitel COME_FROM whilestmt ::= SETUP_LOOP . testexpr \e_l_stmts_opt JUMP_BACK JUMP_BACK POP_BLOCK \e__come_froms whilestmt ::= SETUP_LOOP . testexpr \e_l_stmts_opt JUMP_BACK JUMP_BACK POP_BLOCK _come_froms whilestmt ::= SETUP_LOOP . testexpr \e_l_stmts_opt JUMP_BACK POP_BLOCK \e__come_froms whilestmt ::= SETUP_LOOP . testexpr \e_l_stmts_opt JUMP_BACK POP_BLOCK _come_froms whilestmt ::= SETUP_LOOP . testexpr l_stmts_opt JUMP_BACK JUMP_BACK POP_BLOCK \e__come_froms whilestmt ::= SETUP_LOOP . testexpr l_stmts_opt JUMP_BACK JUMP_BACK POP_BLOCK _come_froms whilestmt ::= SETUP_LOOP . testexpr l_stmts_opt JUMP_BACK POP_BLOCK \e__come_froms whilestmt ::= SETUP_LOOP . testexpr l_stmts_opt JUMP_BACK POP_BLOCK _come_froms whilestmt ::= SETUP_LOOP . testexpr returns \e__come_froms POP_BLOCK COME_FROM whilestmt ::= SETUP_LOOP . testexpr returns _come_froms POP_BLOCK COME_FROM with ::= expr . SETUP_WITH POP_TOP \e_suite_stmts_opt POP_BLOCK LOAD_CONST COME_FROM_WITH WITH_CLEANUP END_FINALLY with ::= expr . SETUP_WITH POP_TOP suite_stmts_opt POP_BLOCK LOAD_CONST COME_FROM_WITH WITH_CLEANUP END_FINALLY withasstmt ::= expr . SETUP_WITH store \e_suite_stmts_opt POP_BLOCK LOAD_CONST COME_FROM_WITH WITH_CLEANUP END_FINALLY withasstmt ::= expr . SETUP_WITH store suite_stmts_opt POP_BLOCK LOAD_CONST COME_FROM_WITH WITH_CLEANUP END_FINALLY yield ::= expr . YIELD_VALUE Instruction context: -> L. 8 37 LOAD_FAST 2 'b' 40 POP_JUMP_IF_FALSE 13 'to 13' # file test.5.KO.pyc # Deparsing stopped due to parse error test.5.KO.pyc -- # decompile failed ```

Additional information

If useful payload (here LOAD_FAST) is replaced with nothing (pass), the decompilation works. For instance, this code is fine:

Code ```python class Test(): def test(self): for b in a.values(): if b: pass # No payload else: continue if b: pass ```

which disassembles in:

Disassembly ``` 3: 0 SETUP_LOOP (to 44) 3 LOAD_GLOBAL (a) 6 LOAD_ATTR (values) 9 CALL_FUNCTION (0 positional, 0 named) 12 GET_ITER >> 13 FOR_ITER (to 43) 16 STORE_FAST (b) 4: 19 LOAD_FAST (b) 22 POP_JUMP_IF_FALSE (to 13) 5: 25 JUMP_FORWARD (to 31) 7: 28 JUMP_ABSOLUTE (to 13) 8: >> 31 LOAD_FAST (b) 34 POP_JUMP_IF_FALSE (to 13) 9: 37 JUMP_ABSOLUTE (to 13) 40 JUMP_ABSOLUTE (to 13) >> 43 POP_BLOCK >> 44 LOAD_CONST (None) 47 RETURN_VALUE ```
rocky commented 1 year ago

I believe a fix to this is similar to the approach used in #408 where a new parse rule for "for-body" is needed along with possibly an expanded reduce rule to ensure this grammar rule targets only "for-body" kinds of things.

Given this information, maybe you or someone else can try to come up with a solution?

Berbe commented 1 year ago

I'm willing to try that!

I first need to understand what's wrong for the decompiler there: Is it the CONTINUE instruction replacing JUMP_ABSOLUTE to continue the loop which wreaks havoc? I do not find any CONTINUE instruction in the docs, but rather a CONTINUE_LOOP

Berbe commented 1 year ago

Ah. I think I understand; correct me if I'm wrong.

In #408, you added a signature for for_loop having this signature:

for_block ::= \e_l_stmts_opt . JUMP_ABSOLUTE JUMP_BACK JUMP_BACK

What would be needed here is to address this kind of signature:

for_block ::= \e_l_stmts_opt . CONTINUE JUMP_BACK
rocky commented 1 year ago

Ah. I think I understand; correct me if I'm wrong.

In #408, you added a signature for for_loop having this signature:

for_block ::= \e_l_stmts_opt . JUMP_ABSOLUTE JUMP_BACK JUMP_BACK

Note that \e_ prefix is printed by the parser during execution. It indicates that in matching, this nonterminal matches the empty set of tokens. In other words, you won't find a rule that starts with \e_. Likewise the "." is something shown in tracing to show the match position and is not part of the rule either.

What would be needed here is to address this kind of signature:

for_block ::= \e_l_stmts_opt . CONTINUE JUMP_BACK

Yes, with the caveats mentioned above, I think so. But I haven't been following things too closely.

I do not find any CONTINUE instruction in the docs, but rather a CONTINUE_LOOP

(Re)read https://github.com/rocky/python-uncompyle6/wiki/How-does-this-code-work%3F#token-name-specialization and more generally how the decompiler works.

Berbe commented 1 year ago

I guess we're misled here, despite resemblance to #408.

Grammar display shows the following:

L.  4:  19-37  ifelsestmt ::= testexpr c_stmts_opt JUMP_FORWARD else_suite come_froms (14)
Reduce ifelsestmt invalid by check

matching:

 L.   4        19  LOAD_FAST             2  'b'
               22  POP_JUMP_IF_FALSE    13  'to 13'

 L.   5        25  LOAD_FAST             2  'b'
               28  STORE_FAST            1  'a'
               31  JUMP_FORWARD          3  'to 37'

 L.   7        34  CONTINUE             13  'to 13'
             37_0  COME_FROM            31  '31'

The problem is triggered on these lines : https://github.com/rocky/python-uncompyle6/blob/ae46ccf6d39c4ed832cb9913575e5396ad9bff1a/uncompyle6/parsers/reducecheck/ifelsestmt.py#L242-L243 Since CONTINUE refers to a location before the start of the if ... then ... else ... block, it is considered invalid. I reckon one would need to check being inside a loop... or just let CONTINUE instructions breaking out freely?

rocky commented 1 year ago

The nonterminals ifelsestmtl and ifelsestmtc should allow 'CONTINUE' while a plain ifelsesmt should not. So that test I guess needs to get made more complicated here.

Control flow handling is a big problem. Privately I have been slowing working on coming up with something that deals with this in a more precise way. (But that comes at a price - needing to understand even more concepts like dominator regions).

Berbe commented 1 year ago

IIUC, what you're saying is this bit of code:

            if b:
                a = b
            else:
                continue

shall have been recognised as ifelsestmtl or ifelsestmtc, but certainly not as ifelsestmt?

It seems the "offending" match comes from: https://github.com/rocky/python-uncompyle6/blob/2264ccb1d571ab75660271e196049d63aad8791f/uncompyle6/parsers/reducecheck/ifelsestmt.py#L65-L74


The thing that strikes me most is that the structure of instructions as disassembled makes little-to-no sense. This part is all that is required to implement the specified logic:

 L.   4        19  LOAD_FAST             2  'b'
               22  POP_JUMP_IF_FALSE    13  'to 13'

 L.   5        25  LOAD_FAST             2  'b'
               28  STORE_FAST            1  'a'

This part, however, looks like the last part of a traditional if-then-else structure, but serves no purpose:

               31  JUMP_FORWARD          3  'to 37'

 L.   7        34  CONTINUE             13  'to 13'
             37_0  COME_FROM            31  '31'

What do you think about returning False from ifelsestmt checks when a CONTINUE instruction matches the target of a POP_JUMP_IF_FALSE in such a sequence?

rocky commented 1 year ago

IIUC, what you're saying is this bit of code:

            if b:
                a = b
            else:
                continue

shall have been recognised as ifelsestmtl or ifelsestmtc, but certainly not as ifelsestmt?

Yes, that is correct.

pyle6/blob/2264ccb1d571ab75660271e196049d63aad8791f/uncompyle6/parsers/reducecheck/ifelsestmt.py#L65-L74

The thing that strikes me most is that the structure of instructions as disassembled makes little-to-no sense. This part is all that is required to implement the specified logic:

 L.   4        19  LOAD_FAST             2  'b'
               22  POP_JUMP_IF_FALSE    13  'to 13'

 L.   5        25  LOAD_FAST             2  'b'
               28  STORE_FAST            1  'a'

This part, however, looks like the last part of a traditional if-then-else structure, but serves no purpose:

               31  JUMP_FORWARD          3  'to 37'

 L.   7        34  CONTINUE             13  'to 13'
             37_0  COME_FROM            31  '31'

The folks who wrote the code generation in Python 2.7 aren't interested in your opinion of how the code would best be written. Nor is that going to change things either. From the decompilation side the only thing that matters is whether the code as it is produced matches the semantics of particular Python code.

In certain cases, especially among older code generation, there was erroneous token classification in the decompiler. For example if the token name were "BREAK out of loop" that would be wrong since the jumps is to the beginning of a loop. Fortunately that is not the situation above.

What do you think about returning False from ifelsestmt checks when a CONTINUE instruction matches the target of a POP_JUMP_IF_FALSE in such a sequence?

My experience is that you need to be as precise as possible. Vagueness just fixes one problem but breaks another. "continue" is valid only inside ifelsestmt{c,l} constructs.

Berbe commented 1 year ago

The folks who wrote the code generation in Python 2.7 aren't interested in your opinion of how the code would best be written. Nor is that going to change things either.

How surprising. 😏

My point was to merely seperate was being used, and what was useless, as in "has no use"/"cannot be used"/"is never used". The 3 instructions fron the second block I highlighted is dead, unreachable code. I have no knowledge of the inner workings of the compiler having been used, but my thought was: Wasn't there a problem with the compiler in the first place?

What do you think about returning False from ifelsestmt checks when a CONTINUE instruction matches the target of a POP_JUMP_IF_FALSE in such a sequence?

My experience is that you need to be as precise as possible. Vagueness just fixes one problem but breaks another. "continue" is valid only inside ifelsestmt{c,l} constructs.

Without any doubt, but the question rather was: Is it the correct location & way of addressing the problem at hand?

rocky commented 1 year ago

The folks who wrote the code generation in Python 2.7 aren't interested in your opinion of how the code would best be written. Nor is that going to change things either.

How surprising. smirk

My point was to merely seperate was being used, and what was useless, as in "has no use"/"cannot be used"/"is never used". The 3 instructions fron the second block I highlighted is dead, unreachable code. I have no knowledge of the inner workings of the compiler having been used, but my thought was: Wasn't there a problem with the compiler in the first place?

If you think of it from the perspective of how Python bytecode compiler works, things make more sense. There may be some initial instructions produced from a template and then later peephole optimization passes can modify these. Remnants of the original code might still be around, like dead pieces of DNA we have in our genes. Just as in genetics, in some cases a pattern-matching decompiler can use these useless DNA fragments to decide which of two equivalent Python expressions was originally used. But sometimes, as here, it is just dead code.

My understanding of code generation looking at it purely from the bytecode end, is that removing dead Python bytecode hasn't been much of a concern. There is be fair bit of dead code in persisting over many Python releases.

What do you think about returning False from ifelsestmt checks when a CONTINUE instruction matches the target of a POP_JUMP_IF_FALSE in such a sequence?

My experience is that you need to be as precise as possible. Vagueness just fixes one problem but breaks another. "continue" is valid only inside ifelsestmt{c,l} constructs.

Without any doubt, but the question rather was: Is it the correct location & way of addressing the problem at hand?

Parse rules tend to be overly broad and they have to be because currently in the tokens there is very limited information. This is true especially when it comes to control flow. In particular, target offsets are not part of tokens, so they not matched on. Furthermore it would be inconceivable to do this.

However, there is a little bit of this kind of classification by indicating that there is a loop jump (which is always a backward jump in the current way that Python produces bytecode), versus a forward jump. This is nowhere near enough information though.

So the reduction tests are charged with determining whether the control-flow rule is plausible based on where the jumps go to.

In human language what we want to test say is a backward or loop jump from the end of an "else" clause before the beginning of the "if" test is okay only if we are trying to match or reduce an "if" statement of the kind that can occur only inside a loop construct.

That part is necessary. Will that be sufficient? I don't know.

If all of this sounds a little vague, well, yes, that is the nature of this problem, short of understanding Python's code generator in detail and all of the implications it has. (And to be honest, I suspect the programmers who wrote the code generation did this with a bit of trial and error as well).

Berbe commented 1 year ago

Well, here is an attempt: #413 I made it as precise as possible, maybe not reusing all the variables helping understanding, but at least I dug as far as possible.

rocky commented 1 year ago

413 addresses this - closing.