Python compiles dead code

GoogleCodeExporter commented 8 years ago

CPython and Unladen Swallow both currently emit bytecode/machine code for
code following a return, raise, break or continue statement, like so:

def foo():
  return 5
  for i in range(4):
    print i

Eliminating this code isn't critical (it isn't very common, and LLVM is
smart enough to eliminate it for us), but the need to support this has
spread into the LLVM compiler, which has to allow for this with special
deadcode blocks. This is ugly and should be fixed.

This could be done in a couple of ways: the dead code could be pruned from
the AST (or not even make it into the AST), or could be ignored by the
bytecode compiler.

Original issue reported on code.google.com by collinw on 13 May 2009 at 10:40

GoogleCodeExporter commented 8 years ago

The way clang's IR generation handles this particular problem is to use a null 
insert
block in the IRBuilder as a signal that the current code generation point is
unreachable. A few key places can then take advantage of this information to not
generate code if the current insert point is null, while the majority of code
generation uses a wrapper function which will lazily create the dummy block if 
needed.

This turns out to be simple and non-invasive, and means you get more dead code
elimination for free as other parts of the code generator become smarter.

Original comment by daniel.d...@gmail.com on 14 May 2009 at 4:21

GoogleCodeExporter commented 8 years ago

The peephole optimizer is working against us here.  Suppose you start with this 
function:

def implies(x, y):
    if x != 0:
        if y != 0:
            return 1
            x = 42
    else:
        return 1
    return 0

The "x = 42" line is unreachable so we eliminate that, compile the resulting 
function
to bytecode and end up with:

>>> dis.dis(implies)
  2           0 LOAD_FAST                0 (x)
              3 LOAD_CONST               1 (0)
              6 COMPARE_OP               3 (!=)
              9 POP_JUMP_IF_FALSE       31

  3          12 LOAD_FAST                1 (y)
             15 LOAD_CONST               1 (0)
             18 COMPARE_OP               3 (!=)
             21 POP_JUMP_IF_FALSE       28

  4          24 LOAD_CONST               2 (1)
             27 RETURN_VALUE
        >>   28 JUMP_FORWARD             4 (to 35)

  7     >>   31 LOAD_CONST               2 (1)
             34 RETURN_VALUE

  8     >>   35 LOAD_CONST               1 (0)
             38 RETURN_VALUE

This has no unreachable instructions.  Then the peephole optimizer runs and 
sees that
"21 POP_JUMP_IF_FALSE 28" targets the jump "28 JUMP_FORWARD 4 (to 35)" and so
rewrites it as "21 POP_JUMP_IF_FALSE 35".  Now 28 is unreachable but is not 
eligible
to be removed since the optimizer still thinks it is a jump target.

I don't have any good ideas about how to solve this without a speed or memory 
penalty
in the peephole optimizer.

Original comment by abbeyj on 15 May 2009 at 11:55

GoogleCodeExporter commented 8 years ago

Original comment by collinw on 27 May 2009 at 9:44

Added labels: Performance

GoogleCodeExporter commented 8 years ago

I would like to work on a patch if there is no one else currently working on 
this issue.

Original comment by mcquilla...@gmail.com on 28 May 2009 at 8:42

GoogleCodeExporter commented 8 years ago

I've been working on a patch for this: http://codereview.appspot.com/63145

Original comment by abbeyj on 28 May 2009 at 6:13

GoogleCodeExporter commented 8 years ago

Original comment by collinw on 30 May 2009 at 6:34

Changed state: Started

GoogleCodeExporter commented 8 years ago

The LLVM compiler does a good job of DCE, so this is too low priority to bother 
with 
(especially since CPython doesn't really want this feature).

Original comment by collinw on 19 Feb 2010 at 9:43

Changed state: ForCPython

arvindm95 / unladen-swallow

Python compiles dead code #26