grin-compiler / grin

GRIN is a compiler back-end for lazy and strict functional languages with whole program optimization support.
https://grin-compiler.github.io/
1.03k stars 38 forks source link

Fixed some major issues related to Dead Variable Elimination #23

Closed Anabra closed 5 years ago

Anabra commented 5 years ago

The problem

Dead Variable Elimination treated side-effecting computation incorrectly. Take a look at the following example:

x <- pure (CInt 5)
y <- case x of
  (CInt n) ->
    _prim_int_print 5
    pure 5
pure 0

The variable y is DEAD, because it is irrelevant in the context of the pure result of the program. However, its binding cannot be removed, because it contains a side-effecting computation. If we were to remove this, the semantics of the program would change.

The solution

Treat side-effecting computations in the transformation level using EffectMap. Now Dead Variable Elimination will perform a simple bottom-up analysis of the syntax tree, and collect so-called "protected" variables. These variables are protected, because some expression with a side-effecting computation is bound to them. Protected variables cannot be removed from the program.

Note that this analysis is local, it will not follow function calls.

Prerequisites

Pattern bindings need special treatment, because their left-hand side can contain any arbitrary expression. This makes it hard to "identify" a computation. To resolve this problem with minimal effort, we introduce a new convention: Each pattern binding with a non-variable pattern must have a simple left-hand side, that is they can only be of the form: pure <var>. Also, for convenience, all case scrutinees can only be variables as well. This ensures that every "real" computation has a name.

To introduce these conventions, we define a new transformation: BindingPatternSimplification.

Alternative solution

We could implement an interprocedural analysis which calculates the possible side-effecting computations for every function and binding. Currently, EffectMap only calculates this information for functions, which is not sufficient.

Failable patterns

Case expressions containing failable patterns can give rise to a very similar problem to side-effecting computations. See the code below.

x <- pure (CInt 5)
y <- case x of
  (CUnit) -> pure 5
pure 0

The variable y is DEAD considering the pure result of the program, but a failable case expression is bound to it. If we consider pattern match errors to be implicit (i.e.: the runtime would throw an error whenever it cannot match a value on any pattern), then this is a very similar problem to the one mentioned above. However, if pattern match errors are explicit (i.e.: there is always a #default alternative with an error throwing right-hand side), then the above piece of code has undefined semantics, and we do not have to consider it for any analysis or transformation.

This question requires further discussion.

coveralls commented 5 years ago

Coverage Status

Coverage increased (+0.4%) to 51.551% when pulling 8d72e0ea2e65d7d6454cb956f5db6b2adc9c0fd4 on Anabra:master into 5a674efc3ca7962d9ac06ef012a722522a3f1422 on grin-tech:master.