Partial evaluation for const-eval optimization

anton-trunov commented 1 week ago

As of now, the constant evaluator's usage is two-fold:

evaluate compile-time constants' initializers to their values and stop compilation if this cannot be done;
as an optimization technique: try to evaluate expressions to their values to use the values to produce target code.

The type of the constant evaluator is Expression -> Value.

As an optimization technique, the simple interpreter we use is quite limited, e.g. 2 * 3 * 7 * x where x is an arbitrary variable won't be simplified to 42 * x because evaluation gets stuck on variables. Instead we can try to produce a simplified AST, so we need a partial evaluator of type Expression -> Expression where the return expression should have the same semantics but is (non-strictly) smaller.

We can even apply some algebraic transformations, so that even x * 2 * 3 * 7 (which gets parsed as ((x * 2) * 3 * )7) gets transformed to x * 42 or 42 * x. However, this should be done with caution, because in the presence of overflows some arithmetic operations, for instance +, are not associative.

jeshecdom commented 1 week ago

Is it OK if potential overflows are removed from the expression? For example, consider this very silly example:

const foo: Int =  x + MAX - MAX

This may cause an overflow if interpreted as (x + MAX) - MAX. But let us suppose that the evaluator cannot evaluate x and decides to associate the expression as x + (MAX - MAX). Then, the overflow goes away and foo = x. The final value in foo is not semantically equivalent to the value of foo in the original program (the original program may overflow).

anton-trunov commented 1 week ago

I think we could do this, but it should be an opt-in optimization strategy, so by default we should preserve the semantics fully and if the user wants those kinds of more aggressive optimizations that only preserve the semantics morally, they can specify it in the project settings

jeshecdom commented 1 week ago

The algorithm will apply several rewriting rules in sequence. These rules ensure that the semantics of the expression are preserved. In particular, for integer expressions, the semantics are preserved if the following holds, where O denotes the original expression and T the transformed expression:

(O overflows) IF AND ONLY IF (T overflows)
and
IF (O does not overflow) THEN val(O) = val(T)

For non-integer expressions, we must have:

val(O) = val(T)

A more aggressive optimizer weakens the biconditional to a simple implication:

IF (O does not overflow) THEN (T does not overflow)
and
IF (O does not overflow) THEN val(O) = val(T)

In other words, if the original expression overflows, the optimizer is free to decide if T should overflow nor not.

As a first version, I will focus on optimizers that preserve the semantics (i.e., the condition with the biconditional).

The Algorithm

The idea is that given an AST expression (which I will call a tree in what follows), apply to the tree a sequence of transformations rules, and return the resulting tree. If a rule does not apply to a tree, the rule returns the tree unchanged. The top level invocation should check if the resulting tree is a constant node and extract the value, otherwise, it returns the simplified tree as is. A skeleton pseudocode of the algorithm is as follows:

simplify(ASTExpression tree): ASTExpression {
   switch (tree):
   ..... (other cases)
   case int_bin_op exp1 exp2 => 
      lTree = simplify exp1
      rTree = simplify exp2
      if (lTree and rTree are constant nodes) {
          lvalue = extract_val(lTree)
          rvalue = extract_val(rTree)
          result = get_op(int_bin_op)(lvalue, rvalue)
          if (result within int bounds) {
              return make_constant(result)
          } else {
              throw Error(Overflow)
          }
      } else {
            result = make_expression(bin_op lTree rTree)
          return apply_rules(result)
      }
   ..... (other cases)
}

apply_rules(ASTExpression tree): ASTExpression {
   ASTEXpression result = tree
   for each (rule in rules): // rules is a preloaded list of rules
       // If the rule does not apply to the given tree, the rule returns the tree unchanged
       tree = rule.apply_rule(tree) 
   return tree
}

Note that the algorithm is linear in the size of the tree. This means that it cannot be an optimal optimizer, because an optimal optimizer would take an exponential time in the worst-case, because the problem includes boolean expression minimization. Since it is not an optimal optimizer, this means that the optimizer will not reduce certain expressions to their minimal form, but I think the rules cover the most frequent use cases to be practical. We could keep adding more rules if in the future we find other strategies to simplify further expressions.

Transformation rules

Although I will write the transformations as expressions, bear in mind that they represent trees. I will use x (with indexed variations) to represent non-constant expressions, c (with indexed variations) to represent constant expressions, and op (with indexed variations) to represent binary operators.

Rule 1

The objective of the rule is to agglutinate the constants for immediate evaluation and push them outside of the expression, in hope that later rules will consume them in some other form.

(x1 op1 c1) op (x2 op2 c2)
----->
(x1 op1 x2) op (c1 op2 c2)

Conditions:
op1 and op associate, i.e., (a op1 b) op c = a op1 (b op c)
op and op2 aasociate, i.e., (a op b) op2 c = a op (b op2 c)
op commutes, i.e., a op b = b op a
op1 cannot be an integer operator: +,-,*,/ 
(I still need to check if this could work with modulo and shift operators)

The reason for the last condition: op1 cannot be an integer operator, has to do with overflow preservation. Without this condition, the rule does not preserve overflows when applied to integers, but it works for other types, like booleans.

To give an example, consider op = op1 = op2 = + and suppose c1 = 1 and c2 = 2. Since x1 and x2 are unknown (and potentially irreducible because they could include symbols that cannot be evaluated), it could happen that during execution, x1 = MAX_INT and x2 = -3. Hence, in the original expression, (x1 op1 c1) = MAX_INT + 1 causes an overflow during execution, but the transformed expression does not cause an overflow: (x1 op1 x2) op (c1 op2 c2) = (MAX_INT + -3) + (1 + 2) = (MAX_INT - 3) + 3 = MAX_INT.

As another example, consider op = op1 = op2 = * and suppose c1 = 1 and c2 = 2. Suppose that during execution, x1 = 0 and x2 = MAX_INT. Then, in the original expression, (x2 op2 c2) = MAX_INT * 2 causes an overflow during execution, but the transformed expression does not cause an overflow: (x1 op1 x2) op (c1 op2 c2) = (0 * MAX_INT) * (1 * 2) = 0 * 2 = 0.

Implementation of the rule. Keep an "associativity" table or map that will tell you if two operators associate. Also, another map to tell you if an operator commutes. Given the input tree, destruct the tree so that its form conforms to the left-hand-side of the rule: (x1 op1 c1) op (x2 op2 c2), i.e., check a top level binary operator, with two binary operators as children, with two constant nodes and two non-constant nodes as indicated in the left-hand-side.
Other variants of the rule: Any commutation of the operands in the left-hand-side: (x1 op1 c1) op (x2 op2 c2).

Rule 2

The objective is to push the constant outside of the expression, in hope that later rules will consume it in some other form.

(x1 op1 c1) op x2
----->
(x1 op1 x2) op c1

Conditions:
op1 and op associate, i.e., (a op1 b) op c = a op1 (b op c)
op commutes, i.e., a op b = b op a
op1 cannot be an integer operator: +,-,*,/ 
(I still need to check if this could work with modulo and shift operators)

Again, the reason for the last condition: op1 cannot be an integer operator, has to do with overflow preservation. Without this condition, the rule does not preserve overflows when applied to integers, but it works for other types, like booleans.

To give an example, consider op = op1 = + and suppose c1 = 1. Since x1 and x2 are unknown (and potentially irreducible because they could include symbols that cannot be evaluated), it could happen that during execution, x1 = MAX_INT and x2 = -1. Hence, in the original expression, (x1 op1 c1) = MAX_INT + 1 causes an overflow during execution, but the transformed expression does not cause an overflow: (x1 op1 x2) op c1 = (MAX_INT + -1) + 1 = MAX_INT.

The implementation and variants of this rule are similar to Rule 1.

Rule 3

The objective of the rule is to agglutinate the constants for immediate evaluation and push them outside of the expression, in hope that later rules will consume them in some other form.

(x1 op1 c1) op c2
----->
x1 op1 (c1 op c2)

Conditions:
op1 and op associate, i.e., (a op1 b) op c = a op1 (b op c)
If op is an integer operator THEN (c1 op c2) does not overflow
IF op1 = + THEN sign(c1) = sign(c1 op c2) AND |c1| <= |c1 op c2|
IF op1 = * THEN |c1 op c2| > 1
(I still need to find the conditions (if any) for modulo, /, -, and shift operators)
(NOTE: operators for booleans and strings do not require special conditions 
beyond associativity)

The special conditions on the + and * operators are necessary to ensure overflow preservation. Note also that the special conditions are on the constants, which is the only info we know at compile time.

As an example, consider op1 = + and op = -. Suppose c1 = -1 and c2 = 1. Since sign(c1) = -1 = sign(-1 - 1) = sign(c1 op c2) and |c1| = 1 <= 2 = |c1 op c2|, the rule applies. Suppose that during execution, x1 = MIN_INT. Hence, the original expression causes an overflow at (x1 op1 c1) = MIN_INT + -1. And, also, the transformed expression causes an overflow: x1 op1 (c1 op c2) = MIN_INT + (-1 - 1) = MIN_INT + -2. However, let us suppose that c2 = -1. Then, the condition 1 = |c1| <= |c1 op c2| = |-1 - -1| = 0 no longer applies, so that we cannot apply the rule. And indeed, the original expression still overflows at (x1 op1 c1), but the transformed expression would not overflow: x1 op1 (c1 op c2) = MIN_INT + (-1 - -1) = MIN_INT + 0 = MIN_INT.

Implementation of the rule: In addition to the associativity table, we keep a map that stores a lambda for each of the operators that require a special boolean condition. For example, for the + operator we would store in the table: conditions['+'] = lambda c1, c2, res => sign(c1) == sign(res) && abs(c1) <= abs(res).
Variations of the rule: Any commutation of the operands in the left-hand-side: (x1 op1 c1) op c2.

Other algebraic rules

After the previous rules, the following simpler rules can be applied in sequence in order to simplify the tree for specific operators (the commuted variants should be checked as well).

For integer expressions (the list still misses rules for modulo, division, and shifts):

x + 0 ---> x
x - 0 ---> x
x * 0 ---> 0
x * 1 ---> x
x / 1 ---> x
x - x ---> 0
x + x ---> 2 * x
-x ---> (-1) * x

Note that certain arithmetic rules that I did not list above, do not preserve overflows. For example, --x ---> x, because if x = MIN_INT, then -x produces an overflow, but the transformed expression removes the overflow. Hence, we cannot use rules like --x ---> x.

For boolean expressions:

x || true ---> true
x && false ---> false
x || false ---> x
x && true ---> x
x || x ---> x
x && x ---> x
x || !x ---> true
x && !x ---> false
!!x ---> x

I could add further rules for special functions, like string concatenation, max and min of integers, etc.

anton-trunov commented 5 days ago

As a first version, I will focus on optimizers that preserve the semantics (i.e., the condition with the biconditional).

Sounds like a great starting point!

For non-integer expressions, we must have val(O) = val(T)

There are also cell underflows and cell overflows, for instance:

beginCell()
  .storeBool(1)
  .then-add-too-much-stuff-so-cell-overflows()
.endCell().asSlice().loadBool()

can be reduced to true if you ignore cell overflows.

Awesome analysis of the rules, btw. I'd also probably associate some priorities to the rules that would affect the order in which the rules are supposed to be applied, e.g. all the monoidal zero rules should probably be tried first, before pushing the neutral element outside.

jeshecdom commented 4 days ago

Mmmm... I didn't know about cell overflows and underflows. Is it because of the limit in cell storage? I guess this would occur when calling functions in the expression. For the moment I am not supporting function calls in expressions. That will be the next step.

I also liked the idea of attaching priorities to the rules. I will do that.

anton-trunov commented 3 days ago

Is it because of the limit in cell storage?

A cell, which is a basic data structure in TVM, can have up to 1023 bits of data and up to 4 references to other cells. And yes there is a limit on the contract state size (which consists of cells, basically)

tact-lang / tact