Maintain common subexpressions in the expression graph

If there isn’t too many common subexpressions, deduplication or memoizing should not be too effective. Instead, the gate expressions need to be rearranged to reveal common or constant subexpressions.

It is true that individual gates are given in an efficient factorized form, such as gate(a, b, c, d…) = (a+1)(b+c), where a, b, c, d… are the available queries. But that is not true of the linear combination of gates:

all_gates(a, b, c, d…) = y^0.(a+1)(b+c) + y^1.(a+1)(b+c) + y^2.(a+1)(b+c) + y^3.(a+1)(b+c) + … (hundreds)

which looks almost like a monomial form in the queries ab, ac, …. It has a subset of the terms allowed by the max degree, and many repeated terms (e.g, y^0.ab + y^1.ab.

So it turns out to be a case of Horner’s rule, with multiple variables. The strategy is then to find a Horner factorization of all_gates at compile time (example paper). Here is an example with one level of factorization, schematically:

all_gates(a, b, c, d…) = L + a.R

    L = (y^0 + y^1 + …).(b+c)
    R = (y^0 + y^1 + …).(b+c)

The powers of y are kept as symbols that appear in subexpressions, to evaluate only once by memoizing. The evaluation at all rotations exploits a fully factored form rather than a long sum of partial forms.

zcash / halo2

Maintain common subexpressions in the expression graph #686