AztecProtocol / barretenberg


Efficient Polynomials For Sumcheck #184

Open adr1anh opened 1 year ago

adr1anh commented 1 year ago

Motivation

We currently store too many preprocessed polynomials in the proving key, such as ID_1, ID_2, ID_3, L_FIRST and L_LAST. All of these can be efficiently evaluated by the verifier, so they do not need to be committed to. Moreover, it may be more efficient to derive their "edge extension" directly by exploiting their structure.

adr1anh commented 1 year ago

ID Polynomials

Start by noting that $ID(X_0,\ldots,X_{d-1}) = X_0 + X_1 \cdot 2 + \cdots + X_{d-1} \cdot 2^{d-1}$, and that $ID_j(\vec{X}) = (j-1)\cdot n + ID(\vec{X})$. As part of indexing from 0, I suggest we relabel these with $j=0,1,2$ so that $ID_j(\vec{X}) = j \cdot n + ID(\vec{X})$.
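For example, with $d = 2$ (so $n = 4$) and the relabeled $j = 1$:

$$ ID_1(X_0, X_1) = 4 + X_0 + 2 \cdot X_1, \qquad \big(ID_1(0,0),\, ID_1(1,0),\, ID_1(0,1),\, ID_1(1,1)\big) = (4, 5, 6, 7). $$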

The state required to define $ID(X_0,\ldots,X_{d-1})$ is only $d$, and possibly $j$ in order to define $ID_{j}(X_{0},\ldots,X_{d-1})$ (since $n = 2^d$).

At round $\ell$ of Sumcheck, we will have had to partially evaluate it in $u_0,\ldots, u_{\ell-1}$, which are the verifier's Sumcheck challenges. We define this partially evaluated polynomial as

$$ \begin{align} ID^{(\ell)}_j( X_\ell, \ldots, X_{d-1}) &= ID_j(u_0,\ldots,u_{\ell-1}, X_\ell, \ldots, X_{d-1}) \\ &= \underbrace{\left( j \cdot n + \sum_{k=0}^{\ell-1} u_k \cdot 2^k \right)}_{c_\ell} + X_\ell \cdot 2^\ell + \left( \sum_{k=\ell+1}^{d-1} X_k \cdot 2^k \right) \\ &= c_\ell + X_\ell \cdot 2^\ell + \left( \sum_{k=\ell+1}^{d-1} X_k \cdot 2^k \right) \end{align} $$

Going from $ID_j^{(\ell)}$ to $ID_j^{(\ell+1)}$ by evaluating $X_\ell$ in $u_\ell$ means we update the constant $c_{\ell+1} \gets c_\ell + u_\ell \cdot 2^\ell$ and increment $\ell$ to $\ell+1$. This constant is part of the polynomial's state, and we initialize $c_0 = j\cdot n$.

When we are iterating through the edge $(h_{\ell+1}, \ldots, h_{d-1}) \in \{0,1\}^{d-\ell-1}$, we let $h = h_{\ell+1} + h_{\ell+2}\cdot 2 + \cdots + h_{d-1} \cdot 2^{d-\ell-2}$.

$$ \begin{align} ID^{(\ell)}_j( X_\ell, h_{\ell+1}, \ldots, h_{d-1}) &= c_\ell + X_\ell \cdot 2^\ell + \left( \sum_{k=\ell+1}^{d-1} h_k \cdot 2^k \right) \\ &= c_\ell + X_\ell \cdot 2^\ell + 2^{\ell+1}\cdot \left( \sum_{k=\ell+1}^{d-1} h_k \cdot 2^{k-(\ell+1)} \right) \\ &= c_\ell + X_\ell \cdot 2^\ell + 2^{\ell+1}\cdot \left( \sum_{k=0}^{d-\ell-2} h_{\ell+1+k} \cdot 2^{k} \right) \\ &= c_\ell + X_\ell \cdot 2^\ell + 2^{\ell+1}\cdot h \end{align} $$

The first edge univariate is $U_0(X_\ell) := c_\ell + X_\ell \cdot 2^\ell$, and if we have the edge univariate $U_{h}(X_\ell)$, we obtain $U_{h+1}(X_\ell) = U_{h}(X_\ell) + 2^{\ell+1}$.

```cpp
template <typename FF>
struct IDUnivariate {
    FF constant{ 0 };  // c_l
    FF two_pow_l{ 1 }; // 2^l

    // Equal to ID_0
    IDUnivariate() = default;

    // Initialize to ID_j with c_0 = j * n = j * 2^d
    IDUnivariate(size_t d, size_t j)
        : constant{ (uint64_t{ 1 } << d) * j }
    {}

    // Partially evaluate in X_l = u_l: c_{l+1} = c_l + u_l * 2^l
    void fold(const FF challenge /*u_l*/)
    {
        constant += challenge * two_pow_l;
        two_pow_l.self_dbl(); // 2^l -> 2^{l+1}
    }

    // Returns the (not-extended) univariate edge for the edge at index h
    Univariate<FF, 2> current_edge(size_t h) const
    {
        // Compute (c_l + 2^l * (2h)) + 2^l * X_l; the caller passes 2*h (see note below)
        return { constant + two_pow_l * h, two_pow_l };
    }
};
```

Note that in sumcheck_round.hpp the loop index is incremented by 2 at every iteration, so the value passed in is already equal to $2 \cdot h$, which simplifies the computation of the edge. We also realize that we don't need to store the number of variables $d$, and technically we don't need to store the current round number $\ell$ either.

During the Sumcheck, both parties must call fold with the new challenge. After the last round, the verifier can get the result of the evaluation by querying constant.
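For illustration, here is a rough usage sketch (not the actual barretenberg sumcheck code): it assumes the IDUnivariate struct above and leaves the edge extension and round-univariate accumulation as placeholders.

```cpp
#include <cstddef>
#include <vector>

// Sketch only: drive an IDUnivariate through all d rounds of Sumcheck.
// `challenges[l]` is the verifier challenge u_l.
template <typename FF>
FF run_id_through_sumcheck(size_t d, size_t j, const std::vector<FF>& challenges)
{
    IDUnivariate<FF> id(d, j);
    size_t num_edges = size_t{ 1 } << (d - 1); // 2^{d-1} edges in round 0

    for (size_t l = 0; l < d; ++l) {
        // Prover only: produce the edge univariates for this round.
        // The loop index in sumcheck_round.hpp advances by 2, hence 2 * h.
        for (size_t h = 0; h < num_edges; ++h) {
            auto edge = id.current_edge(2 * h);
            // ... extend `edge` and accumulate it into the round univariate ...
            static_cast<void>(edge);
        }
        // Prover and verifier: partially evaluate in X_l = u_l.
        id.fold(challenges[l]);
        num_edges /= 2;
    }
    // After the last fold, `constant` holds ID_j(u_0, ..., u_{d-1}).
    return id.constant;
}
```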

adr1anh commented 1 year ago

Lagrange Polynomials

The Lagrange polynomial at vertex $(i_0,\ldots,i_{d-1}) \in \{0,1\}^d$, corresponding to the index $i = i_0 + i_1\cdot 2 + \cdots + i_{d-1}\cdot 2^{d-1}$, is given in a succinct form as

$$ L_i(X_0,\ldots,X_{d-1}) = \prod_{\ell=0}^{d-1} \Big( (1-i_\ell)\cdot(1-X_\ell) + i_\ell \cdot X_\ell \Big) $$
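For instance, with $d = 2$ and $i = 2$, i.e. $(i_0, i_1) = (0, 1)$:

$$ L_2(X_0, X_1) = (1 - X_0) \cdot X_1, $$

which is $1$ at the vertex $(0,1)$ and $0$ at the other three vertices of the hypercube.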

Once again, we define its partially evaluated form as

$$ \begin{align} L^{(\ell)}_i( X_\ell, \ldots, X_{d-1}) &= L_i(u_0,\ldots,u_{\ell-1}, X_\ell, \ldots, X_{d-1}) \\ &= \underbrace{\prod_{k=0}^{\ell-1} \Big( (1-i_k)\cdot(1-u_k) + i_k \cdot u_k \Big) }_{c_\ell} \\ &\phantom{=} \cdot \Big( (1-i_\ell)\cdot(1-X_\ell) + i_\ell \cdot X_\ell \Big) \\ &\phantom{=} \cdot \underbrace{\prod_{k=\ell+1}^{d-1} \Big( (1-i_k)\cdot(1-X_k) + i_k \cdot X_k \Big)}_{1_{(i_{\ell+1},\ldots,i_{d-1}) = (X_{\ell+1},\ldots,X_{d-1})}} \\ &= c_\ell \cdot \Big( (1-i_\ell)\cdot(1-X_\ell) + i_\ell \cdot X_\ell \Big) \cdot 1_{(i_{\ell+1},\ldots,i_{d-1}) = (X_{\ell+1},\ldots,X_{d-1})} \\ &= c_\ell \cdot \Big( (1-i_\ell) + (2 \cdot i_\ell-1) \cdot X_\ell \Big) \cdot 1_{(i_{\ell+1},\ldots,i_{d-1}) = (X_{\ell+1},\ldots,X_{d-1})} \\ &= 1_{(i_{\ell+1},\ldots,i_{d-1}) = (X_{\ell+1},\ldots,X_{d-1})} \cdot \begin{cases} c_\ell - c_\ell \cdot X_\ell &\text{if } i_\ell = 0 \\ 0 + c_\ell \cdot X_\ell &\text{if } i_\ell = 1 \end{cases} \end{align} $$

When we are iterating through the edge $(h_{\ell+1}, \ldots, h_{d-1}) \in \{0,1\}^{d-\ell-1}$, letting $h = h_{\ell+1} + h_{\ell+2}\cdot 2 + \cdots + h_{d-1} \cdot 2^{d-\ell-2}$ be the index of this edge, we notice that

$$ L^{(\ell)}_i( X_\ell, h_{\ell+1}, \ldots, h_{d-1}) = \begin{cases} 0 &\text{if } (h_{\ell+1}, \ldots, h_{d-1}) \neq (i_{\ell+1}, \ldots, i_{d-1}) \\ c_\ell - c_\ell \cdot X_\ell &\text{if } i_\ell = 0 \\ 0 + c_\ell \cdot X_\ell &\text{if } i_\ell = 1 \end{cases} $$

Crucially, the edge univariate $U_h(X_\ell)$ is zero for all but one value of $h$, which means that we almost never need to do any work when extending this edge.

When we do the partial evaluation in $X_\ell = u_\ell$, we only need to set

$$ c_{\ell+1} \gets c_\ell \cdot \Big( (1-i_\ell)\cdot(1-u_\ell) + i_\ell \cdot u_\ell \Big) $$

```cpp
template <typename FF>
struct LagrangeUnivariate {
    FF constant{ 1 }; // c_l
    bool i_l;         // i_l
    uint64_t i_rest;  // (i_{l+1}, ..., i_{d-1})

    explicit LagrangeUnivariate(uint64_t i)
        : i_l{ (i % 2) != 0 }
        , i_rest{ i / 2 }
    {}

    // Partially evaluate in X_l = u_l: c_{l+1} = c_l * ((1 - i_l)(1 - u_l) + i_l * u_l)
    void fold(const FF challenge /*u_l*/)
    {
        if (i_l) { // i_l = 1
            constant *= challenge;
        } else {   // i_l = 0
            constant *= (FF{ 1 } - challenge);
        }
        i_l = bool(i_rest % 2); // i_{l+1}
        i_rest >>= 1;           // (i_{l+2}, ..., i_{d-1})
    }

    // Returns the (not-extended) univariate edge for the edge at index h
    Univariate<FF, 2> current_edge(size_t h) const
    {
        // The index we get is always a multiple of 2
        // i.e.     h ~ (0, h_{l+1}, ..., h_{d-1})
        // and i_rest ~ (   i_{l+1}, ..., i_{d-1})
        if ((h >> 1) != i_rest) {
            // (h_{l+1}, ..., h_{d-1}) != (i_{l+1}, ..., i_{d-1}): the edge is identically 0
            return { FF{ 0 }, FF{ 0 } };
        }
        if (i_l) { // i_l = 1
            // 0 + c_l * X_l
            return { FF{ 0 }, constant };
        } else {   // i_l = 0
            // c_l - c_l * X_l
            return { constant, -constant };
        }
    }
};
```
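For completeness, a hypothetical check (not barretenberg code) of how the verifier could obtain the evaluation of L_FIRST, i.e. $L_0$, assuming the LagrangeUnivariate struct above:

```cpp
#include <vector>

// Sketch only: fold L_0 (LAGRANGE_FIRST) through all rounds and read off
// L_0(u_0, ..., u_{d-1}) = prod_l (1 - u_l).
template <typename FF>
FF evaluate_lagrange_first(const std::vector<FF>& challenges /* u_0, ..., u_{d-1} */)
{
    LagrangeUnivariate<FF> l_first(/*i=*/0);
    for (const FF& u : challenges) {
        l_first.fold(u); // i_l = 0 in every round, so constant *= (1 - u_l)
    }
    return l_first.constant;
}
```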
adr1anh commented 1 year ago

Implementation & Performance considerations

The main challenge with implementing this issue is not about the polynomials themselves, but rather which component of the proof system should be responsible for them.

For example, the $ID$ polynomial seems like it would only really be used by the GrandProductRelation, so it might make sense to let this component store the polynomial. At the moment, our relations are stateless, though it's not clear why this needs to be the case. If we allowed them to hold state, we could do away with the RelationParameters and instead initialize each relation with the data it needs. For example, we could store beta, gamma, and public_input_delta directly in the GrandProductRelation, and it would also take care of initializing its own LAGRANGE_FIRST selector.
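As a rough sketch of what that could look like (the names and members here are illustrative, not the current barretenberg interface), a stateful relation could own its parameters and auxiliary polynomials:

```cpp
// Illustrative only: a relation that owns its parameters instead of receiving
// RelationParameters on every call, and initializes its own LAGRANGE_FIRST
// (reusing the LagrangeUnivariate sketch from the previous comment).
template <typename FF>
class GrandProductRelation {
  public:
    GrandProductRelation(const FF& beta, const FF& gamma, const FF& public_input_delta)
        : beta(beta)
        , gamma(gamma)
        , public_input_delta(public_input_delta)
        , lagrange_first(/*i=*/0)
    {}

    // Both prover and verifier fold the owned polynomials with each challenge.
    void fold(const FF& challenge) { lagrange_first.fold(challenge); }

  private:
    FF beta;
    FF gamma;
    FF public_input_delta;
    LagrangeUnivariate<FF> lagrange_first;
};
```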

For the ID polynomials, we need to consider the upgrade to UltraPlonk where the IDs are actually committed as part of the verification key. We will also have to consider the case where we are combining it with the lookup argument.

Lagrange polynomials as "selectors"

As discussed in Mexico, there are overwhelming benefits to handling "selectors" separately from other "fixed" preprocessed polynomials. By "selector", we refer to polynomials whose evaluations over the hypercube are boolean and stored in blocks (preferably whose sizes are powers of 2). In this case, the univariate edges for these selectors will be identically 0 outside their block, which means that anything they multiply does not need to be evaluated there. A Lagrange polynomial is just a selector that is supported on a single vertex of the hypercube.

It may be useful to use std::optional (or some other way of flagging the 0 polynomial) to indicate whether the edge actually needs extending, avoiding any expensive multiplications.
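A minimal sketch of that flagging idea (hypothetical; it reuses the LagrangeUnivariate sketch from the previous comment):

```cpp
#include <cstddef>
#include <optional>

// Hypothetical helper: return std::nullopt when the edge is identically zero,
// so the caller can skip extending it and skip every product it would feed into.
template <typename FF>
std::optional<Univariate<FF, 2>> maybe_edge(const LagrangeUnivariate<FF>& poly, size_t h)
{
    // As in current_edge, h is already 2 * (edge index).
    if ((h >> 1) != poly.i_rest) {
        return std::nullopt; // zero edge: nothing to extend or multiply
    }
    return poly.current_edge(h);
}
```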

adr1anh commented 1 year ago

The name ****Univariate really isn't great; I just copied it from when I created PowUnivariate. It might make sense to call it ****MultiLinearExtension and add methods to return the full Polynomial for testing purposes. At the same time, it could unify the two implementations of pow.