0xPolygonMiden / miden-vm

STARK-based virtual machine
MIT License

Proposal for switching to a Virtual Bus #1182

Open Al-Kindi-0 opened 9 months ago

Al-Kindi-0 commented 9 months ago

INTRO

In many situations within the verifiable computation paradigm, for example in the context of zkVMs, one is interested in whether a certain algebraic relation, global over the execution trace, holds. To prove that this global relation has been computed correctly, the prover can use the AIR (or, more precisely, the RAP) to compute the relation via a sequence of local (i.e., transition) constraints applied to some intermediate values. These values then become part of the execution trace even though they are artificial with respect to the original, say, zkVM execution trace. From the perspective of both the prover and the verifier, the intermediate values' sole purpose is to witness the correct computation/enforcement of the local steps used in computing/enforcing the global relation/constraint. Unfortunately, this comes at the cost of extra data that the prover needs to commit to, which is problematic given that commitment cost is, as of today, still the dominant cost in verifiable computation.

A second option is to use a protocol that is more tailored to the situation at hand, namely evaluating a (global) algebraic relation without any extra commitments (e.g., to the intermediate results of the computation) beyond those already required for the original trace. GKR is an interactive proof protocol for proving the correct evaluation of a layered circuit [^1] and is an ideal candidate for solving our stated problem. This idea was used recently in Lasso to enforce a Grand-Product argument, and variations of the same principle appear in earlier works (e.g., Quarks).

LogUp

Consider the LogUp lookup argument. This is a protocol for proving that a certain set $\lbrace w(x_i)\rbrace$ is a subset of another set $\lbrace t(x_i)\rbrace$. In its most basic form, it can be written as the following global algebraic relation $$\sum_{x\in\mathbb{H}} \frac{m(x)}{\alpha + t(x)} = \sum_{x\in\mathbb{H}} \frac{f(x)}{\alpha + w(x)}$$ where:

  1. $\mathbb{H}$ is a multiplicative cyclic group enumerating the rows of the trace of size $|\mathbb{H}| = n$.
  2. $t(x)$ is the value at the "x-th" row of table $t$.
  3. $m(x)$ is the number of times the value $t(x)$ appears in the vector $\lbrace w(x): x \in \mathbb{H}\rbrace$ i.e., the multiplicity.
  4. $w(x)$ is the value at the "x-th" row of table $w$.
  5. $f(x)$ is a binary flag table used to indicate whether or not the value $w(x)$ should be "looked-up" in table $t$.

Miden VM uses the concept of a bus in order to enforce the above constraint. More precisely, it defines an extra column $p$ with the following constraints:

  1. $p(x_0) = p(x_{n-1}).$
  2. $p(x_i) = p(x_{i-1}) + \frac{m(x_{i-1})}{\alpha + t(x_{i-1})} - \frac{f(x_{i-1})}{\alpha + w(x_{i-1})}$.

Notice here the issue discussed in the introduction, namely that LogUp introduces an extra witness $p$ that is in a sense extrinsic to the original statement.
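As a concrete sanity check, here is a small Python sketch of the LogUp identity and the bus column $p$ over a toy prime field; all tables, flags, and multiplicities below are made-up example data, not actual Miden VM trace values.

```python
# Toy check of the LogUp identity and the bus column p over a small prime field.
P = 2**31 - 1  # a prime modulus

def inv(x):
    return pow(x, P - 2, P)  # field inverse via Fermat's little theorem

t = [3, 5, 7, 9]      # lookup table
w = [5, 5, 3, 9]      # values to be looked up
f = [1, 1, 1, 0]      # flags: only the first three w-values are looked up
m = [1, 2, 0, 0]      # multiplicities: 3 once, 5 twice, 7 and 9 never
alpha = 123456789     # verifier randomness

# global LogUp relation: sum m/(alpha+t) == sum f/(alpha+w)
lhs = sum(m[i] * inv(alpha + t[i]) for i in range(4)) % P
rhs = sum(f[i] * inv(alpha + w[i]) for i in range(4)) % P
assert lhs == rhs

# bus column: running sum whose wrap-around encodes the boundary constraint
p = [0]
for i in range(4):
    p.append((p[-1] + m[i] * inv(alpha + t[i]) - f[i] * inv(alpha + w[i])) % P)
assert p[0] == p[4]  # the bus closes
```

The extra column $p$ is exactly the "artificial" witness the introduction warns about: its only role is to witness the running sum.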

Multivariate formulation

To describe the GKR-based solution to the previous issue, we need to formulate the problem in the language of multivariate polynomials instead of uni-variate ones. More precisely, we viewed the columns of the execution trace as maps $\mathbb{H}\rightarrow \mathbb{F}_q$ where $\mathbb{F}_q$ is the base field. In the multivariate formulation, we instead view each column as a map $\lbrace 0, 1\rbrace^{\nu}\rightarrow\mathbb{F}_q$ where $\nu = \log_2(n)$, assuming $n$ is a power of $2$. The next thing we need to do is extend functions $f: \lbrace0, 1\rbrace^{\nu}\rightarrow\mathbb{F}_q$ to functions $\mathbb{F}_q^{\nu}\rightarrow\mathbb{F}_q$, and this is done using the concept of the (unique) multi-linear extension (MLE) polynomial. This is the unique multi-linear polynomial $\tilde{f}$ in $\mathbb{F}_q[X_0,\dots, X_{\nu - 1}]$ such that for any $x:= (x_0,\dots, x_{\nu-1}) \in \lbrace0, 1\rbrace^{\nu}$ we have $\tilde{f}(x) = f(x)$. One can show[^2] that the multi-linear extension is unique and is given[^3] by

$$\tilde{f}(x) := \sum_{y\in\lbrace 0, 1\rbrace^{\nu}} f(y)\cdot\mathsf{EQ}(x, y)$$

where

$$\mathsf{EQ}(x, y) := \prod_{i=0}^{\nu - 1}\left(x_i\cdot y_i + (1-x_i)\cdot(1- y_i)\right)$$
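The MLE and $\mathsf{EQ}$ formulas above can be sketched directly in Python; the field modulus and example values are arbitrary choices for illustration.

```python
# A minimal multi-linear extension (MLE) evaluator built from the EQ formula.
P = 2**31 - 1  # toy prime field

def eq(x, y):
    """EQ(x, y) = prod_i (x_i*y_i + (1-x_i)*(1-y_i)) mod P."""
    out = 1
    for xi, yi in zip(x, y):
        out = out * (xi * yi + (1 - xi) * (1 - yi)) % P
    return out

def mle_eval(f_vals, x):
    """Evaluate the MLE of f: {0,1}^nu -> F at an arbitrary point x in F^nu."""
    nu = len(x)
    total = 0
    for i in range(1 << nu):
        y = [(i >> j) & 1 for j in range(nu)]  # bits of i, low bit first
        total = (total + f_vals[i] * eq(x, y)) % P
    return total

f_vals = [4, 8, 15, 16]  # f on {0,1}^2, indexed by i = i0 + 2*i1
# extension property: the MLE agrees with f on the hypercube
for i in range(4):
    y = [(i >> j) & 1 for j in range(2)]
    assert mle_eval(f_vals, y) == f_vals[i] % P
```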

Using the new formalism, we can write our constraint as $$\sum_{x\in\lbrace 0, 1\rbrace^{\nu}} \frac{\tilde{m}(x)}{\widetilde{\left(\alpha + t\right)}(x)} = \sum_{x\in\lbrace 0, 1\rbrace^{\nu}} \frac{\tilde{f}(x)}{\widetilde{\left(\alpha + w\right)}(x)}$$ which is equivalent to the previous identity by virtue of the extension property of the MLEs. This can be further simplified by defining $$p(x_0,\dots, x_{\nu}) := (1 - x_0)\cdot \tilde{m}(x_1,\dots ,x_{\nu}) + x_0\cdot\left(-\tilde{f}(x_1,\dots ,x_{\nu})\right)$$ and $$q(x_0,\dots, x_{\nu}) := (1 - x_0)\cdot \widetilde{\left(\alpha + t\right)}(x_1,\dots ,x_{\nu}) + x_0\cdot \widetilde{\left(\alpha + w\right)}(x_1,\dots ,x_{\nu})$$ which are again multi-linear polynomials. Our constraint thus becomes $$\sum_{x\in\lbrace 0, 1\rbrace^{\nu + 1}} \frac{p(x)}{q(x)} = 0$$ The reason for introducing the MLE notion will become clear in the next sections. From now on, we will overload the notation and omit the $\quad\tilde{}\quad$.
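A quick numerical check, under the selector convention $p = (1-x_0)\cdot m - x_0\cdot f$ and $q = (1-x_0)\cdot(\alpha+t) + x_0\cdot(\alpha+w)$, that the fraction sum over $\lbrace 0,1\rbrace^{\nu+1}$ vanishes exactly when the LogUp identity holds; all data is illustrative.

```python
# Sanity check that sum over {0,1}^{nu+1} of p(x)/q(x) = 0 for a tiny example
# (nu = 1): x_0 = 0 selects (m, alpha + t), x_0 = 1 selects (-f, alpha + w).
P = 2**31 - 1
def inv(x): return pow(x, P - 2, P)

t = [3, 5]; w = [5, 3]; m = [1, 1]; f = [1, 1]  # w is a permutation of t
alpha = 99

frac_sum = 0
for x0 in (0, 1):
    for x1 in (0, 1):
        if x0 == 0:
            num, den = m[x1], (alpha + t[x1]) % P
        else:
            num, den = -f[x1] % P, (alpha + w[x1]) % P
        frac_sum = (frac_sum + num * inv(den)) % P
assert frac_sum == 0
```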

Sum-check

Imagine our task was to prove the simpler relation $$\sum_{x\in\lbrace 0, 1\rbrace^{\mu}} p(x) = 0$$ instead. The sum-check protocol solves exactly this, and we now look at how it works. Assume the generalized problem $$\sum_{x\in\lbrace 0, 1\rbrace^{\mu}} g\left(p_0(x),\dots,p_v(x)\right) = 0$$ where the $p_i$ are multi-linear polynomials to which the verifier has oracle access and $g$ is a polynomial of degree at most $d$ which the verifier can evaluate on its own. The sum-check protocol in this case can be described as follows:

  1. In the zeroth round the prover sends a uni-variate polynomial $s_0(X_0)$ claimed to be equal to $$\sum_{(x_1,\dots ,x_{\mu-1})\in\lbrace 0, 1\rbrace^{\mu-1}} g\left(p_0(X_0, x_1,\dots ,x_{\mu-1}),\dots,p_v(X_0,x_1,\dots ,x_{\mu-1})\right)$$
  2. The verifier checks that $s_0$ is of degree at most $d$ and that $s_0(0) + s_0(1) = 0$.
  3. The verifier draws a random $r_0$, sends it to the prover, and they both agree on $s_0(r_0)$ as the new claim (instead of the previous claim, i.e., $0$).
  4. At round $j$ for $1\leq j\leq\mu-1$:
    1. The prover sends a uni-variate polynomial $s_j(X_j)$ claimed to be equal to $$\sum_{(x_{j+1},\dots ,x_{\mu-1})\in\lbrace 0, 1\rbrace^{\mu-1-j}} g\left(p_0(r_0,\dots,r_{j-1},X_j, x_{j+1},\dots ,x_{\mu-1}),\dots,p_v(r_0,\dots,r_{j-1},X_j, x_{j+1},\dots ,x_{\mu-1})\right)$$
    2. The verifier checks that $s_j(X_j)$ is of degree at most $d$ and that $s_j(0) + s_j(1) = s_{j-1}(r_{j-1})$.
    3. The verifier draws a random $r_j$, sends it to the prover, and they both agree on $s_j(r_j)$ as the new claim.
  5. At this point, the verifier queries the oracles $p_i$ at $(r_0,\dots,r_{\mu-1})$ and checks that $s_{\mu-1}(r_{\mu-1}) = g\left(p_0(r_0,\dots,r_{\mu-1}),\dots , p_v(r_0,\dots,r_{\mu-1})\right).$

To sum up, the sum-check protocol reduces the claim $$\sum_{x\in\lbrace 0, 1\rbrace^{\mu}} g\left(p_0(x),\dots,p_v(x)\right) = 0$$ to the evaluation of the $p_i$'s at a random point $(r_0,\dots,r_{\mu-1})$, which the verifier can check using one query to each of the oracles $p_i$ (as multi-linear functions).
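The rounds above can be sketched as a toy sum-check where $g$ is the identity (so $d = 1$ and each round polynomial is determined by two evaluations); this is an illustrative sketch, not the production protocol.

```python
# Toy interactive sum-check for sum over {0,1}^mu of p(x) = 0, p multi-linear.
import random
P = 2**31 - 1

def mle_eval(vals, point):
    """Evaluate the MLE of the table `vals` by folding one variable at a time."""
    vals = list(vals)
    for r in point:
        vals = [(vals[2*i] * (1 - r) + vals[2*i + 1] * r) % P
                for i in range(len(vals) // 2)]
    return vals[0]

random.seed(0)
mu = 3
# a p-table summing to zero over the cube (last entry fixes the sum)
p = [random.randrange(P) for _ in range(7)]
p.append(-sum(p) % P)

claim = 0
vals = list(p)
rs = []
for _ in range(mu):
    # prover: the degree-1 round polynomial is determined by s(0) and s(1)
    s0 = sum(vals[2*i] for i in range(len(vals) // 2)) % P
    s1 = sum(vals[2*i + 1] for i in range(len(vals) // 2)) % P
    # verifier: consistency check, then a fresh random challenge
    assert (s0 + s1) % P == claim
    r = random.randrange(P)
    rs.append(r)
    claim = (s0 * (1 - r) + s1 * r) % P  # s(r) for a degree-1 s
    vals = [(vals[2*i] * (1 - r) + vals[2*i + 1] * r) % P
            for i in range(len(vals) // 2)]

# final check: one oracle query to p at the random point
assert mle_eval(p, rs) == claim
```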

GKR for sum of fractions

A Warm-Up

Recall that our main task was reduced to proving that $$\sum_{x\in\lbrace 0, 1\rbrace^{\nu + 1}} \frac{p(x)}{q(x)} = 0$$ The prover and verifier will run a GKR IP protocol in order to prove the validity of the above constraint. To this end, we will construct a (layered) circuit in order to evaluate the left-hand side expression. To give some intuition, we start with a simple example in the case when $\nu + 1 = 2$. In this case, the expression becomes: $$\frac{p(0, 0)}{q(0, 0)} + \frac{p(0, 1)}{q(0, 1)} + \frac{p(1, 0)}{q(1, 0)} + \frac{p(1, 1)}{q(1, 1)}$$ which can be computed/reduced to an expression with half the number of terms $$\frac{p^{'}(0)}{q^{'}(0)} + \frac{p^{'}(1)}{q^{'}(1)}$$ where $$\frac{p^{'}(0)}{q^{'}(0)} = \frac{p(0, 0)\cdot q(0, 1) + p(0, 1)\cdot q(0, 0)}{q(0, 0)\cdot q(0, 1)}$$ and $$\frac{p^{'}(1)}{q^{'}(1)} = \frac{p(1, 0)\cdot q(1, 1) + p(1, 1)\cdot q(1, 0)}{q(1, 0)\cdot q(1, 1)}$$

This can be written as $$p^{'}(x) := \sum_{y\in\lbrace 0,1\rbrace} \mathsf{EQ}(x, y)\cdot \left(p(y,0)\cdot q(y, 1) + p(y,1)\cdot q(y, 0) \right)$$

$$q^{'}(x) := \sum_{y\in\lbrace 0,1\rbrace} \mathsf{EQ}(x, y)\cdot \left(q(y,0)\cdot q(y, 1) \right)$$

We can now describe how the GKR protocol works in our simplified example. The GKR protocol runs in the opposite direction to that of the evaluation of the circuit i.e.,

  1. The verifier receives $p^{'}(0), p^{'}(1), q^{'}(0), q^{'}(1)$ together with oracles (i.e., commitments) to the input layer i.e., $p$ and $q$.
  2. The verifier draws randomness $r$ and reduces the claim on $4$ values to a claim on the values $p^{'}(r) = (1-r)\cdot p^{'}(0) + r\cdot p^{'}(1)$ and $q^{'}(r) = (1-r)\cdot q^{'}(0) + r\cdot q^{'}(1)$.
  3. The prover and verifier engage in two sum-check protocols on the claims $p^{'}(r)$ and $q^{'}(r)$, i.e., the prover engages in a protocol to convince the verifier that $$p^{'}(r) = \sum_{y\in\lbrace 0,1\rbrace} \mathsf{EQ}(r, y)\cdot \left(p(y,0)\cdot q(y, 1) + p(y,1)\cdot q(y, 0) \right)$$ $$q^{'}(r) = \sum_{y\in\lbrace 0,1\rbrace} \mathsf{EQ}(r, y)\cdot \left(q(y,0)\cdot q(y, 1) \right)$$
  4. In our simple example, we have only one variable in the summation and thus the prover will compute and send a polynomial $g_p$ claimed to satisfy $$g_p(z) = \mathsf{EQ}(r, z)\cdot \left(p(z,0)\cdot q(z, 1) + p(z,1)\cdot q(z, 0) \right).$$ Since $g_p$ will be of degree at most $3$, the prover sends either $4$ coefficients or $4$ evaluations of $g_p$. The same happens for $g_q$.
  5. The verifier checks that $g_p(0) + g_p(1) = p^{'}(r)$ and $g_q(0) + g_q(1) = q^{'}(r)$.
  6. The verifier draws a random value $u$ and sends it to the prover, who responds with $p(u,0), p(u, 1), q(u, 0)$ and $q(u, 1)$.
  7. The verifier computes $\mathsf{EQ}(r, u)$ on its own.
  8. The verifier checks that $$g_p(u) = \mathsf{EQ}(r, u)\cdot \left(p(u,0)\cdot q(u, 1) + p(u,1)\cdot q(u, 0)\right)$$ and the same for $g_q(u)$.
  9. The verifier draws a random value $v$, computes $p(u, v) = (1-v)\cdot p(u, 0) + v\cdot p(u, 1)$ and $q(u, v) = (1-v)\cdot q(u, 0) + v\cdot q(u, 1)$ and queries the oracles of $p$ and $q$ to check that the previous $q(u,v)$ and $p(u,v)$ values are correct.
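The warm-up reduction can be checked numerically: one layer of fraction-summing halves the number of terms while preserving the total sum. Values below are arbitrary illustrations.

```python
# One layer of the fraction-summing circuit from the warm-up example.
P = 2**31 - 1
def inv(x): return pow(x, P - 2, P)

p = {(0, 0): 2, (0, 1): 5, (1, 0): 7, (1, 1): 11}
q = {(0, 0): 3, (0, 1): 4, (1, 0): 6, (1, 1): 13}

# reduction step: p'(y)/q'(y) = p(y,0)/q(y,0) + p(y,1)/q(y,1)
p1 = {y: (p[(y, 0)] * q[(y, 1)] + p[(y, 1)] * q[(y, 0)]) % P for y in (0, 1)}
q1 = {y: (q[(y, 0)] * q[(y, 1)]) % P for y in (0, 1)}

before = sum(p[k] * inv(q[k]) for k in p) % P
after = (p1[0] * inv(q1[0]) + p1[1] * inv(q1[1])) % P
assert before == after  # the layer preserves the sum of fractions
```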

The General Case

The circuit

The circuit we are interested in is composed of $\mu$ layers, with the $\mu$-th layer being the input layer, located at the bottom. The input layer feeds into the layer above it, i.e., layer $\mu-1$, and this goes on until one reaches the $1$-st layer. More precisely:

  1. For the $\mu$-th layer, there are $2^{\mu+1}$ input wires with the values $\lbrace p_{\mu}(x)\rbrace_{x\in\lbrace 0,1\rbrace^{\mu}}$ and $\lbrace q_{\mu}(x)\rbrace_{x\in\lbrace 0,1\rbrace^{\mu}}$.
  2. For the $\kappa$-th layer, there are $2^{\kappa+1}$ output wires $\lbrace p_{\kappa}(x)\rbrace_{x\in\lbrace 0,1\rbrace^{\kappa}}$ and $\lbrace q_{\kappa}(x)\rbrace_{x\in\lbrace 0,1\rbrace^{\kappa}}$ where $$p_{\kappa}(x) = p_{\kappa + 1}(x, 0)\cdot q_{\kappa + 1}(x, 1) + p_{\kappa + 1}(x, 1)\cdot q_{\kappa + 1}(x, 0)$$ and $$q_{\kappa}(x) = q_{\kappa + 1}(x, 0)\cdot q_{\kappa + 1}(x, 1)$$
  3. The first layer has output wires $p_1(0), p_1(1), q_1(0), q_1(1)$.

Notice that we can associate with each layer the following two MLEs $$p_{\kappa}(x) = \sum_{y\in\lbrace 0,1\rbrace^{\kappa}} \mathsf{EQ}(x, y)\cdot\left( p_{\kappa + 1}(y, 0)\cdot q_{\kappa + 1}(y, 1) + p_{\kappa + 1}(y, 1)\cdot q_{\kappa + 1}(y, 0)\right)$$ $$q_{\kappa}(x) = \sum_{y\in\lbrace 0,1\rbrace^{\kappa}} \mathsf{EQ}(x, y)\cdot\left( q_{\kappa + 1}(y, 0)\cdot q_{\kappa + 1}(y, 1) \right)$$ and in particular we can make sense of evaluating $p_{\kappa}$ and $q_{\kappa}$ at tuples over $\mathbb{F}_q$.
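A sketch of evaluating the whole circuit from the input layer up, using the layer rule above; the index convention (last variable is the low-order bit) and the input data are illustrative choices.

```python
# Evaluate the layered fraction-summing circuit bottom-up for mu = 3.
P = 2**31 - 1
def inv(x): return pow(x, P - 2, P)

# input tables p_mu, q_mu of size 2^3; by construction sum p_i/q_i = 0
p = [1, 2, 3, 4, -1 % P, -2 % P, -3 % P, -4 % P]   # numerators
q = [5, 6, 7, 8, 5, 6, 7, 8]                        # denominators
expected = sum(p[i] * inv(q[i]) for i in range(8)) % P

# convention: the last variable is the low-order bit, so (x, 0) and (x, 1)
# are the adjacent entries 2x and 2x+1
while len(p) > 2:
    p = [(p[2*x] * q[2*x+1] + p[2*x+1] * q[2*x]) % P for x in range(len(p) // 2)]
    q = [(q[2*x] * q[2*x+1]) % P for x in range(len(q) // 2)]

# output layer: p_1(0), p_1(1), q_1(0), q_1(1)
final = (p[0] * inv(q[0]) + p[1] * inv(q[1])) % P
assert final == expected == 0
assert (p[0] * q[1] + p[1] * q[0]) % P == 0  # cross-multiplied form of the check
```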

The protocol

The GKR protocol works from the output layer backwards, essentially by reducing a claim about the outputs of one layer to a similar claim on the outputs of the layer feeding into it. This continues until one reaches the input layer (i.e., layer $\mu$), at which point the verifier makes a query to the oracles it received at the beginning of the interaction.

More precisely:

  1. The prover sends commitments to the MLEs of $p_{\mu}$ and $q_{\mu}$ as well as the values of the output wires $p_1(0), p_1(1), q_1(0), q_1(1)$.
  2. The verifier checks that $p_1(0)\cdot q_1(1) + p_1(1)\cdot q_1(0) = 0$ and that $q_1(0)\neq 0$ and $q_1(1) \neq 0$.
    1. Round 1:
      1. Verifier samples a random $r_1 = \gamma_1$, sends it to the prover and they both reduce the original claim to a claim on $p_1(r_1), q_1(r_1)$ where $p_1(r_1) = p_1(\gamma_1) = (1 - \gamma_1)\cdot p_1(0) + \gamma_1\cdot p_1(1)$ and $q_1(r_1) = q_1(\gamma_1) = (1 - \gamma_1)\cdot q_1(0) + \gamma_1\cdot q_1(1)$.
    2. Round $\kappa$ for $2 \leq \kappa \leq \mu - 1$:
      1. The prover engages in a sum-check protocol to convince the verifier that the claims $p_{\kappa}(r_{\kappa})$ and $q_{\kappa}(r_{\kappa})$ satisfy $$p_{\kappa}(r_{\kappa}) = \sum_{y\in\lbrace 0,1\rbrace^{\kappa}} \mathsf{EQ}(r_{\kappa}, y)\cdot\left( p_{\kappa + 1}(y, 0)\cdot q_{\kappa + 1}(y, 1) + p_{\kappa + 1}(y, 1)\cdot q_{\kappa + 1}(y, 0)\right)$$ and $$q_{\kappa}(r_{\kappa}) = \sum_{y\in\lbrace 0,1\rbrace^{\kappa}} \mathsf{EQ}(r_{\kappa}, y)\cdot\left( q_{\kappa + 1}(y, 0)\cdot q_{\kappa + 1}(y, 1) \right)$$
      2. The sum-check protocol itself will reduce the claim in the previous point to an evaluation claim on the MLEs of layer $\kappa +1$, namely a claim on $$p_{\kappa + 1}(\rho_1,\dots,\rho_{\kappa}, 0), p_{\kappa + 1}(\rho_1,\dots,\rho_{\kappa}, 1), q_{\kappa + 1}(\rho_1,\dots,\rho_{\kappa}, 0), q_{\kappa + 1}(\rho_1,\dots,\rho_{\kappa}, 1)$$
      3. The verifier samples a random $\gamma_{\kappa+1}$, sends it to the prover, and they both reduce the previous claim to the new claim $$p_{\kappa + 1}(r_{\kappa + 1}), q_{\kappa + 1}(r_{\kappa + 1})$$ where $$r_{\kappa + 1} := \left( \rho_1,\dots,\rho_{\kappa}, \gamma_{\kappa+1} \right)$$
    3. At this point the verifier is left with two claims $$p_{\mu}(r_{\mu}), q_{\mu}(r_{\mu})$$ and it just queries the oracles $p_{\mu}$ and $q_{\mu}$ at $r_{\mu}$ to check them.

In the context of our basic example of LogUp, the last point entails a claim on $t, m, f, w$ at $(r_{\mu,1},\dots, r_{\mu,\mu})$ since $$p_{\mu}(r_{\mu}) = (1-r_{\mu,0})\cdot m(r_{\mu,1},\dots, r_{\mu,\mu}) - r_{\mu,0}\cdot f(r_{\mu,1},\dots, r_{\mu,\mu})$$ and $$q_{\mu}(r_{\mu}) = (1-r_{\mu,0})\cdot (\alpha + t)(r_{\mu,1},\dots, r_{\mu,\mu}) + r_{\mu,0}\cdot (\alpha + w)(r_{\mu,1},\dots, r_{\mu,\mu})$$ where we used $r_{\mu} := (r_{\mu,0}, r_{\mu,1},\dots, r_{\mu,\mu}).$

Simulating a Multi-linear Polynomial Commitment scheme

The remaining question now is implementing the final oracle query by the verifier at the random point resulting from the GKR protocol interaction. To be more precise, suppose $t$ is a column of the main trace; then the question is how the prover can compute $t(r_0,\dots,r_{\nu - 1})$ and convince the verifier that it has done so honestly. First, let's go back to the dual view of $t$, i.e., both as a uni-variate and as a multivariate polynomial. Suppose $\mathbb{H} := \lbrace g^{i}: i\in\lbrace 0,\dots ,2^{\nu}-1\rbrace\rbrace$ for a primitive $2^{\nu}$-th root of unity $g$. This means that for any $y = g^i\in\mathbb{H}$ we have $$y = g^i = g^{i_0 + i_1\cdot 2 + \dots + i_{\nu-1}\cdot 2^{\nu - 1}}$$ which we can use to define the mapping (with notational overload) $$t(i_0,\dots ,i_{\nu -1}) = t(g^i) = t(y)$$ This then gives meaning to $$t(r_0,\dots ,r_{\nu -1}) = \sum_{(i_0,\dots , i_{\nu - 1})\in\lbrace 0, 1\rbrace^{\nu}} \mathsf{EQ}\left((r_0,\dots ,r_{\nu -1}), (i_0,\dots , i_{\nu - 1})\right) t\left(g^{i_0 + i_1\cdot 2 + \dots + i_{\nu-1}\cdot 2^{\nu - 1}}\right)$$ where $$\mathsf{EQ}((r_0,\dots ,r_{\nu -1}), (i_0,\dots , i_{\nu - 1})) = \prod_{j=0}^{\nu - 1}\left(r_j\cdot i_j + (1-r_j)\cdot(1- i_j)\right)$$ Thus, $t(r_0,\dots ,r_{\nu -1})$ is the inner product of two vectors and hence, given oracle access to the (uni-variate) $t$ and to $\mathsf{c} := (c_i)_{0\leq i\leq 2^{\nu} - 1}$ where $$c_i := \mathsf{EQ}\left((r_0,\dots ,r_{\nu -1}), (i_0,\dots , i_{\nu - 1})\right)$$ for $i=i_0 + i_1\cdot 2 + \dots + i_{\nu-1}\cdot 2^{\nu - 1}$, the verifier can ascertain $t(r_0,\dots ,r_{\nu -1})$ using a uni-variate sum-check protocol. This can be done either as part of the auxiliary trace, at the cost of an extra column and a quadratic transition constraint, or outside using an additional protocol as in, e.g., Aurora.
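The inner-product view can be tested directly: the vector of $\mathsf{EQ}$ weights dotted with the column equals the MLE evaluation. Toy field and arbitrary values below.

```python
# t(r_0,...,r_{nu-1}) as the inner product <t, c> with c_i = EQ(r, bits(i)).
P = 2**31 - 1

def eq_weights(r):
    """Lagrange kernel weights c_i = EQ(r, bits(i)), low bit of i first."""
    nu = len(r)
    c = []
    for i in range(1 << nu):
        w = 1
        for j in range(nu):
            bit = (i >> j) & 1
            w = w * (r[j] * bit + (1 - r[j]) * (1 - bit)) % P
        c.append(w)
    return c

t = [10, 20, 30, 40, 50, 60, 70, 80]  # a trace column of length 2^3
r = [111, 222, 333]                    # the GKR evaluation point

c = eq_weights(r)
inner = sum(ti * ci for ti, ci in zip(t, c)) % P

# cross-check against direct MLE evaluation by variable folding
vals = list(t)
for rj in r:
    vals = [(vals[2*i] * (1 - rj) + vals[2*i + 1] * rj) % P
            for i in range(len(vals) // 2)]
assert inner == vals[0]
```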

The Lagrange kernel oracle

As part of the auxiliary trace, and given the random values $(r_0,\dots ,r_{\nu -1})$, the column $\mathsf{c}$ satisfies the following set of constraints:

  1. It starts with $c_0 = (1 - r_0)\dots (1-r_{\nu-1})$.
  2. For each ${\kappa}\in\lbrace 1,\dots, \nu\rbrace$:
    1. Let $\mathbb{I}_{\kappa} := \lbrace j\cdot 2^{{\kappa}}: 0\leq j < 2^{\nu - {\kappa}} \rbrace$
    2. For each $i\in\mathbb{I}_{\kappa}$, $c_{i + 2^{\kappa-1}} =\frac{r_{\kappa-1}}{1-r_{\kappa-1}}\cdot c_i$

The above can be encoded using:

  1. The boundary constraint $c(X) - (1 - r_0)\dots (1-r_{\nu-1}) = 0$ at $X = 1$, i.e., on the sub-group $\mathbb{H}_0 := \lbrace 1\rbrace$.
  2. Transition constraints:
    1. Define $\mathbb{H}_{\kappa} = \lbrace g_{\kappa}^i: i\in\lbrace 0,\dots,2^{\kappa} - 1\rbrace\rbrace$ where $g_{\kappa} = g^{2^{\nu-\kappa}}$ and $g$ is as before, a primitive $2^\nu$-th root of unity.
    2. For each ${\kappa}\in\lbrace 1,\dots, \nu\rbrace$, $$r_{\nu - \kappa}\cdot c(X) - (1 -r_{\nu - \kappa})\cdot c(g_{\kappa}\cdot X) = 0$$ holds on $\mathbb{H}_{\kappa - 1}$.
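The constraints above determine the whole column from $c_0$: a sketch that builds $\mathsf{c}$ by applying them (from $\kappa = \nu$ down to $1$, so every $c_i$ on the right-hand side is already filled) and compares against $\mathsf{EQ}$ computed directly. The $r$-values are arbitrary.

```python
# Build the Lagrange kernel column c from the boundary value and the
# doubling constraints, then check it against EQ directly.
P = 2**31 - 1
def inv(x): return pow(x, P - 2, P)

nu = 3
r = [17, 29, 43]  # (r_0, r_1, r_2)

c = [0] * (1 << nu)
c[0] = 1
for rj in r:  # boundary: c_0 = prod_j (1 - r_j)
    c[0] = c[0] * (1 - rj) % P

# constraints: for kappa = nu..1, c_{i + 2^(kappa-1)} = r_{kappa-1}/(1-r_{kappa-1}) * c_i
for kappa in range(nu, 0, -1):
    step = 1 << (kappa - 1)
    ratio = r[kappa - 1] * inv((1 - r[kappa - 1]) % P) % P
    for i in range(0, 1 << nu, 1 << kappa):  # i ranges over I_kappa
        c[i + step] = c[i] * ratio % P

# direct check: c_i == EQ(r, bits(i)), low bit of i first
for i in range(1 << nu):
    w = 1
    for j in range(nu):
        bit = (i >> j) & 1
        w = w * (r[j] * bit + (1 - r[j]) * (1 - bit)) % P
    assert c[i] == w
```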

Description of the full protocol

  1. The prover sends commitments to $m, f, t, w$.
  2. The verifier sends the prover a random challenge $\alpha$.
  3. The prover uses $\alpha, m, f, t$ and $w$ to evaluate the circuit in the GKR protocol and it responds with the output of the circuit i.e., $p_1(0), p_1(1), q_1(0), q_1(1)$.
  4. The verifier checks that $q_1(0) \neq 0$, $q_1(1) \neq 0$ and $p_1(0)\cdot q_1(1) + p_1(1)\cdot q_1(0) = 0$.
  5. The verifier draws a random $r_1=\gamma_1$ and sends it to the prover. Both parties compute the new claim $(q_1(r_1), p_1(r_1))$ (about the output of the circuit) as $p_1(r_1) = (1 - \gamma_1)\cdot p_1(0) + \gamma_1\cdot p_1(1)$ and $q_1(r_1) = (1 - \gamma_1)\cdot q_1(0) + \gamma_1\cdot q_1(1)$.
  6. The prover and verifier engage in a sequence of sum-checks which culminate in a final claim $(p_{\nu + 1}(r_{\nu + 1}), q_{\nu + 1}(r_{\nu + 1}))$, where $r_{\nu + 1}$ is the randomness generated during the course of the final sum-check.
  7. The prover will provide, as part of the last step of the final sum-check, the evaluations of the "original" oracles at $(r_{\nu,1},\dots ,r_{\nu , \nu - 1}, 0)$ and $(r_{\nu,1},\dots ,r_{\nu , \nu - 1}, 1)$ and the verifier will make sure that they satisfy the last check in the final sum-check protocol.
  8. The prover constructs the column $\mathsf{c} = \left(c_i\right)_{0\leq i \leq 2^{\nu} - 1}$ representing the Lagrange kernel oracle at $(r_{\nu,1},\dots ,r_{\nu , \nu })$ and sends its commitment to the verifier.
  9. The verifier draws a random $\lambda$ and sends it to the prover, who then proves to the verifier that $$l(x_0,\dots, x_{\nu-1}) := m(x_0,\dots, x_{\nu-1}) + \lambda f(x_0,\dots, x_{\nu-1}) + \lambda^2 t(x_0,\dots, x_{\nu-1}) + \lambda^3 w(x_0,\dots, x_{\nu-1})$$ evaluates to $$m(r_{\nu,1},\dots ,r_{\nu , \nu }) + \lambda f(r_{\nu,1},\dots ,r_{\nu , \nu }) + \lambda^2 t(r_{\nu,1},\dots ,r_{\nu , \nu }) + \lambda^3 w(r_{\nu,1},\dots ,r_{\nu , \nu })$$ at $(r_{\nu,1},\dots ,r_{\nu , \nu })$ using the Lagrange kernel oracle $\mathsf{c}$ and the uni-variate sum-check protocol.

Next Steps

The main points that need further consideration are:

  1. Understanding the costs associated with the Lagrange kernel.
  2. The optimal way to implement the ultimate sum-check.
  3. Handling additional buses.

These will be discussed in future separate issues.

[^1]: Log-uniform circuits, i.e., roughly, circuits with a succinct description.

[^2]: Fact 3.5 in Proofs, Arguments, and Zero-Knowledge (PAZK) by Justin Thaler.

[^3]: Lemma 3.6 in PAZK.

plafer commented 7 months ago

EDIT: I adjusted the section after discussing with @Al-Kindi-0 (which cleared up the remaining misunderstanding). It is now dedicated to explaining the "why" behind the Lagrange kernel's transition constraints as described here.

I'd like to zoom in on the Lagrange kernel transition constraints, as it took me a bit of time to understand what was going on, and would like to make it easier for others.

This writeup will use a motivating example with $\upsilon = 3$ (i.e. an 8-element trace). We will also use the notation $r = (r_2, r_1, r_0)$.

Problem recap

The so-called "Lagrange kernel" is the above-mentioned function

$$\mathrm{EQ}(x, y) = \prod_{i=0}^{\upsilon-1} x_i y_i + (1-x_i)(1-y_i).$$

We want to create a column $c$ such that $c_i = \mathrm{EQ}(i, r)$ for some random $r$.

Naive constraints

To ensure that $c$ is properly built, we need 8 constraints:

$$ \begin{align} c_0 &= \mathrm{EQ}(0, r) = (1-r_2)(1-r_1)(1-r_0) \\ c_1 &= \mathrm{EQ}(1, r) = (1-r_2)(1-r_1)r_0 \\ c_2 &= \mathrm{EQ}(2, r) = (1-r_2)r_1(1-r_0) \\ c_3 &= \mathrm{EQ}(3, r) = (1-r_2)r_1r_0 \\ c_4 &= \mathrm{EQ}(4, r) = r_2(1-r_1)(1-r_0) \\ c_5 &= \mathrm{EQ}(5, r) = r_2(1-r_1)r_0 \\ c_6 &= \mathrm{EQ}(6, r) = r_2r_1(1-r_0) \\ c_7 &= \mathrm{EQ}(7, r) = r_2r_1r_0 \end{align} $$

The first constraint is a boundary constraint. Since this note focuses on the transition constraints, we will only look at the last 7.

Rewriting the naive transition constraints

Notice that we can rewrite the naive transition constraints in terms of one another:

$$ c_1 = \frac{r_0}{1-r_0}c_0 $$

$$ c_3 = \frac{r_0}{1-r_0}c_2 $$

$$ c_5 = \frac{r_0}{1-r_0}c_4 $$

$$ c_7 = \frac{r_0}{1-r_0}c_6 $$

$$ c_2 = \frac{r_1}{1-r_1}c_0 $$

$$ c_6 = \frac{r_1}{1-r_1}c_4 $$

$$ c_4 = \frac{r_2}{1-r_2}c_0 $$

Note that there is more than one way to write these constraints; any two rows can be written in terms of one another if their indices differ by 1 bit (i.e., they have a Hamming distance of 1). We chose this way specifically to serve the following sections.
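The seven rewritten constraints can be verified mechanically against the naive $\mathrm{EQ}$ table (bit $j$ of the row index pairs with $r_j$); the $r$-values below are arbitrary.

```python
# Check plafer's rewritten transition constraints against the naive EQ table
# for upsilon = 3, over a toy prime field.
P = 2**31 - 1
def inv(x): return pow(x, P - 2, P)

r = [7, 11, 13]  # (r_0, r_1, r_2)
c = []
for i in range(8):
    w = 1
    for j in range(3):
        bit = (i >> j) & 1
        w = w * (r[j] * bit + (1 - r[j]) * (1 - bit)) % P
    c.append(w)

ratio = lambda j: r[j] * inv((1 - r[j]) % P) % P
# the seven rewritten constraints, in the order listed above
assert c[1] == c[0] * ratio(0) % P
assert c[3] == c[2] * ratio(0) % P
assert c[5] == c[4] * ratio(0) % P
assert c[7] == c[6] * ratio(0) % P
assert c[2] == c[0] * ratio(1) % P
assert c[6] == c[4] * ratio(1) % P
assert c[4] == c[0] * ratio(2) % P
```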

Reminder: group and subgroup generators

Before we introduce the core idea of this note (i.e. using constraint enforcement domains to reduce the number of constraints), I think it's useful to review how generators of subgroups are built.

The VM uses the group of size $2^{64} - 2^{32} + 1$. This group has a subgroup of size $2^{32}$. In the lingo, we say that the group has a "two adicity of 32"; more on this in this note. As explained in the note, by Lagrange's theorem, our group also has subgroups of size $2^1$, $2^2$, ... $2^{31}$.

These groups are cyclic, meaning that they can be generated by a single element. For our original group, the generator is $7$ (defined here in winterfell). We'll care more about the generator of the subgroup of size $2^{32}$. To be concrete, its value is $g = 7277203076849721926$ (defined here in winterfell). Throughout this note, we will use g to denote the generator of the subgroup of size $2^{32}$.

We also know how to compute the generator of all other smaller subgroups (of size $2^2$, $2^3$, ..., $2^{31}$), as explained in the previously-mentioned note. In winterfell, these are computed here. We will call these generators $g_s$ for size $s$.

Specifically, for the purposes of our example, we will only need the generator for the subgroup of size 8 (the length of our trace):

$$ g_8 = g^{2^{29}} $$

Introducing constraint enforcement domains

A key identity in STARKs is $$x^n - 1 = \prod_{i=0}^{n-1} (x - h^i),$$ where $h$ is the generator of the multiplicative subgroup of size $n$ (a power of 2). To make a long story short, if the divisor of a transition constraint is $x^n - 1$, then the transition constraint is enforced on the domain points $\{h^0, h^1, ..., h^{n-1}\}$. We will make use of this identity to lower the number of constraints to $\log(n)$ (where $n$ is the length of the trace).

As a reminder, the transition constraint formula is as follows (rewritten slightly for readability). For $\kappa \in \{1, ..., \upsilon \}$,

$$ c(h^{2^{\upsilon - \kappa}} x) = \frac{r_{\upsilon - \kappa}}{1 - r_{\upsilon - \kappa}} c(x) $$

for $$x \in \{ h^{j \cdot 2^{\upsilon - \kappa + 1}} : 0 \leq j < 2^{\kappa - 1}\},$$ i.e., $x$ ranges over the subgroup of size $2^{\kappa - 1}$, where $h$ is the generator of the trace-length subgroup (of size $2^{\upsilon}$).

Let's build our intuition of why this works by going back to our $\upsilon = 3$ example. We have 3 constraints over the following domains:

$$ \begin{align} \kappa = 1 & \rightarrow \{(g_8)^0 \}, \\ \kappa = 2 & \rightarrow \{(g_8)^0, (g_8)^4 \}, \\ \kappa = 3 & \rightarrow \{(g_8)^0, (g_8)^2, (g_8)^4, (g_8)^6 \} \end{align} $$

These domains are derived as described in the previous section.

Let's evaluate the transition constraint formula for $\kappa=1$, where the enforcement domain is $\{(g_8)^0 \}$. If we plug in to the formula $x=(g_8)^0$ and $\kappa=1$, we get

$$ c_4 = \frac{r_2}{1-r_2} c_0 $$

Notice that this is the 5th constraint defined in the previous section!

For $\kappa=2$, the enforcement domain is $\{(g_8)^0, (g_8)^4 \}$. If we plug in to the formula $x=(g_8)^0$ and $\kappa=2$, we get

$$ c_2 = \frac{r_1}{1-r_1} c_0 $$

and with $x=(g_8)^4$, we get

$$ c_6 = \frac{r_1}{1-r_1} c_4 $$

our 3rd and 7th constraints!

For $\kappa=3$, the enforcement domain is $\{(g_8)^0, (g_8)^2, (g_8)^4, (g_8)^6 \}$; I'll leave it as an exercise to show that we recover our last 4 transition constraints.
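Working by row index (row $i$ corresponds to $(g_8)^i$), the three enforcement domains can be enumerated and checked to reproduce exactly the seven rewritten constraints; same toy field, arbitrary $r$.

```python
# Enumerate the constraint enforcement domains for upsilon = 3 by row index
# and check each (domain point, shifted point) pair against the EQ table.
P = 2**31 - 1
def inv(x): return pow(x, P - 2, P)

ups = 3
r = [7, 11, 13]  # (r_0, r_1, r_2)
c = []
for i in range(8):
    w = 1
    for j in range(3):
        bit = (i >> j) & 1
        w = w * (r[j] * bit + (1 - r[j]) * (1 - bit)) % P
    c.append(w)

pairs = []
for kappa in range(1, ups + 1):
    shift = 1 << (ups - kappa)        # multiplying by g^(2^(ups-kappa)) adds this
    domain = range(0, 8, 2 * shift)   # the 2^(kappa-1) domain points
    rj = r[ups - kappa]
    for i in domain:
        # constraint: c[i + shift] = r_j/(1 - r_j) * c[i]
        assert c[i + shift] == c[i] * rj % P * inv((1 - rj) % P) % P
        pairs.append((i, i + shift))

assert len(pairs) == 7  # exactly the seven naive transition constraints
```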

Al-Kindi-0 commented 7 months ago

Sometimes it may be useful to test summations different from $0$. This means that instead of focusing on the relation $$\sum_{z\in\lbrace 0, 1\rbrace^{\nu}}\sum_{x\in\lbrace 0, 1\rbrace^{\mu}} \frac{f\left(z, p_0(x),\dots,p_v(x)\right)}{g\left(z, p_0(x),\dots,p_v(x)\right)} = 0$$ we look at $$\sum_{z\in\lbrace 0, 1\rbrace^{\nu}}\sum_{x\in\lbrace 0, 1\rbrace^{\mu}} \frac{f\left(z, p_0(x),\dots,p_v(x)\right)}{g\left(z, p_0(x),\dots,p_v(x)\right)} = C$$

Examples of these are the virtual tables of stack overflow and kernel procedures. In the first example, the VM can accommodate programs which start with more than 16 field elements by initializing the stack overflow virtual table with these elements. Since this initialization is a function only of the (overflow) stack elements and their positions on the stack, the verifier can initialize the virtual table on its own and using randomness, say $\alpha$, collapse it into a single value, say $C\left(\alpha\right)$. For this more general relation, the analysis in the original post goes through unaltered except for the change to the summation value.

More precisely:

  1. The prover sends commitments to the main trace columns.
  2. The verifier sends the challenge vector $\alpha$ (to reduce global constraints to local constraints).
  3. Both the prover and the verifier compute the expected sum $C(\alpha)$.
  4. The prover engages with the verifier to convince him that $$\sum_{z\in\lbrace 0, 1\rbrace^{\nu}}\sum_{x\in\lbrace 0, 1\rbrace^{\mu}} \frac{f\left(z, p_0(x),\dots,p_v(x)\right)}{g\left(z, p_0(x),\dots,p_v(x)\right)} = C(\alpha)$$ using the GKR protocol for sums of fractions.
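A toy version of a bus with a non-zero expected sum: the verifier computes $C(\alpha)$ from public initialization values, and the prover's fractional sum over the whole trace must match it; the row values and the matched insert/remove pair below are illustrative, not actual VM data.

```python
# Bus with non-zero expected sum C(alpha), over a toy prime field.
P = 2**31 - 1
def inv(x): return pow(x, P - 2, P)

alpha = 424242
init_rows = [9, 14, 23]  # e.g. public overflow-stack initialization values
C = sum(inv((alpha + v) % P) for v in init_rows) % P  # verifier-computed

# prover side: the same fractions plus a matched insert/remove pair that
# cancels, so the total over the whole trace still equals C
numerators = [1, 1, 1, 1, -1 % P]
denominators = [(alpha + v) % P for v in init_rows + [77, 77]]
total = sum(n * inv(d) for n, d in zip(numerators, denominators)) % P
assert total == C
```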

Note that, in the case of a universal bus, $C(\alpha)$ will in fact be the sum of many individual $C_i(\alpha)$ each corresponding to a particular bus i.e., in the case of the above examples, we will have $C_0(\alpha)$ for the overflow stack virtual table (bus), $C_1(\alpha)$ for the kernel procedures virtual table (bus) and $C(\alpha) = C_0(\alpha) + C_1(\alpha)$.