Permutation check syntax

0xPolygonMiden / air-script

A domain-specific language for writing AIR constraints for STARKs

MIT License

75 stars 11 forks source link

Permutation check syntax #183

Open bobbinth opened 1 year ago

bobbinth commented 1 year ago

Frequently, when describing constraints for multiset checks, we have expressions like this:

$$ p_0' \cdot (a \cdot f_0 + b \cdot f_1 + (c + d) \cdot f_2 + 1 - (f_0 + f_1 + f_2)) = p_0 \cdot (x \cdot f_3 + y \cdot f_4 + 1 - (f_3 + f_4)) $$

Where $p_0$ is the running product column, $f_0, ..., f_4$ are binary selector flags, $a, b, c, d$, are values which are to be "divided out" of $p_0$ when relevant flags are $1$, and $x, y$ are values which are to be multiplied into $p_0$ when relevant flags are $1$. Flags on the left side are usually mutually exclusive and flags on the right side are mutually exclusive as well.

The idea is that the left-hand-side of this expression reduces as follows:

Flag	Expression
$f_0 = 1$	$p_0' \cdot a$
$f_1 = 1$	$p_0' \cdot b$
$f_2 = 1$	$p_0' \cdot (c + d)$

And analogously, on the right-hand side the expression reduces as follows:

Flag	Expression
$f_3 = 1$	$p_0 \cdot x$
$f_4 = 1$	$p_0 \cdot y$

The expression at the very top looks rather complex and error prone - so, I wonder if we can introduce special syntax to make such expressions simpler. For example:

enf join p0:
  (a, b, c + d) when (f0, f1, f2)
  on
  (x, y) when (f3, f4)

Or maybe something like this is better for cases when instead of $a$, $b$, $c$ etc. we want to use more complex expressions:

enf join p0:
  a when f0
  b when f1
  c + d when f2
  on
  x when f3
  y when f4

damip commented 7 months ago

What would you think of making the existing match behave not only like a constraint selector when used with constraints like it is done currently:

enf match {
    case f0: p0' * a = c,
    case f1: p0' * b = d,
};

but also like a value when used with values in its branches:

enf g' =  d + match {
    case f0: p0' * a,
    case f1: p0' * b,
} + f;

which solves the issue at hand in an intuitive way without adding new keywords and syntaxes:

enf match {
    case f0: p0' * a,
    case f1: p0' * b,
    case f2: p0' * (c+d)
} = match {
   case f3: p0 * x,
   case f4: p0 * y
};

which can be factored further:

enf match p0' * {
    case f0: a,
    case f1: b,
    case f2: c+d
} = p0 * match {
   case f3: x,
   case f4: y
};

In that case, mixing values and constraints in the case branches of a single match should be forbidden by the parser.

Note that the proposed behavior reminds the one used by Rust: https://doc.rust-lang.org/rust-by-example/flow_control/match.html

bobbinth commented 6 months ago

@damip - thank you for writing this up (and sorry for such a delayed reply)! The suggestions make sense, but with the changes we are working though in https://github.com/0xPolygonMiden/miden-vm/pull/1307, I'm thinking we might make more radical changes to the AirScript.

Specifically, we can define a concept of a "universal bus" and this bus would contain all tables/buses for a given AIR. On the front end of AirScript, we'd remove the concept of auxiliary traces and random values, and make a table construct a first-class citizen.

Here is a preliminary example of how this might look like:

def ExampleAir

trace_columns: [
    s, a, b, c
]

public_inputs {
    stack_inputs: [16],
    stack_outputs: [16],
    stack_overflow: [[2]],
}

periodic_columns {
    k0: [1, 1, 1, 1, 1, 1, 1, 0]
}

tables {
    p: [[2]]
}

boundary_constraints {
    # define boundary constraints against the main trace at the first row of the trace.
    enf a.first = stack_inputs[0]
    enf b.first = stack_inputs[1]
    enf c.first = stack_inputs[2]

    # define boundary constraints against the main trace at the last row of the trace.
    enf a.last = stack_outputs[0]
    enf b.last = stack_outputs[1]
    enf c.last = stack_outputs[2]

    # table p must be empty at the start of the execution
    enf p.first = null

    # at the end of the execution, table p must be equal to the specified public table
    enf p.last = stack_overflow
}

integrity_constraints {
    # the selector must be binary.
    enf s^2 = s

    # selector should stay the same for all rows of an 8-row cycle.
    enf k0 * (s' - s) = 0

    # c = a + b when s = 0.
    enf (1 - s) * (c - a - b) = 0

    # c = a * b when s = 1.
    enf s * (c - a * b) = 0

    # add a row to the table when s == 1
    p.add(a, b) when s

    # remove a row from the table when s == 0
    p.rem(c, d) when (1 - s)
}

To describe the most relevant parts of the above example in more detail:

First, we introduce tables section. Here we can define one or more tables like so:

tables {
    p0: [[2]],
    p1: [[3]],
}

The above defines two tables: one with 2 columns and the other one with 3 columns. We could also optionally specify a domain value for each table, but I'll omit it for now for brevity.

Then, in the boundary constraints section, we can specify boundary constraints for the defined tables. These would basically define values with which the table could start/end. Here is an example:

boundary_constraints {
    enf p.first = null
    enf p.last = null
}

The above states that the table must be empty at the start and at the end of the execution trace.

In some situations, it may be useful to have non-empty tables at start/end of the execution trace. Following the syntax discussed in #143, we can do something like this:

public_inputs: {
  stack_overflow_init: [[3]]
  stack_overflow_final: [[3]]
}

boundary_constraints {
    enf p.first = stack_overflow_init
    enf p.last = stack_overflow_final
}

Finally, in the integrity constraints section, we can add or remove rows from the defined tables like so:

integrity_constraints {
    # add a row to the table when s == 1
    p.add(a, b) when s

    # remove a row from the table when s == 0
    p.rem(c, d) when (1 - s)
}

This is just a proposed syntax and we could have some variations on it. For example, maybe we should still use enf keyword and then it could look something like this:

enf add (a, b) into p when s
enf rem (c, d) from p when 1 - s

The when clause is optional and also we may want to have a way to specify multiplicities rather than just binary selectors. For example: p.add(a, b) times s would be identical to p.add(a, b) when s for $s$ values $0$ and $1$, but could also extend to $s$ values grater than $1$ (though, I don't love the times keyword).

The above describes just the front-end changes. How this will affect the IR (i.e., the AlgebraicGraph) is still something that needs to be thought through.

cc @Al-Kindi-0 and @plafer.

Al-Kindi-0 commented 5 months ago

The proposal for the front-end looks good, maybe some terminology change might be a bit more helpful but it is OK as a work-in-progress. Regarding the back-end and especially with regards to the AlgebraicGraph, I think it might be helpful to see what will be needed to support a LogUp-GKR based universal bus to connect all components and accommodate look-ups.

To this end, we can look at the smallest example which exposes AFAICS all the novelties of this approach. Using the syntax above, consider this:

integrity_constraints {
    p.add(a) when s
    p.add(b) when t

    p.rem(c) when (1 - s)
    p.rem(d) when (1 - t)
}

In the language of LogUp, this will translate to the following mathematical equation:

$$ \sum_{i = 0}^n \frac{s(i)}{\alpha_0 + \alpha_1\cdot a(i)} + \frac{t(i)}{\alpha_0 + \alpha1\cdot b(i)} = \sum{i = 0}^n \frac{1 - s(i)}{\alpha_0 + \alpha_1\cdot c(i)} + \frac{1 - t(i)}{\alpha_0 + \alpha_1\cdot d(i)} $$

or equivalently

$$ \sum_{i = 0}^n \frac{s(i)}{\alpha_0 + \alpha_1\cdot a(i)} + \frac{t(i)}{\alpha_0 + \alpha_1\cdot b(i)} + \frac{s(i) - 1}{\alpha_0 + \alpha_1\cdot c(i)} + \frac{t(i) - 1}{\alpha_0 + \alpha_1\cdot d(i)} = 0 $$

where $s(i)$ denotes the i-th entry of column $s$ and similarly for the remaining columns. This is called the LogUp relation[^1] albeit it being only for a simpler multi-set check relation. We therefore call the $\alpha$-s the LogUp randomness. These are the challenges for the LogUp relation sent by the verifier after it receives the main trace commitment.

To check the LogUp relation, we can use GKR and this entails that the verifier will execute, during multiple executions of sum-check, a bunch of polynomial evaluations at some random challenges as well as a bunch of hashing to generate these challenges. There are also some algebra that needs to be done between each of the above sum-checks (i.e., between each consecutive GKR layers) but these are very simple, the most challenging is the final GKR check which encodes the algebra of the bus we are proving. In our case this will look like this:

$$ p_0\cdot q_1+ p_1 \cdot q_0 + \lambda\cdot q_0\cdot q_1 \overset{?}{=} \textsf{claim} $$

$$ \begin{aligned} p_ 0 &:= \left(1 - \rho_0\right)\cdot\textsf{s}+ \rho_0 \cdot \left(\textsf{s} - 1\right) \ p_1 &:= \left(1 - \rho_0\right)\cdot\textsf{t}+ \rho_0 \cdot \left(\textsf{t} - 1\right) \ q_0 &:= \left(1 - \rho_0\right)\cdot\left(\alpha_0 + \alpha_1\textsf{a}\right) + \rho_0 \cdot \left(\alpha_0 + \alpha_1\textsf{c}\right) \ q_1 &:= \left(1 - \rho_0\right)\cdot\left(\alpha_0 + \alpha_1\textsf{b}\right) + \rho_0 \cdot \left(\alpha_0 + \alpha_1\textsf{d}\right) \end{aligned} $$

where:

$\lambda$ is a randomness used by the GKR protocol to batch two sum-checks into one.
$\rho_0$ is part of the randomness generated during the GKR protocol.
$\textsf{s}$ is a value provided by the GKR prover and it is claimed to be equal to $\Sigma_{i = 0}^n s(i)\cdot L_r(i)$, where $L_r$ is the Lagrange kernel at $r := (r0, \cdots, r{\log(n) - 1})$ and $r$ is vector of challenges sampled during the final sum-check of GKR. $\textsf{s}$ can be thought of as the equivalent of an OOD evaluation but for the multi-linear interpretation of the $s$ column.
The same interpretation in the previous point applies to $\textsf{t}$, $\textsf{a}$, $\textsf{b}$, $\textsf{c}$, and $\textsf{d}$.

To complete the proof of the LogUp relation, an additional proof of the correctness of $\textsf{s}$, as well as the other claimed values, is needed. This is where the auxiliary trace comes into play i.e., giving a proof of the correctness of the claimed values. I think this part is pretty self-contained so we can discuss it separately in order to not over-crowd the discussion.

To conclude, the following are some questions and remarks:

It seems to me that, in the context of LogUp-GKR, the "algebra" is self contained. So it might make sense to isolate these at the level of the IR?
We haven't discussed bus initialization, but accommodating this in the above basically means changing $\cdots = 0$ in the LogUp relation to $\cdots = \Sigma_0^m \frac{f(...)}{g(...)}$ where the left-hand side involves algebraic expressions defining the buses and involving the initial values. From the GKR point of view, this will translate to the check $\textsf{CircuitOutput} = \Sigma_0^m \frac{f(...)}{g(...)}$ where $\textsf{CircuitOutput}$ is the output of the circuit evaluating the LogUp sum and is provided by the prover as part of the GKR proof.

[^1]: In the original LogUp paper, only the so-called multiplicity column is used in the numerators, in fact in one and only one numerator per look-up relationship, but this can be easily generalized to denominators being expressions of some trace columns acting as selectors.

bobbinth commented 5 months ago

integrity_constraints {
    p.add(a) when s
    p.add(b) when t

    p.rem(c) when (1 - s)
    p.rem(d) when (1 - t)
}
In the language of LogUp, this will translate to the following mathematical equation:

$$ \sum_{i = 0}^n \frac{s(i)}{\alpha_0 + \alpha_1\cdot a(i)} + \frac{t(i)}{\alpha_0 + \alpha1\cdot b(i)} = \sum{i = 0}^n \frac{1 - s(i)}{\alpha_0 + \alpha_1\cdot c(i)} + \frac{1 - t(i)}{\alpha_0 + \alpha_1\cdot d(i)} $$

I think there one thing missing from the above: we also need to add some unique value to identify table/bus $p$. Denoting this value as $p_{d}$, I think the expression would be:

$$ \sum_{i = 0}^n \frac{s(i)}{\alpha_0 + \alpha_1 \cdot p_d + \alpha_2\cdot a(i)} + \frac{t(i)}{\alpha_0 + \alpha_1 \cdot p_d + \alpha2\cdot b(i)} = \sum{i = 0}^n \frac{1 - s(i)}{\alpha_0 + \alpha_1 \cdot p_d + \alpha_2\cdot c(i)} + \frac{1 - t(i)}{\alpha_0 + \alpha_1 \cdot p_d + \alpha_2\cdot d(i)} $$

This way, if there is another table - e.g., $q$ - we'd use a different domain value for $q$ and this would avoid potential collisions.

It seems to me that, in the context of LogUp-GKR, the "algebra" is self contained. So it might make sense to isolate these at the level of the IR?

Agreed. I'm thinking we can make things quite generic with respect to tables. For example, we could break the constraints related to tables into several parts:

The numerator expressions (we can call these "gates" or "latches").
A set of table column expressions.

The struct representing this could look something like this:

pub struct Bus {
    /// A list of expression indexes in the algebraic graph defining columns of a given bus; these
    /// columns would be "joined" using randomness - i.e., alpha_0, alpha_1 to compute the denominators
    /// of the above expressions.
    columns: Vec<NodeIndex>,

    /// A list of expression indexes in the algebraic graph defining latches for a given bus; these 
    /// latches correspond to s, t, s - 1, and t - 1 expressions above.
    latches: Vec<NodeIndex>,
}

This approach would allow us to simplify the algebraic graph quite a bit. Specifically, it would contain expressions only over the base field (i.e., expressions involving columns from only the main segment). This means that the Value and TraceAccess enums would look like so:

pub enum Value {
    Constant(u64),
    TraceAccess(TraceAccess),
    PeriodicColumn(PeriodicColumnAccess),
    PublicInput(PublicInputAccess),
}

pub struct TraceAccess {
    pub column: TraceColumnIndex,
    pub row_offset: usize,
}

The place to keep track of buses could be the Constraints struct, which could look something like this:

pub struct Constraints {
    boundary_constraints: Vec<ConstraintRoot>,
    integrity_constraints: Vec<ConstraintRoot>,
    busses: Vec<Bus>,
    graph: AlgebraicGraph,
}

(note that we no longer need nested vectors for boundary/integrity constraints as these constraints are now always against the main trace segment).

The above is just a sketch, and it is possible I missed something. Also, it doesn't cover the initialization of tables (I still need to think it through).

plafer commented 5 months ago

I'll comment on the bus/table terminology, which will end up having some ramifications for the syntax towards the end.

As discussed a few times, I believe the bus & virtual table terminology could be improved. Finding the right naming/analogy for permutation checks ultimately will make it easier for users to write scripts, as the properties of the analogy can be used to reason about the properties of the underlying permutation checks. The properties for the analogy that we're after are:

starts in any state
items can be "added" or "removed"
items can be "removed" before they are "added"
must be "empty" at the end of the trace

The "virtual table" abstraction isn't quite right because tables typically don't have rows added or removed, and you certainly can't remove a row before it's been added. Similarly, when thinking about a bus, we typically use "requests" having matching "responses". However, receiving a response before sending the request doesn't really work. Also, we typically think of responses as containing new information not present in a request; in the case of the bus, the "request" contains the value that is typically in the "response", so that doesn't work very well either. Lastly, we have 2 different names (bus and virtual table) for the same abstraction (the permutation check), which starts to break when we say that "virtual tables" all use the "universal bus". Hence, I believe it would be best to find a single abstraction/analogy, and use it everywhere. It would require rewiring how we think about a few subsystems of the VM, but ultimately I think the benefits of using an analogy that has all 4 properties will outweigh the negatives of readjusting how we think about those subsystems.

Since in general, you don't typically "remove" before you "add", I suggest we think more of it in terms of items having an "opposite", and the abstraction will ensure that for every item, its opposite was also added. I was able to come up with 2 examples:

Double-entry bookkeeping ledger
1. The ledger can start with arbitrary numbers in the "credit" and "debit" columns
2. Amounts can be credited or debited
3. Amounts can be debited before they were credited (and vice-versa)
4. At the end of the trace, every amount credited must also have been debited
A double-pan balance scale
1. The scale can start with any number of items of either side
2. Items can be added to the left or to the right
3. Items can be added to the left before the right (and vice-versa)
4. At the end of the trace, each item must be added to both the left and the right (such that the scale is balanced out)

I'll use the ledger analogy moving forward, but the scale would work equivalently. So all virtual tables and buses would become "ledgers". For example, in the case of the block stack table in the decoder, we would "credit" a block when we start executing it, and "debit" that block when we're done. For the chiplets bus, the main trace would "credit" a message (e.g. representing 2 OR 3 = 3), and the bitwise chiplet would "debit" that same message once it is done computing it.

Note that by thinking of it as a ledger, it is immediately clear that the ledger itself (i.e. without any other tricks) isn't enough to guarantee that for the block stack table, every block must be credited before it is debited (since the ledger itself doesn't care about the order of credit/debit). And in the case of the interaction between the main trace and the bitwise chiplet, it is perfectly fine for the chiplet to debit the message before the main trace credits it to the ledger (which in this case, is a property we do want). I believe this makes it much easier to write (and audit!) constraints, as the properties of a real world ledger so closely match the properties of permutation checks (or more specifically a running sum/product column in AIR).

This has a minor ramification for the syntax: we no longer need to have syntax that reminds us of a table (as e.g. [[3]] would). It could be as simple as:

ledgers {
    block_stack: 3
    bitwise_chiplet_comm: 4
    hasher_chiplet_comm: 5
}

where 3,4, and 5 encode the number of Felts that make up a ledger item. But I haven't thought about the syntax as much; I just know that we can go with something less visually heavy than [[3]] while still "making sense".

bobbinth commented 5 months ago

Generally, agree with the above. A few comments:

starts in any state

items can be "added" or "removed"

items can be "removed" before they are "added"

must be "empty" at the end of the trace

The 4th property is actually not a requirement. In many cases it is true - but in some we can end up in an arbitrary state as well. This is, for example, the case with the StackOverflow table which can contain values at the end of the trace if the number of items on the stack is greater than 16 at the end of program execution.

I'll use the ledger analogy moving forward, but the scale would work equivalently. So all virtual tables and buses would become "ledgers".

I agree that table and bus are imperfect terms, but not sure ledger is a much better one as it is a new concept in this context. So, in my current order of preference I'd probably rank them as table, ledger, bus. The reasons why I'd prefer table over ledger:

"table" better captures the multi-column nature of what we are working with.
"lookup table" is also a fairly frequently used term for these things.

Another potential term would be just lookup. Not sure yet how I'd rank it.

ledgers {
    block_stack: 3
    bitwise_chiplet_comm: 4
    hasher_chiplet_comm: 5
}

One thing we may need to add to this is some kind of definition of a "label" or "domain" for each table. Basically, when everything is a part of the same "virtual table", we need to enforce domain separation between logical tables (e.g., not to mix up values from block_stack and `bitwise_chiplet_comm). This could be as simple as adding another implicit "column" to each table which would be set to a constant value. Setting this constant manually could be optional and if not set, we could assume that the constants are monotonically increasing.

For syntax standpoint, this would be quite similar to how enums work in Rust. So, maybe we could change the above to something like:

ledgers {
    block_stack(3) = 1,
    bitwise_chiplet_comm(4) = 2,
    hasher_chiplet_comm(5) = 3,
}

Where 3, 4, 5 indicate the number of columns in each table, and values 1, 2, 3 define the domain for each table.

The user would still be able to write:

ledgers {
    block_stack(3),
    bitwise_chiplet_comm(4),
    hasher_chiplet_comm(5),
}

And this would be equivalent to:

ledgers {
    block_stack(3) = 0,
    bitwise_chiplet_comm(4) = 1,
    hasher_chiplet_comm(5) = 2,
}

Al-Kindi-0 commented 5 months ago

I think the bus abstraction captures many of the existing properties while having the least amount of inconveniences:

To capture some of the quirks mentioned above (e.g., out-of-order sequence of requests-responses) we could name it non-deterministic bus and mention that it is the prover who fills some of the cells that goes into the requests. We can then explain some of the inconsistencies with the usual meaning of requests-responses between a client-server.
Adding a label/domain is equivalent to adding an address to route requests-responses specific to some (sub-)bus in the global bus.
A bus captures seamlessly the idea of sub-components interacting with one component.

To conclude, the construct we are trying to name is a bus in as much as a zkVM is VM and thus importing the analogy would make more sense, in my opinion, as long as we explain the limits of the analogy.

bobbinth commented 5 months ago

So, I guess we have 4 options: table, ledger, bus, lookup and the first 3 have 1 vote each :)

plafer commented 4 months ago

I agree that table and bus are imperfect terms, but not sure ledger is a much better one as it is a new concept in this context

Coming back to this, I agree that ledger isn't all that great for this reason.

"table" better captures the multi-column nature of what we are working with.

However, in the context of a hardware bus, I map the multiple columns to multiple wires/channels. So I think that the bus works quite well with multiple "columns".

"lookup table" is also a fairly frequently used term for these things.

Although this is true, I find referring to "communication between chiplets" using a lookup table to be awkward. And so I find the term works well for lookup tables (with multiplicities), but not the general case.

I think the bus abstraction captures many of the existing properties while having the least amount of inconveniences:

I currently agree with this the most. I would also rank bus as my top choice.

bobbinth commented 4 months ago

OK - bus it is! :)

But now we may need to have a couple variations of busses. The main reason is that, as far as I can tell, in LogUp-GKR context, we can't specify starting/ending value of a bus. So, while something like this would work for regular multiset and LogUp busses

buses {
    p: [[3]],
}

public_inputs {
  stack_overflow_init: [[3]],
  stack_overflow_final: [[3]],
}

boundary_constraints {
    enf p.first = stack_overflow_init;
    enf p.last = stack_overflow_final;
}

It won't work for LogUp-GKR buses.

From the syntax standpoint we could introduce "round" and flat" busses (we could also come up with better temrs) and it could look something like this:

buses {
    flat p0: [[3]],
    round p1: [[3]],
}

public_inputs {
  stack_overflow_init: [[3]],
  stack_overflow_final: [[3]],
}

boundary_constraints {
    enf p0.first = stack_overflow_init;
    enf p0.last = stack_overflow_final;

    # the + operator specifies that the bus is initialized with the concatenation
    # of the two tables
    enf p1.init = stack_overflow_init + stack_overflow_final;
}

Syntax aside, this construction is less uniform than what I initially would have liked, but it is also a bit more flexible. Now we'll be able to express systems which have both regular multiset/LogUp busses and LogUp-GKR buses.

bobbinth commented 4 months ago

Another option is to explicitly state the underlying methodology to implement the bus. For example, something like:

buses {
    gp p0: [[3]],
    logup p1: [[3]],
    gkr p2: [[3]],
}

gp here stands for "grand product".