powdr-labs / powdr

A modular stack for zkVMs, with a focus on productivity, security and performance.
Apache License 2.0
392 stars 80 forks source link

Instruction delegation #1077

Closed chriseth closed 2 months ago

chriseth commented 7 months ago

Currently, we have no way to "invoke" an instruction from another instruction. This is especially important when an instruction uses a permutation, since we cannot just duplicate the called instruction's code in the calling instruction.

To solve this, the following syntax is proposed (the memory example is bad, because the memory machine can probably not be implemented like that, but also see the binary example below):

instr madd X, Y -> Z: mop, WVAL <== wrap(RVAL1 + RVAL2) {
    RADDR1 = X,
    RADDR2 = Y,
    WADDR = Z,
}

instr mop {
    { RADDR1, RADDR2, WADDR, STEP, RVAL1, RVAL2, WVAL } is 1 { m_raddr1, m_raddr2, m_waddr, m_step, m_rval1, m_rval1, m_wval }
}

Here, the madd instruction calls both the mop and the wrap instruction. In doing this, we can read two values from memory, add them (including wrapping the result) and write the result to memory again in the same step. This allows us to even use the same memory address for the written value since the memory machine knows that in one step, the write always comes last.

Since the mop instruction does not have parameters, it is invoked as is. The wrap instruction has parameter and thus also needs two additional assignment registers different from X, Y and Z. In assembly-to-pil conversion, whenever we set the flag for madd, we also set the flags for map and wrap (note that this could create problems with the "else" condition in the register update constraint).

As an alternative, we could just allow the instructions in the body of the instruction definition, but I'm not sure how this would later work with translation via PIL.

Binary example:

instr add X, Y -> Z: binary(1, X, Y) {
}

instr mul X, Y -> Z: binary(2, X, Y) {
}

instr binary op: integer, X, Y -> Z {
    { op, X, Y, Z } in { bin.op, bin.A, bin.B, bin.C }
}
georgwiese commented 7 months ago

What would the binary example compile to? I can think of 3 ways:

1:

instr_add { 1, X, Y, Z } in bin.LATCH { bin.op, bin.A, bin.B, bin.C }
instr_mul { 2, X, Y, Z } in bin.LATCH { bin.op, bin.A, bin.B, bin.C }

(Like now, just a different way to write it)

2:

(instr_add + instr_mul) { instr_add * 1 + instr_mul * 2, X, Y, Z } in bin.LATCH { bin.op, bin.A, bin.B, bin.C }

(Saves one lookup)

3:

instr_binary { binary_op, X, Y, Z } in bin.LATCH { bin.op, bin.A, bin.B, bin.C }

(Where instr_binary & instr_op come from the instruction lookup, reduces the number of witness columns in the main machine (for more than 2 operations), this is also what Polygon does).

I think we should aim for the 3rd translation! But is this easy to infer? If not, it might be better to have a more direct syntax?

chriseth commented 7 months ago

I had the last option in mind and don't think it's complicated.

georgwiese commented 4 months ago

the memory example is bad, because the memory machine can probably not be implemented like that

Let me try to fix it:

instr madd addr_arg1, addr_arg2, addr_res ->: mload_arg1, mload_arg2, mstore_res, val_res <== wrap(val_arg1 + val_arg2) {}

instr wrap wrap_Y -> wrap_X { wrap_Y = wrap_X + wrap_bit * 2**32, wrap_X = X_b1 + X_b2 * 0x100 + X_b3 * 0x10000 + X_b4 * 0x1000000 }
instr mload_arg1 addr_arg1 -> val_arg1 ~ memory.mload addr_arg1, STEP -> val_arg1;
instr mload_arg2 addr_arg2 -> val_arg2 ~ memory.mload addr_arg2, STEP -> val_arg2;
instr mstore_res addr_res, val_res -> ~ memory.mstore addr_res, STEP, val_res ->;

This would compile to 4 permutations (1 for the wrap call + the 3 memory permutations). The memory operations need to be separate permutations, because they happen at the same time, but they can be re-used by other instructions. This would work with the current memory machine in STD, as long as the input and output addresses are different.

Did I understand this right?

One question I have about this: Does that mean that we allow for assignments between assignment registers? I this case, we'd have to set wrap_Y = val_arg1 + val_arg2, all of which are assignment registers. Is this feasible? What if there is a chain of 5 instructions that are executed, wouldn't the number of necessary assignment registers explode?

georgwiese commented 4 months ago

One question I have about this: Does that mean that we allow for assignments between assignment registers? I this case, we'd have to set wrap_Y = val_arg1 + val_arg2, all of which are assignment registers. Is this feasible? What if there is a chain of 5 instructions that are executed, wouldn't the number of necessary assignment registers explode?

How about this:

instr madd addr_arg1, addr_arg2, addr_res ->: mload_arg1, mload_arg2, mstore_res {
    val_res = val_arg1 + val_arg2
}
instr mwrap addr_arg1, addr_res ->: mload_arg1, mstore_res, wrap {}

instr wrap val_arg1 -> val_res { val_arg1 = val_res + wrap_bit * 2**32, val_res = X_b1 + X_b2 * 0x100 + X_b3 * 0x10000 + X_b4 * 0x1000000 }
instr mload_arg1 addr_arg1 -> val_arg1 ~ memory.mload addr_arg1, STEP -> val_arg1;
instr mload_arg2 addr_arg2 -> val_arg2 ~ memory.mload addr_arg2, STEP -> val_arg2;
instr mstore_res addr_res, val_res -> ~ memory.mstore addr_res, STEP, val_res ->;

So the program has to run two instructions instead of one, but there are a lot fewer columns. I think allowing something like val_res <== wrap(val_arg1 + val_arg2) introduces complexity we might not need (i.e., allowing to assign from one assignment register to the other).

georgwiese commented 4 months ago

As discussed with @pacheco and @chriseth, as an alternative to instruction delegation, we can support links inside instruction definitions. To rewrite Chris' example:

instr add X, Y -> Z {
    link 1 ~> bin.add X, Y -> Z
}
instr sub X, Y -> Z {
    link 1 ~> bin.sub X, Y -> Z
}

// Step 1: Make all constraints conditional on the instruction being active
link instr_add ~> bin.add X, Y -> Z
link instr_sub ~> bin.sub X, Y -> Z

// Step 2: Expand links
instr_add {0, X, Y, Z } is <sel> {bin.operation_id, bin.arg1, bin.arg2, bin.res};
instr_sub {1, X, Y, Z } is <sel> {bin.operation_id, bin.arg1, bin.arg2, bin.res};

// Step 3: Combine permutations / lookups if RHS is the same
(instr_add + instr_sub) {0 * instr_add + 1 * instr_sub, X, Y, Z } is <sel> {bin.operation_id, bin.arg1, bin.arg2, bin.res};
chriseth commented 2 months ago

Implemented in https://github.com/powdr-labs/powdr/pull/1439