a16z / jolt

The simplest and most extensible zkVM. Fast and fully open source from a16z crypto and friends. ⚡
https://jolt.a16zcrypto.com
MIT License
679 stars 146 forks source link

Indices explanation. #39

Closed wyattbenno777 closed 7 months ago

wyattbenno777 commented 1 year ago

Hi,

I was looking into the benchmarks and subtables examples. Thank you for the great work 🙏

Indices are the actually lookups? If so, a brief explanation on how they work would be helpful. For example using "And(2^128, 2^4)" how would one see that the specific value [1010] is in the table?

i.e. In a zkVM you could take as input [0,1] and [1,0] AND them and get [0,0]. Using Lasso lib, how would you check that [0,0] is in the table?

https://github.com/a16z/Lasso/blob/b0db15bb83c81912ba514b614e60b4d3f5f42e7e/src/benches/bench.rs#L13

It is also used here as x_indices and y_indices with no explanation of what these things are.

https://github.com/a16z/Lasso/blob/b0db15bb83c81912ba514b614e60b4d3f5f42e7e/src/subtables/and.rs#L126

wyattbenno777 commented 1 year ago

`

[test]

fn valid_merged_poly() { const C: usize = 2; const M: usize = 1 << 4;`

let x_indices: Vec<usize> = vec![15]; let y_indices: Vec<usize> = vec![15]; let subtable_evals: Subtables<Fr, C, M, AndSubtableStrategy> = Subtables::new(&[x_indices, y_indices], 1);

`// Real equation here is log2(sparsity) + log2(C)
let combined_table_index_bits = 1;

for (x, expected) in vec![
  (0, 0b11), 
  (1, 0b11),
] {
  let calculated = subtable_evals
    .combined_poly
    .evaluate(&index_to_field_bitvector(x, combined_table_index_bits));
  println!("calculated: {:?}", calculated);
  println!("Fr::from(expected): {:?}", Fr::from(expected));
  assert_eq!(calculated, Fr::from(expected));
}

}`

It seems like the indices are just picking values in the materialized subtable.

[ [
BigInt([0, 0, 0, 0]), //00 & 00
BigInt([0, 0, 0, 0]), //00 & 01
BigInt([0, 0, 0, 0]), //00 & 10
BigInt([0, 0, 0, 0]), //00 & 11
BigInt([0, 0, 0, 0]), //01 & 00
BigInt([1, 0, 0, 0]), //01 & 01
BigInt([0, 0, 0, 0]), //01 & 10
BigInt([1, 0, 0, 0]), //01 & 11
BigInt([0, 0, 0, 0]), //10 & 00
BigInt([0, 0, 0, 0]), //10 & 01
BigInt([2, 0, 0, 0]), //10 & 10
BigInt([2, 0, 0, 0]), //10 & 11
BigInt([0, 0, 0, 0]), //11 & 00
BigInt([1, 0, 0, 0]), //11 & 01
BigInt([2, 0, 0, 0]), //11 & 10
BigInt([3, 0, 0, 0]) //11 & 11
] ]

i.e. BigInt([3, 0, 0, 0]) is chosen by vec![15];

Is this materializing the full subtable?

For example in the bench:

single_pass_lasso!( "And(2^128, 2^10)", Fr, EdwardsProjective, AndSubtableStrategy, /* C= */ 8, /* M= */ 1 << 16, /* S= */ 1 << 10 ), C (8 tables) of M (65536 elements) = 65536 ^ 8 ( or 2^128)

If one makes 1 << 16, and has C = 8, does this really equate into a lookup in the full 2^128 table?

If we turned C to 4 in ADD above element 15 would always be BigInt([3, 0, 0, 0]) independent of the indice lookup . Or another example:

M = 1 << 4 with C = 2 should be a table of size 2^5 (two tables of 16 elements or 32 elements)

However the second search y_indices below would be out of bounds. let x_indices: Vec<usize> = vec![15];
let y_indices: Vec<usize> = vec![25]; Which means no matter what we can only search up to //11 & 11 .

You would need to materialize the full table with the following to be able to search for vec![25];. const M: usize = 1 << 5; // 2^5

Explanation of how this is meant to work would be great.

wyattbenno777 commented 1 year ago

I think I see how this lib works now.

In the benchmarks : And(2^128, 2^10) this is actually not true right? The add table materializes this full tableM= */ 1 << 16, as one single table.

So currently, you can do very fast lookups into small tables.

https://a16zcrypto.com/posts/article/introducing-lasso-and-jolt/ "Moreover, Lasso applies even to gigantic tables (say of size 2^128), as long as the tables are “structured” (in a precise technical sense). These tables are far too large for anyone to explicitly materialize, but Lasso pays only for table elements that it actually accesses. In particular – and in contrast to prior lookup arguments – if the table is structured, then no party needs to cryptographically commit to all of the values in the table.

There are two different notions of structure that Lasso exploits: decomposability and LDE-structure. (LDE is short for a technical notion called a low-degree extension polynomial.) Decomposability roughly means that a single lookup into the table can be answered by performing a handful of lookups into much smaller tables. This is a more stringent requirement than LDE-structure, but Lasso is especially efficient when applied to decomposable tables. "

The above is WIP.

wyattbenno777 commented 1 year ago

i.e. Devs need to do multiple lookups into a subtable and concat the results together. This make a lot of sense with something like AND.

moodlezoup commented 1 year ago

I'm not sure I'm entirely following your question @wyattbenno777 but the "concatenation" is handled by the combine_lookups function: https://github.com/a16z/Lasso/blob/master/src/subtables/mod.rs#L45-L46

At a high level, this is how it works: say you're looking up AND(0101010101010101, 0000000011111111), i.e. the "big" lookup table is size 2^32 (two 16-bit operands)

Lasso instead performs lookups into the smaller AND subtables, and combine the results: AND(0101010101010101, 0000000011111111) = 2^24 AND(0101, 0000) + 2^16 AND(0101, 0000) + 2^8 * AND(0101, 1111) + AND(0101, 1111)

Specifically, Lasso performs lookups into the AND subtable at index 01010000, 01010000, 01011111, 01011111 corresponding to the four terms in the above weighted sum

Note that the primary sumcheck operates over the combined polynomial (https://github.com/a16z/Lasso/blob/master/src/lasso/surge.rs#L159-L171 note that S::combine_lookups_eq is provided as an argument) so Lasso does in fact prove a lookup into the big AND table

wyattbenno777 commented 1 year ago

Thank you!

The above is me mostly working that out without a readme. The bench.rs seems to imply a very large table without using one. "And(2^128, 2^10)"

Let me rephrase the question a bit as it still is not super clear. Using code how would one do this specific search? AND(0101010101010101, 0000000011111111)

fn combine() { // what goes here? }

moodlezoup commented 1 year ago

It would basically look like the single_pass_lasso except nz would just contain a single element (a C-sized array encoding the index 01010101010101010000000011111111). So for C = 8, it would look something like:

let nz = vec!([ [0b0101, 0b0101, 0b0101, 0b0101, 0b0000, 0b0000, 0b1111, 0b1111] ])

There is no need to implement the combine_lookups function, it's already implemented in and.rs

Thanks for these questions, this will help us improve our documentation going forward

wyattbenno777 commented 1 year ago

"There is no need to implement the combine_lookups function, it's already implemented in and.rs"

Until we want to make our own tables and implement our own combine_lookups. Docs help speed this up a bit :)

Last question:

How would I do that search as a test in and.rs? (AND(0101010101010101, 0000000011111111))

Or subtables/mod.rs

wyattbenno777 commented 1 year ago

Messing around with making a basic AND example in benchmark.rs:

let nz = let nz = vec!([[0b1111100010100000, 0b1111]]);

This results in a claimed eval: BigInt([160, 0, 0, 0])

If I change it instead to: let nz = let nz = vec!([[0b1111, 0b0100]]);

The result is claimed eval BigInt([0, 0, 0, 0])

AND(1111, 0100) should result in 4 or 0b0100.

What is the most basic example that can be given showing a lookup argument for BITWISE AND and its result?

*Note: actually getting the result after the AND would be important for zkVM that use the result in the next opcode.

wyattbenno777 commented 1 year ago

DISSECTING:

How would you check if AND(1111, 0100) is in the lookup table?

In the above it was noted you could do it in the way that benchmark.rs does it: let nz = let nz = vec!([[0b1111, 0b0100]]);

This becomes two sparse matrix which are later made into dense representation. They are also grouped by index.

[[15,15], [4, 4]]

The actual lookup is done with indices in fn to_lookup_polys(

Each number is used as the index looking into the table.

In this size table: /* M= */ 1 << 16,

15 = BigInt([0, 0, 0, 0]) in the lookup subtable. As //00000000 & 00010111 is what is actually being looked up. i.e. AND(0, 23) is what is actually being looked up.

4 = BigInt([0, 0, 0, 0]) in the lookup subtable. //00000000 & 00000011

This is why the evals in the comment above were off. The numbers specified are just indexes not the specific AND(X, Y). Now hopefully the naming indices makes a bit more sense.

So how do you do AND(15, 4)??

i.e. How do you find the index where AND(1111, 0100) 00001111 & 0000100..

Wild guess 256 * 12 for the left side = 3072 //00001111 plus 4 for the right side (0100) plus 1 for the index starting at 0.

Subtable at index 3077 does indeed return BigInt([4, 0, 0, 0]) 4 is the correct result for AND(1111, 0100)

There should be test to check that the subtable is actually being created correctly.

index 1: BigInt([0, 0, 0, 0]), //00000000 & 00000000 index 256: BigInt([0, 0, 0, 0]), //00000000 & 11111111 index 512: BigInt([1, 0, 0, 0]), //0000001 & 11111111 index 768: BigInt([2, 0, 0, 0]), //00000010 & 11111111 index 1024: BigInt([3, 0, 0, 0]), //0000011 & 11111111 index 1280: BigInt([0, 0, 0, 0]), //00000100 & 11111111

Why does it deviate at 1280? It should be BigInt([4, 0, 0, 0])??

moodlezoup commented 1 year ago

Ok I actually found a bug in the comine_lookups function for the bitwise ops while working through this 😅 Basically it wasn't doing the concatenation correctly, see fix here afaict it's working on that branch, this is what I'm running to test: https://gist.github.com/moodlezoup/14cb15300e658c72806181cb21355861

moodlezoup commented 1 year ago

It would basically look like the single_pass_lasso except nz would just contain a single element (a C-sized array encoding the index 01010101010101010000000011111111). So for C = 8, it would look something like:

let nz = vec!([ [0b0101, 0b0101, 0b0101, 0b0101, 0b0000, 0b0000, 0b1111, 0b1111] ])

I was wrong here, the operands need to be interleaved so it would actually be like let nz = vec!([ [0b0100, 0b0100, 0b0100, 0b0100, 0b0111, 0b0111, 0b0111, 0b0111] ])

wyattbenno777 commented 1 year ago

This is correct now

index 1: BigInt([0, 0, 0, 0]), //00000000 & 00000000 index 256: BigInt([0, 0, 0, 0]), //00000000 & 11111111 index 512: BigInt([1, 0, 0, 0]), //0000001 & 11111111 index 768: BigInt([2, 0, 0, 0]), //00000010 & 11111111 index 1024: BigInt([3, 0, 0, 0]), //0000011 & 11111111 index 1280: BigInt([4, 0, 0, 0]), //00000100 & 11111111 ... so on index 3840: BigInt([14, 0, 0, 0]), //00001110 & 11111111

i.e. 256 * 15 = 3840 //15 is 1111 plus 4 for the right side (0100) plus 1 for the index starting at 0.

index 3845: BigInt([4, 0, 0, 0]), //00001111 & 00000100

AND(1111, 0100) = 0100 and it found at index 3845

wyattbenno777 commented 1 year ago

Adding this gist thread here as I think it adds insights into indices for subtables.

https://gist.github.com/moodlezoup/14cb15300e658c72806181cb21355861