Finalize proof serialization format

ebfull commented 8 years ago

The serialization of the zkSNARK proof that we currently use is 584 bytes and unspecified. (See https://github.com/zcash/zips/issues/43 for more.) We need to settle on this before 1.0.

There are many reasons why we cannot just enable point compression:

libsnark's sqrt implementation (needed for point decompression) needs to be bounded. See https://github.com/scipr-lab/libsnark/issues/41.
The proof format is still unusual and inefficient. We use an extra byte per group element to store whether or not it's zero (which is probably never for valid proofs?), in ASCII decimal no less, and we also use a byte to store the LSB of the y coordinate for each point as well. 32*7 + 64 + 8 + 8 = 304 bytes.
libsnark's point compression can only be enabled across the board, which makes decompression of the proving key extremely expensive (several minutes on my machine).

We can solve all of these issues with a custom serialization class. It will also help us be much more confident that our verifier isn't being abused.

Another sticky issue is portability. https://github.com/scipr-lab/libsnark/issues/26 We're only targetting 64 bit architectures for 1.0, but I'm sure later on we can piece together a story for this. We may want to disable montgomery representations in the serialization later though.

daira commented 8 years ago

"We use an extra byte per group element to store whether or not it's zero (which is probably never for valid proofs?), in ASCII decimal no less, and we also use a byte to store the LSB of the y coordinate for each point as well. 32*7 + 64 + 8 + 8 = 304 bytes."

Sigh. I can see why the byte for the LSB is there, but that byte already gives a standard (IEEE Std 1363-2000) way to encode the point at infinity.

daira commented 8 years ago

Personally I like the idea (I think it was @ebfull's) of using separate functions for serialization rather than trying to cajole the stream operators to do what we want; the latter are not designed to support different serialization modes.

daira commented 8 years ago

Did you discover the cause of the difference in serialization between BN128 and ALT_BN128?

daira commented 8 years ago

I believe the 288 bytes in the Zerocash and BCTV14 papers must be a mistake; the proof requires a minimum of 289 bytes with the 8 y-coordinate LSBs (more realistically 296 bytes using the IEEE Std 1363-2000 style encoding, which is what I'd prefer to use).

ebfull commented 8 years ago

At least information theoretically, it's possible. Remember that the x-coordinate of each G1 element is an element of Fq, which is mod ~2^254. You could toss in the y-coordinate's LSB in those spare bits.

ebfull commented 8 years ago

@daira said:

Did you discover the cause of the difference in serialization between BN128 and ALT_BN128?

I didn't, but based on the code I think the best explanation is that there was a mismatch in use of the montgomery form, which is probably fixable.

daira commented 8 years ago

At least information theoretically, it's possible. Remember that the x-coordinate of each G1 element is an element of Fq, which is mod ~2^254. You could toss in the y-coordinate's LSB in those spare bits.

I stand corrected. I still think we should use the IEEE format.

daira commented 8 years ago

The proposed format, intended to be compatible with IEEE Std 1363a-2004, is described in section 6 of https://github.com/zcash/zips/blob/zips27.reorganisation.0/protocol/protocol.pdf

daira commented 8 years ago

I need to find out what the bound is for F_q^2 to unblock @ebfull .

zcash / zcash

Finalize proof serialization format #1103