privacy-scaling-explorations / halo2curves

Serialization #176

Open adria0 opened 1 month ago

adria0 commented 1 month ago

from @davidnevadoc

In halo2curves, the serialization of fields could be improved. There are methods that support different: endianness; limbs, bytes, bits; Montgomery form, non-Montgomery form. Some are implemented through traits like SerdeObject, others are implemented in PrimeField, and others are implemented directly on the field. It would be great to remove redundancies, establish a clear naming convention, and document the end result.
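To make the endianness and limb distinction concrete, here is a minimal sketch. The `U256` type and its methods are hypothetical and not part of halo2curves; the sketch assumes a value stored as four `u64` limbs with the least significant limb first.

```rust
/// Toy 256-bit integer: four u64 limbs, limb 0 is the least significant.
/// Hypothetical type, for illustration only.
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
struct U256([u64; 4]);

impl U256 {
    /// Little-endian bytes: least significant byte first.
    fn to_le_bytes(&self) -> [u8; 32] {
        let mut out = [0u8; 32];
        for (i, limb) in self.0.iter().enumerate() {
            out[i * 8..(i + 1) * 8].copy_from_slice(&limb.to_le_bytes());
        }
        out
    }

    /// Big-endian bytes: most significant byte first.
    fn to_be_bytes(&self) -> [u8; 32] {
        let mut out = self.to_le_bytes();
        out.reverse();
        out
    }

    /// Rebuild the limbs from little-endian bytes.
    fn from_le_bytes(bytes: [u8; 32]) -> Self {
        let mut limbs = [0u64; 4];
        for (i, limb) in limbs.iter_mut().enumerate() {
            let mut chunk = [0u8; 8];
            chunk.copy_from_slice(&bytes[i * 8..(i + 1) * 8]);
            *limb = u64::from_le_bytes(chunk);
        }
        U256(limbs)
    }
}

fn main() {
    let x = U256([0x0102, 0, 0, 0]);
    let le = x.to_le_bytes();
    let be = x.to_be_bytes();
    // Same value, opposite byte order.
    assert_eq!(le[0], 0x02);
    assert_eq!(be[31], 0x02);
    assert_eq!(U256::from_le_bytes(le), x);
}
```

The same value yields three different views (limbs, little-endian bytes, big-endian bytes), which is why a single documented convention matters.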

Related issue: #109

First round, initial idea

halo2curves serialization comes in two flavours:

Notes:

Deserialization performance:

[image: deserialization performance chart]

code: https://gist.github.com/adria0/c440185d765a368aaf21ca5741a63ab7

Initial proposal:

Comments: @davidnevadoc @duguorong009 @kilic ?

guorong009 commented 1 month ago

@adria0 First of all, thanks for all the research and the proposal!

Here are my thoughts on your proposal:

  1. Remove SerdeObject

Why do we need this? If we want to quickly process data between trusted parties we have UncompressedEncoding::from_uncompressed_unchecked (which is within the same order of magnitude as SerdeObject decoding); if not, we can use GroupEncoding::from_bytes, which fits the standards.

I think SerdeObject is meant for fields, while UncompressedEncoding is meant for curves. https://github.com/duguorong009/halo2curves/blob/b753a832e92d5c86c5c997327a9cf9de86a18851/derive/src/field/mod.rs#L485 Maybe they do not differ in any meaningful way in the underlying implementation. But I think it's odd to implement the UncompressedEncoding trait for a field.
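The checked versus unchecked trade-off mentioned above can be sketched with a toy curve. Everything here is hypothetical (a curve y^2 = x^3 + 7 over the integers mod 17, a two-byte encoding); only the method names are borrowed from the `group` crate's `UncompressedEncoding` trait, which returns a `CtOption` rather than the plain `Option` used in this sketch.

```rust
// Toy curve y^2 = x^3 + B over the integers mod P (illustrative parameters).
const P: u64 = 17;
const B: u64 = 7;

#[derive(Clone, Copy, Debug, PartialEq, Eq)]
struct ToyPoint {
    x: u64,
    y: u64,
}

impl ToyPoint {
    fn is_on_curve(&self) -> bool {
        (self.y * self.y) % P == (self.x * self.x * self.x + B) % P
    }

    /// Uncompressed encoding: one byte per coordinate.
    fn to_uncompressed(&self) -> [u8; 2] {
        [self.x as u8, self.y as u8]
    }

    /// Trusted path: copies the coordinates without any validation.
    fn from_uncompressed_unchecked(bytes: [u8; 2]) -> Self {
        ToyPoint { x: bytes[0] as u64, y: bytes[1] as u64 }
    }

    /// Untrusted path: checks coordinate range and the curve equation.
    fn from_uncompressed(bytes: [u8; 2]) -> Option<Self> {
        let p = Self::from_uncompressed_unchecked(bytes);
        if p.x < P && p.y < P && p.is_on_curve() {
            Some(p)
        } else {
            None
        }
    }
}

fn main() {
    let good = ToyPoint { x: 1, y: 5 }; // 5^2 = 25 = 8 = 1^3 + 7 (mod 17)
    assert_eq!(ToyPoint::from_uncompressed(good.to_uncompressed()), Some(good));
    // (1, 6) is not on the curve: the checked decoder rejects it,
    // the unchecked decoder silently accepts it.
    assert_eq!(ToyPoint::from_uncompressed([1, 6]), None);
    assert!(!ToyPoint::from_uncompressed_unchecked([1, 6]).is_on_curve());
}
```

The unchecked path is only safe when the bytes come from a trusted source, since an invalid point is accepted without complaint.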

  2. Follow Arkworks standard

I think @davidnevadoc did some research on serialization in Arkworks in previous work.

  3. Implement serde for Field (it is not implemented)

I think SerdeObject is for this purpose. It helps convert a field element to bytes and vice versa.

  4. Rename EndianRepr::[from, to]_bytes

Do we need this renaming? As you said, these are for internal use.

Overall, I think we first need documentation. By documentation I mean both a document explaining all the current traits (role, use, ...) and a plan for how to improve them.

davidnevadoc commented 1 month ago

Thanks @adria0 for the in-depth description of the issue!

I think we should tackle fields and curve serialization separately:

For curves:

For fields, let's stick to prime fields only for now (field extensions can be represented in multiple ways, and that is a bigger problem that exceeds the scope of this issue).

Lastly, I agree 100% with @guorong009's remark about documentation. Whichever design we land on, we should make an effort to document it well to prevent more confusion.

adria0 commented 1 month ago

Round two,

proposed changes

documentation

Field Encodings

Curve Encodings

Notes:

adria0 commented 1 month ago

Related: https://github.com/privacy-scaling-explorations/halo2curves/pull/39

davidnevadoc commented 1 month ago

I tried the new version of the serialization, where SerdeObject is removed, in halo2, and found out that there are more serialization traits! SerdePrimeField, for example, is implemented in halo2. Concretely, we should decide what to do with the parts of serialization implemented in halo2 (see https://github.com/privacy-scaling-explorations/halo2/blob/main/halo2_backend/src/helpers.rs). Should we bring this kind of functionality to halo2curves?

adria0 commented 1 month ago

I tried the new version of the serialization, where SerdeObject is removed, in halo2, and found out that there are more serialization traits! SerdePrimeField, for example, is implemented in halo2. Concretely, we should decide what to do with the parts of serialization implemented in halo2 (see https://github.com/privacy-scaling-explorations/halo2/blob/main/halo2_backend/src/helpers.rs). Should we bring this kind of functionality to halo2curves?

I think that implementing this helper in halo2curves just adds more confusion; a library should provide only one way to do a given thing.

I also question why this helper exists. It's just a simple wrapper. Maybe it exists because halo2curves serialization was not really documented, and it was a way for the implementer to clarify what can be done in halo2curves? I do not see it as useful in halo2 in terms of the cryptographic protocol.

guorong009 commented 1 month ago

@adria0 @davidnevadoc I've done some research into why the helpers SerdeFormat (halo2) and SerdeObject (halo2curves) exist. The SerdeObject trait was introduced in this PR: https://github.com/privacy-scaling-explorations/halo2curves/commit/5bcc8914a4d1d75901b62eeac3c4347f826f09d0 The SerdeFormat was introduced in this PR: https://github.com/privacy-scaling-explorations/halo2/pull/111 All of the work was done by @jonathanpwang. Before that, there was no concept of Serde* in halo2 or halo2curves. For example, zcash/halo2 has only one option, CurveRead (compressed): https://github.com/zcash/halo2/blob/main/halo2_proofs/src/helpers.rs

I believe we should check why @jonathanpwang added these. If those traits and helpers do not serve some specific purpose that we are unaware of, we can remove them, as @adria0 mentioned.

jonathanpwang commented 1 month ago

The reason they were added is that previously the default in halo2 and halo2curves was that serde always serialized and deserialized to the human-readable canonical representation of a BigInt as some u64 digits. However, the internal struct stores the BigInt primes in Montgomery form. Therefore, serialization and deserialization were performing a Montgomery reduction each time, which is not desirable if the serialization is being used for raw storage purposes that don't need human readability. For halo2 this was particularly relevant for proving keys.

So Serde* was created as a way to offer these separate "raw" (de)serde methods without interfering with the existing human readable serde methods.
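To illustrate the cost difference described here, a minimal single-limb sketch follows. The `ToyFp` type, its method names, and the modulus 2^61 - 1 are assumptions chosen for the example, not halo2curves code; real fields use several limbs, but the shape is the same: the canonical path runs a Montgomery reduction, the raw path only copies the internal limb.

```rust
/// Toy modulus: the Mersenne prime 2^61 - 1, small enough that every
/// u128 intermediate below fits. Montgomery radix is R = 2^64.
const P: u64 = (1u64 << 61) - 1;

/// -P^{-1} mod 2^64, via Newton iteration (each step doubles the correct bits).
const fn neg_inv(p: u64) -> u64 {
    let mut inv = 1u64;
    let mut i = 0;
    while i < 6 {
        inv = inv.wrapping_mul(2u64.wrapping_sub(p.wrapping_mul(inv)));
        i += 1;
    }
    inv.wrapping_neg()
}

const NEG_INV: u64 = neg_inv(P);
const R_MOD_P: u128 = (1u128 << 64) % (P as u128);
const R2: u64 = ((R_MOD_P * R_MOD_P) % (P as u128)) as u64; // R^2 mod P

/// Montgomery reduction: returns t * R^{-1} mod P for t < P * R.
fn redc(t: u128) -> u64 {
    let m = (t as u64).wrapping_mul(NEG_INV);
    let u = ((t + (m as u128) * (P as u128)) >> 64) as u64;
    if u >= P { u - P } else { u }
}

/// Field element stored internally in Montgomery form (a * R mod P).
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
struct ToyFp(u64);

impl ToyFp {
    fn from_canonical(a: u64) -> Self {
        ToyFp(redc((a % P) as u128 * R2 as u128))
    }

    fn to_canonical(self) -> u64 {
        redc(self.0 as u128)
    }

    /// Canonical encoding: costs one Montgomery reduction.
    fn to_repr(self) -> [u8; 8] {
        self.to_canonical().to_le_bytes()
    }

    /// Canonical decoding: range check plus conversion into Montgomery form.
    fn from_repr(bytes: [u8; 8]) -> Option<Self> {
        let a = u64::from_le_bytes(bytes);
        if a < P { Some(Self::from_canonical(a)) } else { None }
    }

    /// Raw encoding: copies the internal limb, no reduction.
    fn to_raw_bytes(self) -> [u8; 8] {
        self.0.to_le_bytes()
    }

    /// Raw decoding: trusts that the bytes are a valid Montgomery-form limb.
    fn from_raw_bytes_unchecked(bytes: [u8; 8]) -> Self {
        ToyFp(u64::from_le_bytes(bytes))
    }
}

fn main() {
    let a = ToyFp::from_canonical(5);
    // Both encodings round-trip, but the bytes on the wire differ.
    assert_eq!(ToyFp::from_repr(a.to_repr()), Some(a));
    assert_eq!(ToyFp::from_raw_bytes_unchecked(a.to_raw_bytes()), a);
    assert_ne!(a.to_repr(), a.to_raw_bytes());
}
```

Raw bytes are only meaningful to a reader that uses the same modulus and the same Montgomery radix, which is the price of skipping the reduction.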

I haven't been up to date on the current situation, so I can't tell if they are needed anymore, but this was the original motivation.

Edit: SerdeObject can be safely removed if you are able to define from_mont and to_mont somewhere else. The Serde* traits in halo2 were purely a way to extend traits beyond ff::Field. SerdeFormat is an enum to toggle between the different serialization methods.
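A rough sketch of that design, assuming only that `to_mont` and `from_mont` exist as free functions: a format enum selects between the canonical and raw paths. All names are hypothetical (`ToyFormat`, `ToyFp`, `encode`, `decode`), the variant names are assumed to mirror halo2's SerdeFormat, and the one-byte field mod 97 with radix 2^8 is purely illustrative.

```rust
const P: u16 = 97;
const R: u16 = 62; // 2^8 mod 97
const R_INV: u16 = 36; // 62 * 36 = 2232 = 23 * 97 + 1

/// Canonical value -> Montgomery form (a * R mod P).
fn to_mont(a: u8) -> u8 {
    ((a as u16 * R) % P) as u8
}

/// Montgomery form -> canonical value (x * R^{-1} mod P).
fn from_mont(x: u8) -> u8 {
    ((x as u16 * R_INV) % P) as u8
}

/// Hypothetical format toggle.
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
enum ToyFormat {
    Processed,         // canonical value, range-checked on read
    RawBytes,          // internal Montgomery form, range-checked on read
    RawBytesUnchecked, // internal Montgomery form, no checks
}

/// Field element stored in Montgomery form.
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
struct ToyFp(u8);

fn encode(x: &ToyFp, format: ToyFormat) -> u8 {
    match format {
        ToyFormat::Processed => from_mont(x.0),
        ToyFormat::RawBytes | ToyFormat::RawBytesUnchecked => x.0,
    }
}

fn decode(byte: u8, format: ToyFormat) -> Option<ToyFp> {
    let in_range = u16::from(byte) < P;
    match format {
        ToyFormat::Processed if in_range => Some(ToyFp(to_mont(byte))),
        ToyFormat::RawBytes if in_range => Some(ToyFp(byte)),
        ToyFormat::RawBytesUnchecked => Some(ToyFp(byte)),
        _ => None,
    }
}

fn main() {
    let x = ToyFp(to_mont(5));
    for format in [
        ToyFormat::Processed,
        ToyFormat::RawBytes,
        ToyFormat::RawBytesUnchecked,
    ] {
        // Every format round-trips; only the byte on the wire differs.
        assert_eq!(decode(encode(&x, format), format), Some(x));
    }
}
```

With this shape the raw paths need nothing beyond access to the internal representation, so a dedicated trait is not strictly required once `to_mont` and `from_mont` live somewhere accessible.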