arkworks-rs / algebra

Libraries for finite field, elliptic curve, and polynomial arithmetic
https://arkworks.rs
Apache License 2.0
608 stars 237 forks source link

Serialization of GT #658

Open vincenzoiovino opened 1 year ago

vincenzoiovino commented 1 year ago

Hello, I need to serialize GT elements for bls12-381. I noticed that the 12 field elements that form a GT element of bls12-381 generated by your library are different from the ones serialized by the MCL Library in C/C++, so I wonder whether this can depend on some inner details of your implementation that differ from MCL. Here you can find more details about GT in mcl: https://github.com/herumi/mcl/blob/master/api.md#notation Can you kindly provide an overview of the implementation details of GT in your library that can be relevant to serialize/deserialize in a nice way possibly compatible with MCL and other libraries? Thanks in advance

mmagician commented 1 year ago

@drskalman

vincenzoiovino commented 1 year ago

In the meanwhile: if this discrepancy depends on the representations and on the prime polynomials used to define the split extensions etc. would it be possible to inquire if it is easy to add as feature the possibility of defining other prime polynomials etc.? In this way we can guarantee a standard GT serialization for future and cross-library usage. Btw, I know that it is not very common to use GT serialization in crypto protocols but in my case I strongly make use of that.

burdges commented 1 year ago

@drskalman worked out an optimized G_T serialization, which likely he submits, so imho other G_T serializations should not be considered standard.

Afaik all field serialization are defined in https://github.com/arkworks-rs/algebra/tree/master/ff/src/fields/models At least the quadratic extension ones should derive from IRTF standards, making it inflexible. I've no idea about cubic ones though, maybe zcash fixes something in their trusted setup?

vincenzoiovino commented 1 year ago

Thanks. I am looking into that. At first sight, it is not obvious to deduce just from the code in which field GT is embedded and in turn all representations of the intermediate fields. Could you kindly publish a short paragraph like the one of Shingeo Mitsunari (see link above) that explains that? My concern is not about efficiency of serialization but universality and portability (even among future versions of arkworks). For G1,G2, there is no issue of portability and universality but for GT this does not seem the case. For instance, I do not know what the optimized GT serialization you are talking about is supposed to do but imagine that this optimization (or another optimization in the future) changes the actual output of the serialization, then the compatibility might be broken (some code can try to serialize/deserialize elements/strings that were serialized/deserialized differently in the past), and of course there is a clear issue of portability with other libraries.

burdges commented 1 year ago

Fp12 is the target field for bls12 curves, so the cubic field extension is sandwiched in the middle.

G_T is a subgroup of the multiplicative group of Fp12, so people encode it as Fp12 now, but doing so is suboptimal and must be abandoned. It's therefore good no standard for Fp12 exists, because we've less to throw in the trash.

At least in arkwork, they'd replace only PairingOutput<P>: Canonical* but not Fp12: Canonical* so one could still access the "obsolete" serialization.

I've no idea if MCL uses a different tower of field extensions, but if so then you could work out the conversion. It's maybe something simpler though like representing the base field using a different endianness or whatever.

burdges commented 1 year ago

As for costs, @drskalman trick makes G_T elements 3 times smaller, so 192 bytes instead of 576 bytes for bls12-381/377.

It appears @drskalman trick involves cube roots in the extension field, which cost lots. At the same time, we need a subgroup check when deserializing a G_T element, which similarly costs lots. It's likely these come out similarly expensive, but we'll only know by benchmarking them. I've perhaps assumed too much above.

You'll prefer the Fp12 serialization for trusted setups where you check them only once. You'd maybe want the Fp12 serialization for some interior Fiat-Shamir transform too, not sure.

vincenzoiovino commented 1 year ago

So what is the tower of field extensions used in arkworks? The one of MCL is given in the link in the first comment of this thread. I do not even need check on validity of GT elements. Actually MCL does not check validity directly in deserialization, you must explicitly check validity separately. I would be grateful to you if you can provide me more details on the field extensions or give me some explicit references.

The fact that the length will change it means that it will not be backwards compatible?

burdges commented 1 year ago

Appears the tower have the same degree sequence, not sure if the same quotients, but maybe it's pretty simple.

I'd say first see if https://github.com/MystenLabs/fastcrypto/blob/main/fastcrypto-zkp/src/bls12381/conversions.rs helps you.

Also @achimcc has a fork of zcash's bls12-381 which uses the arkworks hostcalls in substrate, so it converts projective points too, but likely the pairing result too.

Pratyush commented 1 year ago

I think that @burdges is on the right track that IMO it doesn't make sense to standardize the serialization of the extension field elements that comprise GT. Instead, we should think about a common serialization format for PairingOutputs (which, as @burdges noted, come from a small subgroup of the extension field). This should be what goes in CanonicalSerialize/Deserialize.

One can of course also write custom serializers/deserializers for specific bilinear groups that implicitly convert between different towers

vincenzoiovino commented 1 year ago

I think that @burdges is on the right track that IMO it doesn't make sense to standardize the serialization of the extension field elements that comprise GT. Instead, we should think about a common serialization format for PairingOutputs (which, as @burdges noted, come from a small subgroup of the extension field). This should be what goes in CanonicalSerialize/Deserialize.

One can of course also write custom serializers/deserializers for specific bilinear groups that implicitly convert between different towers

Maybe I was confusing stuff. I was asking exactly the standardization of PairingOutput. I used the field elements that comprise GT to check why the serialization with respect to MCL is different. So if you confirm that the towers are the same as in mcl, do you have some hints on why the serialization is different?

vincenzoiovino commented 1 year ago

Solved, it is like this: in arkworks a GT element is serialized as 6 pairs: (u1,v1),..,(u6, v6). In mcl the same element is instead: (f(v1), f(u1)), ..., (f(v6), f(u6)), where the function f just changes the endiannes of the input :-) (notice that not just the endianness changes but also the order in each pair). Still I would like as user of arkworks (and anyt library) that serialization/deserialization output is never changed so as to guarantee backwards compatibility. Thanks anyway

burdges commented 1 year ago

It's what these do then https://github.com/MystenLabs/fastcrypto/blob/main/fastcrypto-zkp/src/bls12381/conversions.rs#L138-L150

vincenzoiovino commented 1 year ago

It's what these do then https://github.com/MystenLabs/fastcrypto/blob/main/fastcrypto-zkp/src/bls12381/conversions.rs#L138-L150

I had not executed the code you pointed to me but at first sight I thought that it only switched the endiannes of the field elements, are you saying that it also switches the order in each pair? If yes, it can be also applied to this case. thanks for all