arkworks-rs / snark

Interfaces for Relations and SNARKs for these relations
https://www.arkworks.rs
Apache License 2.0
783 stars 208 forks source link

Implement the `BW6_761` curve. #152

Closed Pratyush closed 4 years ago

Pratyush commented 4 years ago

@yelhousni et al. have found a curve, BW6_761 which, like SW6, has scalar field = Bls12_377::Fq, but which has better efficiency properties. It would be great to have an implementation of this curve in our library to enable more efficient recursion.

The implementation consists primarily of the following steps: 1) Constructing the field, which just requires instantiating the appropriate constants for Fq and Fq6. 2) Writing code for pairings.

An excellent starting point is the implementation here, and the paper.

Closes #3

Pratyush commented 4 years ago

cc @howardwu @kobigurk @gakonst this is probably relevant to y'all

jon-chuang commented 4 years ago

Hi, I'd like to take this on. However, this is the first elliptic curve implementation I will undertake, although not the first crypto implementation. I have an idea of what I am doing, but I will have to read the paper carefully and may need some ocassional help.

Before that, I have a long list of questions/ideas that go beyond the scope of this issue.

  1. Why the lack of assembly/SIMD in Zexe? This can be implemented at the field/models level, or lower, at the bigint ops level.
  2. Zexe uses u64 by default. Is this a problem/suboptimal when porting to mobile/WASM/GPU which have 32-bit words?

Before I begin the next section, I would like to first say my priority for the next two weeks is to implement BW6-761. Regardless, if and when I find the time/reason:

Another project I am interested in is adding some support for GPU through wgpu-rs or some variant, although the WebGPU project is still in development. In particular, I am interesting in adding support for a highly parallel version of multiexponentiation. This could be very good for dedicated provers or personal computers.

I am also interested in adding support for the PLONK proving system, as there have been some exciting and interesting developments there recently. For instance, adding RAM for branching programs. Arithmetisation is not R1CS however, but similar to Groth16 permutation arguments. Since Groth16 is supported here, I assume this is fine. I think one exciting thing this would enable alongside the single-layer proof composition are "dark contracts". I would be interested to discuss how this should be implemented.

My enthusiasm stems mainly from the fact that I was starting to write my own rust crypto library, salsa, but zexe has many of the things I would like for the time being. There is a nice clean structure. Things that are of secondary interest to crypto for which I may want to fork zexe for my own purposes are EC primality testing and factoring.

Ideas I had for the (distant and ideal) future are writing a DSL in Rust that can compose sMPC (perhaps incl. garbled circuits), homomorphic encryption, ZK building blocks and generate all the necessary networking and data-passing between different components and doing optimisation (probably BayesOpt) in order to optimise the balance of different components to a particular hardware/communication setup. Pysyft, targeted at private machine learning, seems to think this DSL should be pytorch. I'm not so sure for now.

The DSL should include annotations that tell the system what kind of security is required for different parts of the computation, data ownership, differential privacy etc. Although, it has yet to be seen how big of an overlap there will be between privacy-preserving computation and SNARKs.

Another idea I had was to bring in a bit of functional verification, in the spirit of Microsoft's Everest Project or what AWS is doing with Galois Software's tools. In this case we might use something like Prusti.

Also I was wondering if it might be worth writing some macros to implement arithmetic for pseudo mersenne and generalised mersenne primes, which are more efficient than for general prime fields. I believe some BN curves use these, but I am not sure. This suggestion, however, is not in line with Zexe's strategy to write explicit implementations for given no. of limbs. May I know why this cannot be achieved with macros?

If you think any of these projects are interesting, I can start issues for them. However, I would like to prove myself a worthy contributor first by completing the BW6-761 implementation succesfully.

jon-chuang commented 4 years ago

To do:

ChaitanyaKonda commented 4 years ago

Hi @jon-chuang, just wanted to see what the status of this issue is? Having BW6 in zexe is something we're interested in too and noticed that you've done a good deal of thinking around this and seems to have started working on it. Would be very helpful

jon-chuang commented 4 years ago

Hi @ChaitanyaKonda , unfortunately, I have not begun work on it and will not be able to for the foreseeable future.

Feel free to work on it :) having your contribution will be great!

yelhousni commented 4 years ago

So I finished implementing and testing BW6-761 in zexe: https://github.com/yelhousni/zexe . There are about 10 branches testing different configurations with respect to the twist type (M or D), the twist degree (quadratic or sextic), the doubling/addition formulas (ABLR or AKLGL), the pairing type (ate and optimal ate) and the coordinates (affine vs projective). The fastest implem so far (which is the one described in the paper) is in the branch youssef/BW6-761-Fq-ABLR-2ML-M; it implements an optimal ate pairing in ABLR projective coordinates with a sextic M-twist (G2 over Fq, 2 small Miller loops (Alg.5) and a lattice-based optimized final exp (Alg.6)).

The benchmarking shows that BW6-761 is 12X faster for a full pairing and ~22X faster for a Groth16 proof verification (~4 Miller loops and 1 final exp). This is surprising but mainly due to the choice of affine coordinates in SW6. In fact, projective coordinates are always faster for a pairing except in some cases (for instance SW6) as suggested in LautMontNae10 when the inverse is optimized in tower extensions, which is not yet the case for SW6 implementation I guess #20.

full pairing

Capture d’écran 2020-05-28 à 12 45 48

Miller loop

Capture d’écran 2020-05-28 à 12 50 44

final exp

Capture d’écran 2020-05-28 à 12 55 36

There are few optimizations to do further:

Pratyush commented 4 years ago

This is great work @yelhousni ! If you’d look to open PR from that branch then that would be great and we can merge this version of the code already, and the optimizations as followups :)