arkworks-rs / snark

Interfaces for Relations and SNARKs for these relations
https://www.arkworks.rs
Apache License 2.0
769 stars 203 forks source link

Ask for suggestions about encrypted input and output for groth16 #350

Closed amyzx closed 3 years ago

amyzx commented 3 years ago

Hi, I am a student learning zkSNARK with your library. And I want to implement the following situation.

The client C uses the server S to compute a function, e.g, x^3 + x = y. S will generate groth16 proof for its computation result. Currently, I have realized the proof and verify functions for plaintext x and y. Howerver, C doesn't want S to know its x and y. So C may encrypt x to E(x) and decrypt E(y) received from S.

Do you think it's a reasonable or practical situation? Is there any encryption schemes which support the combination between encryption and zkSNARK? I have searched for a long time, but have no idea how to implement it. I am not sure whether homomorphic encryption can support this situation. Could you give me any suggestions?

Thanks a lot! Looking forward to your reply.

weikengchen commented 3 years ago

~There is a paper called LegoSNARK that is related to this use case. It requires a variant of the Groth16 proof system, which can be done via a small modification.~

Sorry, what I mentioned above is not relevant.

It seems that you need to hide x and y from the servers who generate the proof. In this case, you need to use some sort of homomorphic encryption and verify the computation over the ciphertexts is done correctly inside zkSNARK. This can be pretty heavy.

It is doable, but there are also no existing implementations for it.

weikengchen commented 3 years ago

The issue is actually here: the fact that the client only sends E(x) and asks the servers to return E(y) already requires somewhat homomorphic encryption.

If we relax the requirement a little bit, in that the client sends E(x), E(x^2), E(x^3) to S, it will be easier to prove. However, as always a problem in delegated computation --- it might be faster for the client to do the entire thing themselves.

amyzx commented 3 years ago

The issue is actually here: the fact that the client only sends E(x) and asks the servers to return E(y) already requires somewhat homomorphic encryption.

If we relax the requirement a little bit, in that the client sends E(x), E(x^2), E(x^3) to S, it will be easier to prove. However, as always a problem in delegated computation --- it might be faster for the client to do the entire thing themselves.

Hi, thanks for your detailed reply. Do you mean the following description? I am not sure I fully understand your reply.

The cilent needs somewhat homomorphic encryption to protect x and y, and when generating proofs zkSNARK also uses homomorphic encryption to hide information. These two encryptions may be really heavy and not compatible, which is the reason why this situation is impracticable.

weikengchen commented 3 years ago

zkSNARK actually is not an encryption, and here the issue is that proving that S actually provides the correct E(y) in zkSNARK is not easy.

amyzx commented 3 years ago

zkSNARK actually is not an encryption, and here the issue is that proving that S actually provides the correct E(y) in zkSNARK is not easy.

But in zkSNARK, it uses some encryption techniques to hide some parameters. I think these operations may not work correctly when the input x is encrypted.

Let me see... So zkSNARK is not suitable for this situation.

Could you give me some other suggestions about how to let the server to prove that it computes E(y) correctly without knowing x and y? Thanks a lot!

weikengchen commented 3 years ago

It can still be done in zkSNARK yet the implementation effort will be high, and the performance will not be good.

I think you try to do the verifiable computation on ciphertexts, but FHE/SWHE is known for being very hard to do that. So, you may want to reconsider whether you should use FHE if you want this integrity guarantees.

amyzx commented 3 years ago

Thanks for your suggestions! I will search for other resources to figure out the solution, and also try to do some experiments with ark-snark. By the way, it would be nice if you could provide some tutorials about how to use ark-snark by giving some examples.

burdges commented 3 years ago

It's likely cheapest doing this with MPC protocols. I'd assume MPC would benefit from being implemented in Rust, especially with session types, but afaik the major toolkits https://wiki.mpcalliance.org/MP_SPDZ.html and https://wiki.mpcalliance.org/Scale_Mamba.html are both C++. You'll find other MPC libraries at https://github.com/rdragos/awesome-mpc

amyzx commented 3 years ago

It's likely cheapest doing this with MPC protocols. I'd assume MPC would benefit from being implemented in Rust, especially with session types, but afaik the major toolkits https://wiki.mpcalliance.org/MP_SPDZ.html and https://wiki.mpcalliance.org/Scale_Mamba.html are both C++. You'll find other MPC libraries at https://github.com/rdragos/awesome-mpc

Thanks a lot! These suggestions are really useful. I will also try to implement my situation with MPC.

amyzx commented 3 years ago

@weikengchen Hi, I am a beginner who try to use zkSNARK and hope to prove the correctness of two matrix multiplication, e.g, A*B = C. But there are some negative integers in these matrix. Does snark support negative elements' operation?

When I try to implement a circuit for this situation, I get the following errors. image

weikengchen commented 3 years ago

Nope. We do not have support for negative number computation so far. But what you can do is this:

for input x < 0,

amyzx commented 3 years ago

Thanks a lot! I will have try. By the way, are there any tutorials or docs that I can learn to use ark-snark?

weikengchen commented 3 years ago

@Pratyush has the plan to do something in that direction recently. But at this moment we only have arkworks-rs/r1cs-tutorial, but it is tailored for the constraint world.

amyzx commented 3 years ago

Thanks! Looking forward to seeing your excellent work.

amyzx commented 2 years ago

Hi, sorry to bother you @weikengchen . I have a question about the meaning of public input, constant and witness.

I wrote a circuit as follows. A hopes to prove that: *Y=XW, it knows all parameters, so it calculates the result Y by using i64 X and W. Finally, it compares the generated Fq by enforce_equal. B only knows X and Y**, it can verify the proof.

Since X and W are feed into the circuit without any processing, could B extract the information of W?

In a more common case, if I have a complex neural network, I have to prove the final result Y is correct. I could just compute the result by using all plaintext data, and then compare the generated results' Fq.

Do you think this is a reasonable proof process? I am not sure whether all inputs for a circuit should be converted to Fq by using FpVar::\<Fq>::new_input/new_witness/Constant.

use ark_r1cs_std::{fields::fp::FpVar};
use ark_relations::r1cs::{ConstraintSynthesizer, ConstraintSystemRef, SynthesisError}; 
use ark_r1cs_std::{eq::EqGadget}; 
extern crate csv;
extern crate ndarray;
extern crate ndarray_csv;
use crate::{util_zx_batch::*};
use ndarray::Array2; 

pub type FqVar = FpVar<Fr>;
pub type Fq = Fr;

#[derive(Debug, Clone)]
pub struct CryptenMatMulCircuit { 
    pub matmul_paras: Vec<Vec<Array2<i64>>>,    // contains X and W
    pub matmul_res: Vec<Array2<i64>>,                 // Y = X*W
    pub scale: i64,
}

impl ConstraintSynthesizer<Fr> for CryptenMatMulCircuit {
    fn generate_constraints(self, cs: ConstraintSystemRef<Fr>) -> Result<(), SynthesisError> {
         // call matmul function to compute Y based on X and W, all in i64
        let res = beaver_matmul(self.matmul_paras.clone(), self.scale);

        let rank_num = self.matmul_res.len();      
        let m = self.matmul_res[0].shape()[0];     //  rows 
        let n = self.matmul_res[0].shape()[1];      //  cols  

        for rank in 0..rank_num {
            let res_var = generate_fqvar2d(cs.clone(), res[rank].clone()); 
            let matmul_res_var = generate_fqvar2d(cs.clone(), self.matmul_res[rank].clone());  
            for i in 0..m {
                for j in 0..n {  
                    res_var[[i,j]].enforce_equal(&matmul_res_var[[i,j]]).unwrap(); 
                }
            }  
        } 
        Ok(())
    }
}

pub fn generate_fqvar2d(cs: ConstraintSystemRef<Fq>, input: Array2<i64>) 
    -> Array2<FqVar> {   
    return input.mapv(|item| {
        let val:u64 = i64::unsigned_abs(item);
        let fq: Fq = val.into();

        let item_fq = FpVar::<Fq>::new_witness(cs.clone(), || Ok(fq)).unwrap();

        if item < 0 {  
            FpVar::<Fq>::negate(&item_fq).ok();
        } 
        return item_fq;
    });
}
weikengchen commented 2 years ago

I did not find a material online that explains public input/constant/witness well, so let me have a try.

Consider that I want to convince someone that Y can be obtained by applying a matrix W to X, without telling this person (verifier) what W is.

In the non-zero-knowledge setting, where W does not need to be hidden, the prover can convince the verifier by simply telling W. A zero-knowledge proof replaces the witness with a short proof, which does not leak W.

And now let me go back to the code.

There are three issues.

First, think about what the SNARK is actually proving. It seems that the entire proof systems have NO input, but only two copies of the witnesses. So, what the SNARK is proving, is indeed:

There is a matrix W_1 that equals W_2.

And this SNARK does not reason about (or know anything about) X and Y, which should be allocated as input, and the matrix multiplication needs to be checked inside the SNARK.

Second, note that folding i64 numbers in Fq has an implicit requirement that all entries in W is small, otherwise there will be overflowing. This has some additional challenges and can be quite heavy, as now we need to prove that each entry in W is bounded by a lower bound and a upper bound.

Third, remember that both matrix X and Y are in i64, and if we perform the computation in Fq for X times W, the results do not equal Y (but under some modular relations). This would require even more efforts to do a modular reduction.

So basically I would like to let you know that verifying matrix multiplication over i64 is indeed quite challenging in SNARK. It is better to relax the problem a little bit. Do you have a specific use case for this? Maybe there is a simpler way.

And back to your original problem. It seems that you can simply opt in to do a secure multiparty computation. Encryption is not actually needed.

weikengchen commented 2 years ago

And for additional information, if I were you, I would start with non-SNARK approach, which would give you better performance.

https://github.com/emp-toolkit/emp-zk https://eprint.iacr.org/2021/730.pdf

amyzx commented 2 years ago

Actually, I utilize CrypTen which is a framework based on SMC to realize a encrypted inference situation as described in a tutorial, where the model and input data are private.

However, I found that CrypTen uses the "honest-but-curious" threat model, which indicates every party performs all the computations correctly. I guess when one party gives wrong results to the other party, it may affect the other party's final result.

Therefore, I hope to use zkSNARK or other tools to prove the correctness of all computations, which can enhance the security of encrypted model inference. My aim is to prove the correctness of intermediate computations in SMC.

To do so, I saved all intermediate results during CrypTen encrypted inference. And I wrote a rust version of model inference procedures, which can load intermediate files and redo all computations to obtain the same result.

However, I have no idea how to combine with zkSNARK. Thus, I wrote the previous wrong code which just proves two matrix are the same. As for the issues you mentioned above,

  1. If I could alloate X and Y as input, and check all computations inside the SNARK. Do you think this is a reasonable proof?
  2. I could modify the setting in CrypTen, which could give me lower precision integers, such as i16 or i32. So I guess this could alleviate the overflow problem.
  3. I will have a try to rewrote my program but I still need some time to fix these errors.

Thanks for your detailed reply and suggestions. I really appreciate all suggestions you provide, and will have a look at emp-zk later.

weikengchen commented 2 years ago

One possible solution that may allow you to reuse a lot of your code is SCALE-MAMBA, which offers malicious security.

And still, i feel the EMP-ZK fits your use case pretty well.

amyzx commented 2 years ago

Hi, @weikengchen Sorry to bother you again. I have several questions.

Fristly, I am confused about Fr, Fp, Fq etc. The following is my understanding about some arkworks librarys. Please correct me if I am wrong.

In ark-bls12-381, Fr = Fp256<FrParameters> is a struct for a finite group which used FrParameters. FrParameters implements 3 traits including Fp256Parameters, FftParameters and FpParameters. FpParameters defines parameters used in G_1 such as the group order.

In ark-r1cs-std, FpVar<F> is a enum which represent variables in filed F. It will create AllocatedFp<F> struct which represent a variable in constraint system.

When I hope to create a circuit for a computation problem, e.g, i64 matrix multiplication. I firstly map i64 values to a finite field F by into function. This field could be Fp256 or Fp384. Then I allocate field items in constraint system by FpVar's functions including new_witness, new_input, Constant, which calls AllocatedFp::new_variable. Finally, I can use enforce_equal or enforce_cmp to add constraints.

The process is: i64 -> finite field element -> variable in constrain system -> add constraint

My questions are:

  1. How to choose a finite filed type, e.g Fr which represents Fp256, or Fq which represents Fp384?
  2. What's the difference between contant and public_input? I find contant only generates linear combination.
  3. What's the meaning of enforce_equal? When I create a circuit which only contains constants and use enforce_equal function, I always generate valid proof. Even when I put wrong values into the circuit, which makes me feel confused.

Secondly, I wrote a program to prove that i64 matrix multiplication is correct, where X and Y are public inputs, and W is the witness. But it does not work.

My questions are:

  1. What's the correct way to generate negative x's corresponding field elment fq? I have implemented the following two functions.
    
    fn to_fq(x: i64) -> Fq {
    let val:u64 = i64::unsigned_abs(x); 
    let mut fq: Fq = val.into();  
    if x< 0 {  
        fq = - fq;   // neg_fq = modulus - fq
    }  
    fq
    } 

fn to_fpvar(x: i64) -> FpVar { let val:u64 = i64::unsigned_abs(x); let fq: Fq = val.into();
let tmp = FpVar::::Constant(fq); if x< 0 {
FpVar::::negate_in_place(&tmp).ok(); }
tmp }



5. I find that even if I don't add `enforce_equal` constraint, I can still get the correct verification result.  I am confused about the meaning of this function.
    This is my project [source code](https://github.com/amyzx/arkworks_learn).

Looking forward to your reply! Thanks!