emp-toolkit / emp-agmpc

Global-Scale Secure Multiparty Computation

AG-MPC with FerretCOT? #12

Open weikengchen opened 3 years ago

weikengchen commented 3 years ago

It seems that the FerretCOT architecture has stabilized. Will AG-MPC use FerretCOT by default? Any suggestions on the number of threads to use? (How significant is the computation cost in FerretCOT?)

I would cc @carlweng here!

weikengchen commented 3 years ago

By the way, I am also working to plug it in. May push a PR if I make some good progress.

weikengchen commented 3 years ago

An update.

I have implemented a prototype that replaces IKNP with FerretCOT: https://github.com/weikengchen/emp-agmpc/commit/7814c9ea29f80979324a57f4367095f25f14fa76.

It is not ready for a PR or for deployment, as the code sometimes segfaults and sometimes runs fine.

There are still three challenges in using the current implementation of FerretCOT by default in emp-agmpc:

  1. FerretCOT needs many NetIO instances for multithreading, which would require changes to emp-agmpc to create and pass them. A useful alternative might be to use multithreading only for the computation part and not for the communication part (which ideally would not be the bottleneck?); that would simplify the code.

  2. FerretCOT generates a very large number of COTs at once: roughly n COTs per call, as the output of the LPN. In practice, this might not suit one-time garbling of a small circuit (many COTs are wasted), but it would be useful if the circuit is large or many circuits need to be garbled. One potential solution is to compute only part of the LPN output, depending on how many COTs are needed. From the code and the construction, this seems possible.

  3. FerretCOT is very efficient when each batch of COTs gets many cores, yet AG-MPC needs a pair of COT batches between every two parties. In a multiparty setting, fully exploiting FerretCOT would therefore need a lot of cores (num_of_party * num_COT_core).
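Points 2 and 3 can be made concrete with a little arithmetic. The following is a minimal sketch, not code from emp-agmpc or emp-ot: the function names are hypothetical, and the batch size is a caller-supplied parameter rather than a constant taken from FerretCOT.

```cpp
#include <cassert>
#include <cstdint>

// Fraction of the generated COTs that go unused when `needed` COTs are
// requested and every FerretCOT call yields `batch` COTs.
// `batch` is an assumption supplied by the caller, not an emp-ot constant.
double wasted_fraction(std::int64_t needed, std::int64_t batch) {
    assert(needed >= 0 && batch > 0);
    std::int64_t calls = (needed + batch - 1) / batch;  // ceiling division
    if (calls == 0) return 0.0;                         // nothing requested
    std::int64_t produced = calls * batch;
    return static_cast<double>(produced - needed) /
           static_cast<double>(produced);
}

// Core count from point 3, using the formula given in the text:
// num_of_party * num_COT_core.
std::int64_t cores_for_full_speed(std::int64_t num_of_party,
                                  std::int64_t num_cot_core) {
    assert(num_of_party > 0 && num_cot_core > 0);
    return num_of_party * num_cot_core;
}
```

For example, a circuit that needs a quarter of one batch leaves three quarters of the generated COTs unused, which is the small-circuit concern in point 2.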

carlweng commented 3 years ago

In regard to FerretCOT:

  1. It is possible to use only 1 NetIO, but some parts of the code will have to be restructured. It is definitely reasonable to offer this choice in the future.
  2. I think for now we can assume it will be used for large circuits. If the circuit is small, the user can use IKNP directly.
  3. The performance of FerretCOT does not deteriorate much when I change from 4 threads to only 1 thread, so it does not rely heavily on multiple cores. You can use 1 thread for now; that way only 1 NetIO is actually needed.

wangxiao1254 commented 3 years ago

emp now supports opening many NetIO instances on the same port, so there is not much harm in having many of them, right?

(Sent by email in reply to CK Weng's comment of Sun, Dec 6, 2020, 10:31 AM, quoted above.)

weikengchen commented 3 years ago

Thanks! I updated my prototype so that it approximates the amortized FUNC_IND cost from how many OTs are ready and how many OTs are used, with the following code:

if (party == 1) {
    // COTs ready vs. COTs actually used in this run.
    int cot_limit = mpc->fpre->abit->abit1[2]->ot_limit;
    int cot_used = mpc->fpre->abit->abit1[2]->ot_used;

    cout << "COT limit/used: " << cot_limit << "\t" << cot_used << " \n" << flush;
    // Scale the measured time t2 by the fraction of ready COTs that were used.
    cout << "FUNC_IND adjusted:\t" << party << "\t" << t2 * (1.0 * cot_used / cot_limit) << " \n" << flush;
}

which would help people who want to do a benchmark.
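For anyone who wants to reuse the adjustment outside the benchmark binary, here is a minimal standalone sketch of the same scaling. The function name `amortized_cost` is hypothetical, and the guard for a zero limit is my assumption about what a caller would want; neither is part of the prototype.

```cpp
#include <cassert>
#include <cstdint>

// Scale a measured time by the fraction of ready COTs that were actually
// used, mirroring t2 * (1.0 * cot_used / cot_limit) from the snippet above.
double amortized_cost(double measured_time,
                      std::int64_t cot_used,
                      std::int64_t cot_limit) {
    assert(cot_used >= 0 && cot_limit >= 0);
    // Assumption: with no COTs generated there is nothing to amortize,
    // so report the measured time unchanged instead of dividing by zero.
    if (cot_limit == 0) return measured_time;
    return measured_time *
           (static_cast<double>(cot_used) / static_cast<double>(cot_limit));
}
```

If a run measures 8 seconds but consumes only a quarter of the ready COTs, the adjusted figure is 2 seconds.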

The corresponding PR is here: https://github.com/weikengchen/emp-agmpc/commit/364453cccdde71511585250c52e9550e0aadefeb

This seems quite similar to HE-based SPDZ, in that the offline phase may produce many more triples than a specific program needs. In HE-based SPDZ, this is due to the packing of FHE ciphertexts and the batching of multiple ciphertexts into one network packet to alleviate the effect of network latency. Here, it is due to LPN.

weikengchen commented 3 years ago

(And the prototype had occasional segfaults because all the FerretCOT instances I used wanted to read/write the same pre_ot_data_reg_recv/send files. ~Will fix soon.~ It has been fixed.)
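The thread does not say how the collision was fixed. One possible approach, sketched here under my own assumptions, is to derive a distinct file name for each (party, peer) pair so that no two instances touch the same file. The helper `pre_ot_file` and its naming scheme are hypothetical, and whether a given FerretCOT version lets the caller choose the file name should be checked against emp-ot itself.

```cpp
#include <string>

// Build a per-pair file name from a base such as "pre_ot_data_reg_send".
// The underscore separators keep (1, 12) and (11, 2) from colliding.
// Hypothetical helper: not part of emp-ot or emp-agmpc.
std::string pre_ot_file(const std::string& base, int party, int peer) {
    return base + "_" + std::to_string(party) + "_" + std::to_string(peer);
}
```

With this scheme, party 1 talking to party 2 would use `pre_ot_data_reg_send_1_2`, while party 2 talking to party 1 would use `pre_ot_data_reg_send_2_1`.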