data61 / MP-SPDZ

Versatile framework for multi-party computation
Other
951 stars 279 forks source link

Offline phase benchmarking and data generated #169

Closed Vomvas closed 3 years ago

Vomvas commented 3 years ago

Hello,

I wanted to ask if it's still the case that the MASCOT offline phase cannot be directly benchmarked, rather only via a "time_offline - time_online" approach.

My understanding is that tools like Setup-online.sh and the various -offline binaries are not suitable to generate the required data because they do so "insecurely" and also either generate triples or bits at a time.

1) Does insecurely mean that generated data is reused within one execution? If so, why is there a prompt of how many triples to generate in ot-offline.x? 2) Is this why this operation takes barely a second but the actual offline phase of my program is much more time consuming? 3) Can I generate first the triples, then the bits and use them to run the online phase? If so, how can I specify the number of bits to generate? Would this be secure? 4) Is there a way to take advantage of the phase separation in MASCOT/Overdrive, specifically the parallel and multi-threaded generation of preprocessed data, or is this already included in the secure offline+online execution of my program (when I run it with mascot.sh -L)?

Thanks so much!

mkskeller commented 3 years ago

I wanted to ask if it's still the case that the MASCOT offline phase cannot be directly benchmarked, rather only via a "time_offline - time_online" approach.

Some material like triples and bits can be benchmarked separately but not together.

My understanding is that tools like Setup-online.sh and the various -offline binaries are not suitable to generate the required data because they do so "insecurely" and also either generate triples or bits at a time.

setup-online.sh does not run the actual protocol but insecurely generate the expected output. The *-offline.x binaries indeed only generate one data type at the time.

1. Does _insecurely_ mean that generated data is reused within one execution? If so, why is there a prompt of how many triples to generate in `ot-offline.x`?

Reusing is one effect of the -DINSECURE, but there is also the fact the material is not generated properly when using setup-online.sh or Fake-Offline.x. ot-offline.x however does everything securely.

2. Is this why this operation takes barely a second but the actual offline phase of my program is much more time consuming?

What do you mean by "this operation"? Generally, the offline phase run properly is much more expensive than the online phase in MASCOT and Overdrive.

3. Can I generate first the triples, then the bits and use them to run the online phase? If so, how can I specify the number of bits to generate? Would this be _secure_?

In theory this is possible, but the code does not allow for this because the MAC key is generated anew every run, so the triples and bits cannot be used together.

4. Is there a way to take advantage of the phase separation in MASCOT/Overdrive, specifically the parallel and multi-threaded generation of preprocessed data, or is this already included in the secure offline+online execution of my program (when I run it with `mascot.sh -L`)?

The combined execution uses as many threads as there are in the specified computation (for example using @for_range_opt_multithread). It is not possible to specify a different number of threads for preprocessing.

Vomvas commented 3 years ago

Thanks for the quick reply!

Some material like triples and bits can be benchmarked separately but not together.

I see. A question regarding these "bits" (not considering daBits for mixed circuits for now) required by arithmetic circuits: does this refer to everything other than a multiplication triple, such as random input gadgets, secret shares, secret MACs and MAC keys etc or to something different?

What do you mean by "this operation"? Generally, the offline phase run properly is much more expensive than the online phase in MASCOT and Overdrive.

Yes this makes sense. By "this operation" I mean the execution of ot-offline.x. For example, when I run ot-offline.x to generate as many triples as the tutorial requires it completes in a couple seconds, then the online phase takes a couple more seconds, whereas the offline phase takes roughly 10 seconds (4 players and 10ms network RTT). Does this make sense or am I missing something? Generating bits also seems very fast, around 1 second long...

The combined execution uses as many threads as there are in the specified computation (for example using @for_range_opt_multithread). It is not possible to specify a different number of threads for preprocessing.

So there is no similar functionality like the ot-offline.x --nthreads for the combined execution?

mkskeller commented 3 years ago

Some material like triples and bits can be benchmarked separately but not together.

I see. A question regarding these "bits" (not considering daBits for mixed circuits for now) required by arithmetic circuits: does this refer to everything other than a multiplication triple, such as random input gadgets, secret shares, secret MACs and MAC keys etc or to something different?

By bits I mean secret random bits (0 or 1 50/50) that are important for non-linear computation such as comparisons.

What do you mean by "this operation"? Generally, the offline phase run properly is much more expensive than the online phase in MASCOT and Overdrive.

Yes this makes sense. By "this operation" I mean the execution of ot-offline.x. For example, when I run ot-offline.x to generate as many triples as the tutorial requires it completes in a couple seconds, then the online phase takes a couple more seconds, whereas the offline phase takes roughly 10 seconds (4 players and 10ms network RTT). Does this make sense or am I missing something? Generating bits also seems very fast, around 1 second long...

What do you execute with ot-offline.x. The following takes five seconds locally on my computer:

for i in 0 1 2 3; do ./ot-offline.x -N 4 -c -p $i -B -P -n 5000 -l 5 & true; done

This generates 5000 bits modulo a prime in five rounds with 4 parties, which is roughly what's required in the tutorial. The triples are around two seconds, so 10 seconds including online phase doesn't seem that off.

The combined execution uses as many threads as there are in the specified computation (for example using @for_range_opt_multithread). It is not possible to specify a different number of threads for preprocessing.

So there is no similar functionality like the ot-offline.x --nthreads for the combined execution?

Indeed, there isn't.

Vomvas commented 3 years ago

By bits I mean secret random bits (0 or 1 50/50) that are important for non-linear computation such as comparisons.

I see, thanks for the clarification.

What do you execute with ot-offline.x. The following takes five seconds locally on my computer:

for i in 0 1 2 3; do ./ot-offline.x -N 4 -c -p $i -B -P -n 5000 -l 5 & true; done

I was not using -P and -l 5, now the numbers are much closer indeed.

Indeed, there isn't.

Thank you!

I tried the Overdrive protocols similarly to ot-offline.x. For instance I generated triples using

for i in 0 1 2 3; do ./simple-offline.x -N 4 -p $i -n 1000 & true; done

I am expecting this to be faster than ot-offline.x, however I get a throughput of ~600 triples/sec compared to the ~1200 triples/sec I get with mascot. However I could not get -n 500 to work in order to achieve a fair comparison. The program seems to generate a large number of triples, around 600K, regardless of -n. I also noticed that the protocol spends some time in the beginning to check the ZK proofs. I assume these are linear to the number of triples requested and the throughput would not change much if I request, say, 60K triples instead of 600K?

mkskeller commented 3 years ago

The throughput should indeed be constant in the longer term. The Overdrive protocols inherently work in much larger batches due to the number of slots in the ciphertext and the nature of the zero-knowledge proofs whereas MASCOT works with any batch size. Furthermore, the trade-off between Overdrive and MASCOT is that Overdrive requires more computation while MASCOT requires more communication. Run locally, communication is relatively cheap, so it's less likely that Overdrive beats MASCOT.