multiparty / jiff

JavaScript library for building web-based applications that employ secure multi-party computation (MPC).
https://multiparty.org/jiff/
MIT License
257 stars 52 forks source link

Computing performance as a mpc backbend of Conclave #224

Closed psy-mas closed 3 years ago

psy-mas commented 3 years ago

Hello, First of thanks for all your great job and I really appreciate it. I encounter some troubles while testing it out. I want to use jiff as the mpc-backend of the Conclave and successfully implemented it(Comorbidity queries). However, I found that it is a bit time-consuming in Aggregate Opration which contains a large number of bubble sort operations, and it is even slower than the implementation using Obliv-C. It takes about two hours to aggregate 20000 data in two parties. Are there some optimization options or is there something I missed? Once again, thanks for making such an awesome library!!!

KinanBab commented 3 years ago

Hey. I am not entirely familiar of how the aggregate operation is implemented in Conclave.

It would be helpful to determine whether the performance issues are because of the implementation of the aggregate with the JIFF backend itself, or the with the structure of the flow that you want Conclave to compute. It sounds like you are saying that the issue is within the aggregate implementation. If that's the case then your best bet is to either (1) re-implement the aggregate operation in a more efficient way (2) implement a more customized operation that matches exactly what you want, rather than a general aggregate.

In all cases, given that you say that the operation uses bubble sort, I am not surprised that it takes a long time. I recommend using something more efficient, like batcher (odd-even) sort, or some sort of sorting network. If (sorted) data is being accessed by a secret index, you can speed that one up considerably by using an MPC-version of binary search (linear mutliplications but logarithmic comparisons). It also might make sense to play around with the JIFF configuration, if you can represent your values in a smaller field than the default one, this will speed up comparison (and thus sorting) by quite a bit.

KinanBab commented 3 years ago

@bengetch might have more information about this, he is far more familiar with conclave than me.

psy-mas commented 3 years ago

Hey thank you very much for the professional explanation and the useful suggestion @KinanBab

In fact, I have represented my values in a really small field(the prime modulus was set to 10103 while the largest number that I compute is 10000). It is possible that the original implementation of conclave was so time-consuming that I did not notice any improvement even if I changed the prime modules to a small one.

Indeed, bubble-sort is inefficient as it has a worst-case and average complexity of О(n^2).I am trying to use somthing like odd-even sort in aggregate operation with lower time complexity to gain improvement. Also, pardon me for offense, I am looking forward to your suggestions if you don't mind @bengetch , LoL. I would appreciate it very much.

Once again, thanks for making such an awesome library!!! Best wishes!!

bengetch commented 3 years ago

Hi @psy-mas ! I actually can't remember why bubble-sort is used by default in conclave, I had implemented odd-even sort here, but must have never hooked it up more generally.

If you reference that function in this function (instead of bubbleSort) then you should be good to go. Also of interest might be congregation, which is a re-implementation of conclave. The hybrid & public operators are unfortunately not implemented in that library right now, but should be coming down the line whenever I get the time to do it. Congregation also has much more documentation than conclave, which might be helpful.

Thanks for your interest! Good luck

psy-mas commented 3 years ago

Hi @bengetch , thanks for replying. Good job for implementing congregation. I've installed it and found that it's so much easier to use as it's documented so well and its explanation is very clear. I would spend more time testing it.

Also, I've replaced the bubbleSort function with oddEvenSort function according to your instructions and run it up successfully. Unfortunately, I found the performance of aggregate operation using jiff backend is still very poor even after I replaced the bubbleSort with oddEvenSort, either in Conclave or in congregation.. Apart from these, is there any way to optimize?

Have you conducted a comparative experiment on the performance between jiff and obliv-c backend? I tested the performance of using Obliv-C as the backend, and it is very close to the result stated in your paper. Theoretically, just like the proof in the paper, JIFF should be much more efficient than Obliv-C like Sharemind, because there are not so many circuits nor so much communication cost. Or maybe I made a mistake when using it.However, I am still testing

Thank you for your outstanding contribution to the community. Best wishes!!

bengetch commented 3 years ago

@psy-mas Thanks for the kind words! I'm glad congregation is easier to use, I had several intentions with the rewrite and usability was definitely one of them.

Unfortunately, I'm not aware of any further optimizations in terms of query compilation for the aggregate operations. It might be worth noting that the execution time for the aggregation operations will increase as a function of the number of distinct keys in the input datasets, rather than the sizes of the datasets themselves. This is because congregation will split such queries so as to do as much local cleartext processing as possible, which eliminates key duplicates when doing grouped aggregations.

As far as other ways of speeding up execution times, I've looked into a couple other MPC backends like SCALE-MAMBA and Secrecy, but just haven't had time to learn / implement them.

Regarding Obliv-C vs. JIFF, I think you might find the performance times to be more context-dependent. As a garbled-circuit based system, Obliv-C is less latency-bound than a system like JIFF (i.e. - requires far less messages to be sent as part of the protocol). The communication overhead of JIFF is more a feature of secret-sharing based systems in general, though. I haven't done a lot of work in circuits vs. sharing approaches, though, so I'm not sure I can comment much further on that.

Hope that helps! Don't hesitate to open a pull request or an issue on congregation in the future if you'd like. I'd be happy to help explain anything you might get stuck with while developing / using it.

psy-mas commented 3 years ago

@bengetch Thanks for the explanation !! I have a better understanding of it now. I I also read this paper (Secrecy) a few weeks ago, and the paper said that the implementation will be open sourced in the near future.

So much interest in your work!! In order not to miss any wonderful work, I have followed you already. Looking forward to your implementation of SCALE-MAMBA and Secrecy!

Sorry for the mistake , I' @ the wrong one before. Best wishes!

psy-mas commented 3 years ago

I am closing this Issue,thanks!