openjournals / joss-reviews

Reviews for the Journal of Open Source Software
Creative Commons Zero v1.0 Universal
722 stars 38 forks source link

[REVIEW]: LibSWIFFT - A fast C/C++ Library for the SWIFFT Secure Homomorphic Hash Function #3040

Closed whedon closed 3 years ago

whedon commented 3 years ago

Submitting author: @yarongvili1 (Yaron Gvili) Repository: https://github.com/gvilitechltd/LibSWIFFT Version: v1.1.1 Editor: @danielskatz Reviewers: @gcdeshpande, @jarrah42, @henrykironde Archive: 10.5281/zenodo.4678576

:warning: JOSS reduced service mode :warning:

Due to the challenges of the COVID-19 pandemic, JOSS is currently operating in a "reduced service mode". You can read more about what that means in our blog post.

Status

status

Status badge code:

HTML: <a href="https://joss.theoj.org/papers/af9654116bfbcd324cb9483c1e527b98"><img src="https://joss.theoj.org/papers/af9654116bfbcd324cb9483c1e527b98/status.svg"></a>
Markdown: [![status](https://joss.theoj.org/papers/af9654116bfbcd324cb9483c1e527b98/status.svg)](https://joss.theoj.org/papers/af9654116bfbcd324cb9483c1e527b98)

Reviewers and authors:

Please avoid lengthy details of difficulties in the review thread. Instead, please create a new issue in the target repository and link to those issues (especially acceptance-blockers) by leaving comments in the review thread below. (For completists: if the target issue tracker is also on GitHub, linking the review thread in the issue or vice versa will create corresponding breadcrumb trails in the link target.)

Reviewer instructions & questions

@gcdeshpande & @jarrah42 & henrykironde, please carry out your review in this issue by updating the checklist below. If you cannot edit the checklist please:

  1. Make sure you're logged in to your GitHub account
  2. Be sure to accept the invite at this URL: https://github.com/openjournals/joss-reviews/invitations

The reviewer guidelines are available here: https://joss.readthedocs.io/en/latest/reviewer_guidelines.html. Any questions/concerns please let @danielskatz know.

Please start on your review when you are able, and be sure to complete your review in the next six weeks, at the very latest

Review checklist for @gcdeshpande

Conflict of interest

Code of Conduct

General checks

Functionality

Documentation

Software paper

Review checklist for @jarrah42

Conflict of interest

Code of Conduct

General checks

Functionality

Documentation

Software paper

Review checklist for @henrykironde

Conflict of interest

Code of Conduct

General checks

Functionality

Documentation

Software paper

whedon commented 3 years ago

Hello human, I'm @whedon, a robot that can help you with some common editorial tasks. @gcdeshpande, @jarrah42 it looks like you're currently assigned to review this paper :tada:.

:warning: JOSS reduced service mode :warning:

Due to the challenges of the COVID-19 pandemic, JOSS is currently operating in a "reduced service mode". You can read more about what that means in our blog post.

:star: Important :star:

If you haven't already, you should seriously consider unsubscribing from GitHub notifications for this (https://github.com/openjournals/joss-reviews) repository. As a reviewer, you're probably currently watching this repository which means for GitHub's default behaviour you will receive notifications (emails) for all reviews 😿

To fix this do the following two things:

  1. Set yourself as 'Not watching' https://github.com/openjournals/joss-reviews:

watching

  1. You may also like to change your default settings for this watching repositories in your GitHub profile here: https://github.com/settings/notifications

notifications

For a list of things I can do to help you, just type:

@whedon commands

For example, to regenerate the paper pdf after making changes in the paper's md or bib files, type:

@whedon generate pdf
whedon commented 3 years ago
Software report (experimental):

github.com/AlDanial/cloc v 1.88  T=0.10 s (461.0 files/s, 41404.9 lines/s)
-------------------------------------------------------------------------------
Language                     files          blank        comment           code
-------------------------------------------------------------------------------
C++                              8            145            281           1558
reStructuredText                 8            208            135            359
Markdown                         8            100              0            253
C/C++ Header                    10             88            310            225
C                                4             24            112            144
CMake                            4             21              0            113
CSS                              2              0              3             75
Python                           1             20             31             29
DOS Batch                        1              8              1             26
YAML                             1              4              7             11
make                             1              4              7              9
-------------------------------------------------------------------------------
SUM:                            48            622            887           2802
-------------------------------------------------------------------------------

Statistical information for the repository 'e26f019dc3182eb8d3673d81' was
gathered on 2021/02/15.
The following historical commit information, by author, was found:

Author                     Commits    Insertions      Deletions    % of changes
Yaron Gvili                      4          2426              2          100.00

Below are the number of rows from each author that have survived and are still
intact in the current revision:

Author                     Rows      Stability          Age       % in comments
Yaron Gvili                2424           99.9          0.0               22.90
whedon commented 3 years ago

PDF failed to compile for issue #3040 with the following error:

Can't find any papers to compile :-(

danielskatz commented 3 years ago

@whedon generate pdf from branch joss-paper

whedon commented 3 years ago
Attempting PDF compilation from custom branch joss-paper. Reticulating splines etc...
whedon commented 3 years ago

:point_right::page_facing_up: Download article proof :page_facing_up: View article proof on GitHub :page_facing_up: :point_left:

danielskatz commented 3 years ago

@whedon check references from branch joss-paper

whedon commented 3 years ago
Attempting to check references... from custom branch joss-paper
whedon commented 3 years ago
Reference check summary (note 'MISSING' DOIs are suggestions that need verification):

OK DOIs

- 10.1007/978-3-540-71039-4_4 is OK

MISSING DOIs

- None

INVALID DOIs

- None
danielskatz commented 3 years ago

@gcdeshpande and @jarrah42 - Thanks for agreeing to review this submission.

This is the review thread for the paper. All of our communications will happen here from now on. Both reviewers have checklists at the top of this thread with the JOSS requirements. As you go over the submission, please check any items that you feel have been satisfied. There are also links to the JOSS reviewer guidelines.

Please read the first couple of comments in this issue carefully, so that you can accept the invitation from JOSS and be able to check items, and so that you don't get overwhelmed with notifications from other activities in JOSS.

The JOSS review is different from most other journals. Our goal is to work with the authors to help them meet our criteria instead of merely passing judgment on the submission. As such, the reviewers are encouraged to submit issues and pull requests on the software repository. When doing so, please mention openjournals/joss-reviews#3040 so that a link is created to this thread (and I can keep an eye on what is happening). Please also feel free to comment and ask questions on this thread. In my experience, it is better to post comments/questions/suggestions as you come across them instead of waiting until you've reviewed the entire package.

We aim for reviews to be completed within about 2-4 weeks. Please let me know if any of you require some more time. We can also use Whedon (our bot) to set automatic reminders if you know you'll be away for a known period of time.

Please feel free to ping me (@danielskatz) if you have any questions/concerns.

danielskatz commented 3 years ago

@whedo add @henrykironde as reviewer

danielskatz commented 3 years ago

@whedon add @henrykironde as reviewer

whedon commented 3 years ago

OK, @henrykironde is now a reviewer

danielskatz commented 3 years ago

👋 @gcdeshpande, @jarrah42, @henrykironde - It looks like you all have started, which is great. I just want to check in and see how things are going.

whedon commented 3 years ago

:wave: @gcdeshpande, please update us on how your review is going (this is an automated reminder).

whedon commented 3 years ago

:wave: @jarrah42, please update us on how your review is going (this is an automated reminder).

jarrah42 commented 3 years ago

How are the functionality and performance claims normally verified? Does this require doing a performance assessment, or is there some other way to do this?

danielskatz commented 3 years ago

It's up to the reviewers to decide, to some extent. For example, https://joss.readthedocs.io/en/latest/review_criteria.html#functionality says

Reviewers are expected to install the software they are reviewing and to verify the core functionality of the software.

Which to me means that you should check enough to have reasonable confidence in the rest.

henrykironde commented 3 years ago

I will try to finish the evaluation by the end of this week.

danielskatz commented 3 years ago

Thanks @henrykironde!

danielskatz commented 3 years ago

👋 @gcdeshpande and @jarrah42 - how are your reviews coming?

jarrah42 commented 3 years ago

I've finally installed it on a VM but the tests fail. The installation didn't proceed exactly as documented. Not sure what to do at this point.

gvilitech commented 3 years ago

I've finally installed it on a VM but the tests fail. The installation didn't proceed exactly as documented. Not sure what to do at this point.

Could you open an issue at https://github.com/gvilitechltd/LibSWIFFT/issues in which you describe the steps you took and the output you observed? We'll look into it.

Assuming you're using Linux, if you haven't tried the following already, please try the quick start given at the top of the LibSWIFFT repository page:

Quick start on Linux: Ensure docker is installed and runnable, clone this repository and go to its root directory, and run the command docker build . && docker run --rm -it $(docker build -q .) to build and test LibSWIFFT.

henrykironde commented 3 years ago

I like the Quick start instructions 👍

jarrah42 commented 3 years ago

I opened #17 a couple of days ago on the installation issue. This is a vanilla Ubuntu 20.04 so I would expect the tests to pass, but I'll open another issue when I get some time to look at it again.

jarrah42 commented 3 years ago

Ok, I've completed my review. I was unable to get the performance tests to succeed on a VM, however I finally got them to work on my Mac using the docker container. Other comments:

henrykironde commented 3 years ago

@gvilitech I am still trying to find away to measure the performance(Last step). Is it possible to add comparable performance tests against the other implementations? This is where you call all the api providing them with the same input with a goal to have LibSWIFFT at least as fast as other implementations of SWIFFT and often faster.

danielskatz commented 3 years ago

Thanks @jarrah42 and @henrykironde - I hope we hear back from @gvilitech / @yarongvili1 soon. And @gcdeshpande, any update?

yarongvili1 commented 3 years ago

@jarrah42: Thanks for your review and comments. Regarding the performance tests, their results tend to vary considerably across environments and so they were not made a part of the default test suite. In order to get them to work on a VM, one thing I would suggest trying is configure the VM as follows: 2+ cores available to the VM, 1+GB RAM, VT-x/AMD-V, Nested Paging, PAE/NX, KVM Paravirtualization. To your comments, in order:

I intend to create issues for your points as soon as their asks become clear to me.

@henrykironde: There is a plan to write a substantial comparative performance analysis in a future academic work that will be using LibSWIFFT. Currently, the measure used for comparing LibSWIFFT to other libraries is cycles per byte (cpb), and we have these numbers:

One thing I could try is to measure cpb for both on the same hardware, assuming it's easy enough to do with the K2SN-MSS code.

danielskatz commented 3 years ago

Re the substantial scholarly effort part, the editors decided that it was sufficient for this to proceed to review, but the reviewers are still allowed to bring this up as a point they want to discuss. @jarrah42 - what is your concern here?

henrykironde commented 3 years ago

One thing I could try is to measure cpb for both on the same hardware, assuming it's easy enough to do with the K2SN-MSS code.

@yarongvili1 I do recommend that we get some tests on at least one of the other referenced tools.

yarongvili1 commented 3 years ago

Looking at K2SN-MSS repository linked from the K2SN-MSS paper, I see it does not have build instructions nor build scripts. A quick attempt I made to build the code failed with lots of compilation errors. I'll see what I can do.

Note that the other implementation, which is the original one, does not support AVX2 and so is not a great point of reference for performance purposes. It would not make sense to spend effort on it to make it support AVX2.

yarongvili1 commented 3 years ago

OK, it was far from convenient, however I was able to modify the K2SN-MSS code so as to get a comparable performance test of its fastest SWIFFT function, the 16-bit one. The modified code led to the following comparison results:

This nicely supports the performance claim of LibSWIFFT. @henrykironde, if you like, you may use the provided instructions to run the performance tests of both repositories on your hardware of choice.

henrykironde commented 3 years ago

@yarongvili1 thank you for the update. Since this paper and software show that LibSWIFFTis a fast C/C++ Library for the SWIFFT Secure Homomorphic Hash Function. We should submit a PR to https://github.com/anon1985/K2SN-MSS.

Then if we could include tests the two libraries where LibSWIFFT: ~[4.29] cpb is tested to be at most (<=) K2SN-MSS: ~[5.76] cpb.

This could be part of the continuous integrated tests or we add the instructions to manually test it in the Readme file of LibSWIFFT.

I do believe this will be the most important part of finalizing this evaluation and confirming that the software is indeed as defined. This I believe will cover the Functionality, Performance, State of the field including the Substantial scholarly effort based on the main reference K2SN-MSS paper.

Please let me know what you think.

yarongvili1 commented 3 years ago

@henrykironde, let me make sure I understand what you have in mind.

Must the PR to K2SN-MSS must be accepted in order for this submission to proceed? I don't have control over this acceptance decision. I have doubts the K2SN-MSS repository owners would even respond, since it has been 2 years since the last commit to that repository. Moreover, their repository does not build as it is, suggesting it was put there as part of publishing their paper but not much more. The approach taken is to fork their repository and make the minimal (and not elegant) changes needed to get a comparable performance result, so a simple solution could be to refer to the fork from LibSWIFFT documentation. What do you think?

Regarding continuous integration testing, there is a problem that GitHub Workflows provides a (virtual) host with limited resources. Such a host is fine for correctness testing but terrible for performance testing. The way LibSWIFFT deals with this problem is by making the performance tests non-default so they can be ran manually on a modern commodity host. Currently, LibSWIFFT documentation has instructions for performance testing (here) but I guess you meant they should appear in the README or the online docs. I could move these instructions to the README along with a link to the fork's performance testing instructions. What do you think?

henrykironde commented 3 years ago

1) Yes I noticed that the user has not contributed to work for years and not active, so I do not think that there would be any response, but submitting a PR is good. We can test using your branch as the evaluations comes to an end.

2) I have seen those test, I would suggest having something like

Running Performance Tests

Performance tests are executed using a release build:

mkdir -p build/release
cd build/release
cmake -DCMAKE_BUILD_TYPE=Release ../..
make
./test/swifft_catch "[swifftperf]"

Running Performance Tests between LibSWIFFT and K2SN-MSS

We use the branch (PR submitted) https://github.com/gvilitechltd/K2SN-MSS/tree/swifftperf. The binary SWIFFT 16-bit (not general nor 8-bit) function is tested against LibSWIFFT. SWIFFT 16-bit was reported to be the fastest and the only referenced work having the source code.

1) Clone and build the latest LibSWIFFT 2) Clone and build the K2SN-MSS branch

cmake -Bbuild -H. -DCMAKE_BUILD_TYPE=Release
cd build
make
./tester_compare

The test case provides same input to both tools and shows the cpb results.

If we can get the implementation in https://github.com/gvilitechltd/K2SN-MSS/blob/swifftperf/tester.c#L28 and form a test case with LibSWIFFT

What do you think about this?

yarongvili1 commented 3 years ago

@henrykironde: Yes, something like this should work. I'll handle it.

yarongvili1 commented 3 years ago

I created https://github.com/anon1985/K2SN-MSS/pull/3

jarrah42 commented 3 years ago

@yarongvili1:

  • A cross-platform build was not a design goal for v1.0.0 of LibSWIFFT, which could grow in that direction in the future. I agree the API should not be architecture-specific, so this could be an oversight - which APIs are you referring to?

I was referring to the interfaces that use _AVX, _AVX2, etc. which ties the code to a particular hardware feature. If you follow this approach, then any port to a new architecture with different features requires a new set of APIs. Conversely, if my code uses a specific feature set then porting it to a new architecture that doesn't support those features becomes problematic. I think there are better ways to do this that don't tie the code to the hardware (which after all is one of the main reasons for having an interface in the first place.)

  • Could you be more specific about the documentation you're expecting? Perhaps you could point to an existing project with documentation of the kind you'd like to see?

The developer documentation currently consists of brief description of each API. It doesn't provide any information on how to use the API, nor is there any description of the overall design/architecture with enough detail to allow someone to work on the code. On a separate note, I can't see any link from the github repo to the documentation on readthedocs.io which would be useful.

  • Could you elaborate on the expectations regarding citations? Do you mean in the README or in the JOSS paper? Note that the README links to the K2SN-MSS paper and the JOSS paper cites it, and an intro to a topic such as ZKPoK is usually seen in academic cryptography papers and not in GitHub repos.

I mean in the JOSS paper.

yarongvili1 commented 3 years ago

@jarrah42:

I was referring to the interfaces that use _AVX, _AVX2, etc. which ties the code to a particular hardware feature.

Got it. The API is designed this way to support a couple of use cases:

While LibSWIFFT includes microarchitecture-specific code, it is not really tied to any one; it is designed to compile no more than what is supported by the platform on which it is built.

If you follow this approach, then any port to a new architecture with different features requires a new set of APIs. Conversely, if my code uses a specific feature set then porting it to a new architecture that doesn't support those features becomes problematic.

This is true, though not harmful. The regular user would still invoke the main API functions, each of which just invoke the best corresponding microarchitecture-specific API function. Having multiple sets of syntactically- and semantically-similar APIs is a reasonable choice when all have to co-exist, which means they must have different symbols. This doesn't result in a portability issue; as said above, the code is designed to compile no more than what is supported by the platform on which it is built.

I think there are better ways to do this that don't tie the code to the hardware (which after all is one of the main reasons for having an interface in the first place.)

Could you suggest such a way for consideration? Is this a requirement for acceptance?

The developer documentation currently consists of brief description of each API. It doesn't provide any information on how to use the API, nor is there any description of the overall design/architecture with enough detail to allow someone to work on the code.

I guess by brief documentation you mean the doxygen. There is also a C and C++ example code of typical usage in the README. I could add a more elaborate explanation of each API used in these examples. Regarding architecture and design documentation, there is currently doxygen documentation that covers this, though I'd agree this is scattered and it would be useful to gather in one place.

On a separate note, I can't see any link from the github repo to the documentation on readthedocs.io which would be useful.

It is linked from the "docs passing" badge in the README, though that's indeed not intuitive. I'll add a more visible link.

I mean in the JOSS paper.

I'll convert links in the paper to citations.

jarrah42 commented 3 years ago

I think there are better ways to do this that don't tie the code to the hardware (which after all is one of the main reasons for having an interface in the first place.)

Could you suggest such a way for consideration? Is this a requirement for acceptance?

No, I wasn't raising this as a condition of acceptance. Only that it's something you might want to consider. I'll open a GitHub issue for more discussion.

Thanks for addressing the other points.

yarongvili1 commented 3 years ago

@henrykironde and @jarrah42, I believe the issues you raised are now handled. For the performance comparison with K2SN-MSS, I opted for a docker-based solution, for ease of use.

yarongvili1 commented 3 years ago

@danielskatz, what are the next steps?

danielskatz commented 3 years ago

We seem to have lost @gcdeshpande, which is ok, since we really just need two reviewers, but both @jarrah42 and @henrykironde have unchecked boxes. So the question is really to them:

What else is needed from your points of view?

jarrah42 commented 3 years ago

I have updated my checklist.

danielskatz commented 3 years ago

Thanks @jarrah42 - just to confirm, you would be satisfied for this to proceed to acceptance?

jarrah42 commented 3 years ago

Thanks @jarrah42 - just to confirm, you would be satisfied for this to proceed to acceptance?

Yes I'm satisfied.

henrykironde commented 3 years ago

Thanks @yarongvili1 for the patience and having to deal with all the request. @jarrah42, I am satisfied with the library and I recommend acceptance.

yarongvili1 commented 3 years ago

Thanks @jarrah42 and @henrykironde for your helpful reviews, and @danielskatz for editing!

danielskatz commented 3 years ago

Thanks @jarrah42 and @henrykironde!