For item 1, I'd say the purpose is comparing algorithms, not tracking algorithm evolution. In fact, many of the changes observed in our performance data over time are really consequences of us improving infrastructure, shared code, build options, etc. It's useful for us to know that our changes are making things better, but not scientifically interesting.
For item 2, I'd lean towards "over a distribution of messages", but it would indeed be interesting to highlight algorithms whose runtime has an atypical dependency on the message. It might be worth checking SUPERCOP's approach to message generation for signature benchmarking.
SUPERCOP combines several variables in its signature benchmark: (1) message lengths up to 100000 bytes, (2) a distinct message each time (randomly pre-generated for each iteration), (3) a distinct key pair for each iteration. See https://github.com/jedisct1/supercop/blob/master/crypto_sign/measure.c
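For concreteness, here is a minimal sketch of what such a SUPERCOP-style inner loop could look like on top of the liboqs `OQS_SIG` API. The algorithm, iteration count, and message-length schedule are arbitrary illustration choices; timing and error handling are omitted.

```c
/* Minimal sketch of a SUPERCOP-style signing loop on top of the liboqs
 * OQS_SIG API. Algorithm, iteration count, and the message-length schedule
 * are arbitrary illustration choices; timing and error handling omitted. */
#include <stdint.h>
#include <stdlib.h>
#include <oqs/oqs.h>

#define ITERATIONS 64
#define MAX_MSG_LEN 100000 /* SUPERCOP uses message lengths up to 100000 bytes */

int main(void) {
	OQS_SIG *sig = OQS_SIG_new(OQS_SIG_alg_dilithium_2); /* algorithm chosen arbitrarily */
	if (sig == NULL) {
		return EXIT_FAILURE;
	}

	uint8_t *public_key = malloc(sig->length_public_key);
	uint8_t *secret_key = malloc(sig->length_secret_key);
	uint8_t *signature = malloc(sig->length_signature);
	uint8_t *message = malloc(MAX_MSG_LEN);
	size_t signature_len = 0;

	for (size_t i = 0; i < ITERATIONS; i++) {
		/* (3) a distinct key pair for each iteration */
		OQS_SIG_keypair(sig, public_key, secret_key);

		/* (1)+(2) a distinct random message of varying length each iteration;
		 * SUPERCOP pre-generates these outside the timed region */
		size_t message_len = (i * MAX_MSG_LEN) / ITERATIONS + 1;
		OQS_randombytes(message, message_len);

		/* this is the call a timing harness would wrap */
		OQS_SIG_sign(sig, signature, &signature_len, message, message_len, secret_key);
	}

	free(public_key);
	free(secret_key);
	free(signature);
	free(message);
	OQS_SIG_free(sig);
	return EXIT_SUCCESS;
}
```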
We could use the same approach as SUPERCOP by default. For special additional tests, we could add command-line options to speed_sig that allow fixing certain parameters.
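For illustration, a sketch of what such options could look like; the flag names `--msg-len` and `--fixed-keypair` are hypothetical and do not exist in speed_sig today.

```c
/* Hypothetical extra options for speed_sig; the flag names and defaults
 * below are illustrative only and are not existing liboqs options. */
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

static size_t opt_msg_len = 0;    /* 0 = SUPERCOP-style varying message lengths */
static int opt_fixed_keypair = 0; /* 0 = fresh key pair per iteration */

static void parse_extra_options(int argc, char **argv) {
	for (int i = 1; i < argc; i++) {
		if (strcmp(argv[i], "--msg-len") == 0 && i + 1 < argc) {
			opt_msg_len = (size_t) strtoul(argv[++i], NULL, 10);
		} else if (strcmp(argv[i], "--fixed-keypair") == 0) {
			opt_fixed_keypair = 1;
		}
	}
}

int main(int argc, char **argv) {
	parse_extra_options(argc, argv);
	printf("msg_len=%zu, fixed_keypair=%d\n", opt_msg_len, opt_fixed_keypair);
	return 0;
}
```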
Silly question: What is missing in the SUPERCOP benchmarks? Or, asked the other way around: How can we avoid replicating what's already been done? Do you have a link where they visualize their results? What I have found so far is worse in terms of visualization than oqs-profiling's surely less-than-great visuals -- there must be something better for SUPERCOP. Also: Is there a way to compare across algorithms at a glance?
After this long silence, I'm inclined to close this issue with the current code base, not having heard what (else) we should do differently. I think we fulfil the purpose stated by @dstebila. The lack of response to my last question above further seems to indicate that our visuals/comparison capabilities are better than SUPERCOP's, so I added a comment about that here. Please re-open if you see concrete things to change/amend.
Following up on discussions in https://github.com/open-quantum-safe/liboqs/pull/928 :
My personal initial attempt to answer:
1. Algorithm evolution may not be as interesting as a comparison across algorithms (and their variants).
2. Runtime variations for any algorithm (or algorithm variant) with any kind of dependency (on the message, for example) would be interesting to highlight (possibly as a new set of tests and visualizations; see the sketch below).
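To make the second point concrete, here is a rough sketch (not existing liboqs code) of one possible shape for such a dependency test: time signing over a grid of message lengths so that algorithms with an atypical message-length dependence stand out. The algorithm, lengths, run count, and wall-clock timing below are arbitrary illustration choices; a real test would hook into speed_sig's existing timing harness and feed the results into our visualization pipeline.

```c
/* Sketch of a message-length-dependency test: report mean signing time
 * per message length for one algorithm. Algorithm, lengths, and run count
 * are arbitrary; error handling omitted for brevity. */
#include <stdio.h>
#include <stdlib.h>
#include <time.h>
#include <oqs/oqs.h>

static double now_ms(void) {
	struct timespec ts;
	clock_gettime(CLOCK_MONOTONIC, &ts);
	return ts.tv_sec * 1e3 + ts.tv_nsec / 1e6;
}

int main(void) {
	const size_t lengths[] = {32, 1024, 16384, 100000};
	const size_t runs = 100;

	OQS_SIG *sig = OQS_SIG_new(OQS_SIG_alg_dilithium_2); /* algorithm chosen arbitrarily */
	if (sig == NULL) {
		return EXIT_FAILURE;
	}

	uint8_t *pk = malloc(sig->length_public_key);
	uint8_t *sk = malloc(sig->length_secret_key);
	uint8_t *sm = malloc(sig->length_signature);
	uint8_t *msg = malloc(100000);
	size_t smlen = 0;

	OQS_SIG_keypair(sig, pk, sk);

	for (size_t l = 0; l < sizeof(lengths) / sizeof(lengths[0]); l++) {
		OQS_randombytes(msg, lengths[l]);
		double start = now_ms();
		for (size_t r = 0; r < runs; r++) {
			OQS_SIG_sign(sig, sm, &smlen, msg, lengths[l], sk);
		}
		printf("%s, msg_len=%zu: %.3f ms/sign\n",
		       sig->method_name, lengths[l], (now_ms() - start) / runs);
	}

	free(pk);
	free(sk);
	free(sm);
	free(msg);
	OQS_SIG_free(sig);
	return EXIT_SUCCESS;
}
```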