Closed nph4rd closed 3 years ago
Nice work Arturo! And great if we can simplify the crate and the compilation options at the same time.
Please, do you have a benchmark that shows that the aes
crate is indeed as fast as the aesni
crate when the system supports AES-NI? Especially after preparing the package with Maturin.
(I think that we have a pending branch/issue for some automated benchmarks with Criterion but I never finished it)
Currently we only have what's in sycret/benches/
. The output is as follows:
With the changes:
Eq keygen time: [8.8554 us 8.8628 us 8.8739 us]
change: [-1.5175% -0.5167% +0.3938%] (p = 0.31 > 0.05)
No change in performance detected.
Found 8 outliers among 100 measurements (8.00%)
2 (2.00%) high mild
6 (6.00%) high severe
Without the changes:
Eq keygen time: [6.1032 us 6.1108 us 6.1208 us]
change: [-31.143% -30.960% -30.808%] (p = 0.00 < 0.05)
Performance has improved.
Found 5 outliers among 100 measurements (5.00%)
1 (1.00%) low mild
2 (2.00%) high mild
2 (2.00%) high severe
So if I understand correctly, there is a slight decrease in performance? I do think it would be nice to have a more robust benchmark for comparison though.
Btw... Ran these with:
Architecture: x86_64
CPU op-mode(s): 32-bit, 64-bit
Byte Order: Little Endian
Address sizes: 39 bits physical, 48 bits virtual
CPU(s): 8
On-line CPU(s) list: 0-7
Thread(s) per core: 2
Core(s) per socket: 4
Socket(s): 1
NUMA node(s): 1
Vendor ID: GenuineIntel
CPU family: 6
Model: 142
Model name: Intel(R) Core(TM) i7-8565U CPU @ 1.80GHz
Stepping: 12
CPU MHz: 800.007
CPU max MHz: 4600.0000
CPU min MHz: 400.0000
BogoMIPS: 3984.00
Virtualization: VT-x
L1d cache: 32K
L1i cache: 32K
L2 cache: 256K
L3 cache: 8192K
NUMA node0 CPU(s): 0-7
Indeed, thanks for the results. I'll try to add more benches to see if the performance drop is acceptable (it's a good time to do that, since we need benchmarks anyway). Let's keep the PR open for now!
Ok! Agreed :+1:
Force-pushed to rebase with master
and trigger benchmarks.
Thank you, Arturo! I double-checked the benchmarks on my laptop too, and indeed there is no significant performance hit. I'm merging this, thanks for the PR and for the issues discovered and solved on the way :)
Description
Since the
aesni
crate restricts the supported backends, this was causing some problems related to the usage ofSyMPC
onaarch64
architectures (See https://github.com/OpenMined/SyMPC/issues/286). Replacing this crate withaes
will help because, by default, "it performs runtime detection of CPU intrinsics and uses them if they are available".Also, I removed the flags in the Cargo config:
--emit=asm
because it was causing problems with Clippy, whilecargo bench
can be run just fine without it.-Ctarget-feature=+aes,+ssse3
because otherwise we would be forcing AES-NI as mentioned here.Closes #24
Affected Dependencies
aesni -> aes
How has this been tested?
Checklist