Closed Slixe closed 3 months ago
With chacha20 v0.9 you need to also provide the chacha20_force_neon
configuration flag described in the docs (note that it's NOT a crate feature). In v0.10 it probably will work as expected without providing this configuration flag (though we haven't implemented it yet).
Mea culpa for missing that! I was using a clone version (so v0.10.0-pre) and expected to work directly.
I just followed the README.md from https://crates.io/crates/chacha20
With chacha20_force_neon
I get the following now:
test chacha12_bench1_16b ... bench: 16.14 ns/iter (+/- 0.73) = 1000 MB/s
test chacha12_bench2_256b ... bench: 111.41 ns/iter (+/- 3.24) = 2306 MB/s
test chacha12_bench3_1kib ... bench: 440.84 ns/iter (+/- 11.52) = 2327 MB/s
test chacha12_bench4_16kib ... bench: 7,054.65 ns/iter (+/- 344.24) = 2322 MB/s
test chacha20_bench1_16b ... bench: 29.74 ns/iter (+/- 1.23) = 551 MB/s
test chacha20_bench2_256b ... bench: 187.54 ns/iter (+/- 4.40) = 1368 MB/s
test chacha20_bench3_1kib ... bench: 746.00 ns/iter (+/- 6.39) = 1372 MB/s
test chacha20_bench4_16kib ... bench: 11,934.89 ns/iter (+/- 130.76) = 1372 MB/s
test chacha8_bench1_16b ... bench: 11.72 ns/iter (+/- 0.46) = 1454 MB/s
test chacha8_bench2_256b ... bench: 73.39 ns/iter (+/- 2.77) = 3506 MB/s
test chacha8_bench3_1kib ... bench: 294.11 ns/iter (+/- 9.41) = 3482 MB/s
test chacha8_bench4_16kib ... bench: 4,701.54 ns/iter (+/- 167.79) = 3485 MB/s
I'm asking maybe too much but, can't the feature be directly detected and enabled ?
Adding compile-time detection should be relatively easy and we should do it for v0.10 (we essentially just need to tweak our cfg
s a bit). This should be sufficient in practice for AArch64 as discussed here.
It was previously nightly
-only but looks like it works on stable now, so yeah, we should be able to enable it unconditionally on aarch64
targets
Great to hear! Well, I have everything I need, thanks a lot for your hard work guys, I close the issue.
Here's a PR to enable it by default: https://github.com/RustCrypto/stream-ciphers/pull/361
Hey there, thanks for this amazing repo!
I'm currently using
chacha20
crate, unfortunately, on arm64 CPU I'm getting poor performances with and without-Ctarget-feature=+neon
.Am I missing something ? Thanks in advance :)