Closed: brookman closed this issue 4 months ago
The chacha20 and poly1305 crates have AVX2-optimized implementations on Intel. On ARM64 we use a portable software implementation instead, so these crates have no SIMD support there other than whatever LLVM's auto-vectorization might be doing.
The aes-gcm crate, on the other hand, supports ARMv8 hardware acceleration for AES and GCM (i.e. PMULL) and will get you better results on M1.
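To see which of the CPU features mentioned above a given machine actually reports, the standard library's runtime feature detection macros can be used. The following is only a minimal sketch: the function name `detected_crypto_features` is made up for illustration, and it says nothing about which backend the RustCrypto crates end up selecting.

```rust
/// Returns the names of the CPU features discussed in this thread that are
/// detected at runtime on the current machine.
/// (Hypothetical helper, not part of any RustCrypto crate.)
fn detected_crypto_features() -> Vec<&'static str> {
    // Stays empty on architectures other than aarch64 and x86_64.
    #[allow(unused_mut)]
    let mut found: Vec<&'static str> = Vec::new();

    #[cfg(target_arch = "aarch64")]
    {
        // NEON is the SIMD extension; AES and PMULL are the ARMv8
        // crypto instructions relevant to AES-GCM.
        if std::arch::is_aarch64_feature_detected!("neon") {
            found.push("neon");
        }
        if std::arch::is_aarch64_feature_detected!("aes") {
            found.push("aes");
        }
        if std::arch::is_aarch64_feature_detected!("pmull") {
            found.push("pmull");
        }
    }

    #[cfg(target_arch = "x86_64")]
    {
        // AVX2 is what the Intel-optimized implementations rely on.
        if std::arch::is_x86_feature_detected!("avx2") {
            found.push("avx2");
        }
        if std::arch::is_x86_feature_detected!("aes") {
            found.push("aes");
        }
        if std::arch::is_x86_feature_detected!("pclmulqdq") {
            found.push("pclmulqdq");
        }
    }

    found
}
```

Printing the result with `println!("{:?}", detected_crypto_features());` on both machines gives a quick picture of what hardware support is available before comparing crate performance.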
There is a NEON (aarch64) backend in the underlying RustCrypto/stream-ciphers chacha20 crate, which is supported on modern Apple ARM processors, but it must be enabled explicitly. See chacha20/README.md and chacha20/src/backends.rs#L18.
Thank you for the hint, @tripplet! I'm trying to enable it by adding the following to my .cargo/config.toml file:
[target.aarch64-linux-android]
rustflags = ["-C", "target-feature=+neon", "--cfg", "chacha20_force_neon"]
[target.aarch64-apple-ios]
rustflags = ["-C", "target-feature=+neon", "--cfg", "chacha20_force_neon"]
[target.aarch64-apple-darwin]
rustflags = ["-C", "target-feature=+neon", "--cfg", "chacha20_force_neon"]
Am I doing this correctly?
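One way to check whether the flags are being picked up: rustflags set per target in .cargo/config.toml are passed to every crate in the build, so your own code can test for the same cfg. The sketch below assumes exactly that; the function names are hypothetical and this is not how the chacha20 crate itself is structured.

```rust
/// True if this crate was compiled with `--cfg chacha20_force_neon`.
/// (Hypothetical check in your own crate, not an API of the chacha20 crate.)
#[allow(unexpected_cfgs)] // the cfg name is unknown to cargo's check-cfg lint
fn neon_cfg_enabled() -> bool {
    cfg!(chacha20_force_neon)
}

/// True if the `neon` target feature is enabled at compile time.
fn neon_target_feature_enabled() -> bool {
    cfg!(target_feature = "neon")
}
```

Calling both from a small binary built for each target shows whether the `--cfg` flag and the target feature reached the compiler. If `neon_target_feature_enabled()` already returns true without `-C target-feature=+neon`, that part of the rustflags is redundant for the target in question.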
I'm not experienced in cross-compilation; I only used rustflags = ["--cfg", "chacha20_force_neon"] to compile on Mac for Mac, but it looks good to me.
@tripplet it works. Thank you very much! 😊
Hi. Thanks for providing the awesome crypto crates! 🦀 I was experimenting with the ChaCha20Poly1305 crate and found it to run 4x slower on a Mac with an M1 CPU than on an older Intel i7 Mac. Now I'm wondering if this is a general issue with Rust (it seems that the arm64/AArch64-Apple-Darwin target is not tier 1 yet) or if there are specific optimizations in the RustCrypto crates for Intel. Or maybe I'm missing a feature flag? Could somebody give me a hint here?
Thanks in advance!
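For anyone wanting to reproduce a comparison like this, a rough throughput measurement can be sketched with the standard library alone. This is an illustrative harness, not the benchmark used above: `throughput_mib_per_s` is a made-up helper, and the closure passed to it is a placeholder for the real in-place encryption call.

```rust
use std::time::Instant;

/// Runs `op` over a buffer of `buf_len` bytes `iters` times and returns
/// the observed throughput in MiB/s. (Hypothetical helper for illustration.)
fn throughput_mib_per_s<F: FnMut(&mut [u8])>(mut op: F, buf_len: usize, iters: u32) -> f64 {
    let mut buf = vec![0u8; buf_len];
    let start = Instant::now();
    for _ in 0..iters {
        op(&mut buf);
        // Keep the optimizer from discarding the work done on `buf`.
        std::hint::black_box(&buf);
    }
    let secs = start.elapsed().as_secs_f64();
    let mib = (buf_len as f64 * iters as f64) / (1024.0 * 1024.0);
    mib / secs
}
```

The measurement is only meaningful for a release build (`cargo run --release`) with the same buffer size and iteration count on both machines.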