RustCrypto / asm-hashes

Assembly implementations of cryptographic hash functions
46 stars 26 forks source link

Add AArch64 sha1 and sha2 extension support #10

Closed linkmauve closed 4 years ago

linkmauve commented 4 years ago

Here are some benchmarks on my Nintendo Switch:

Without this patch (or without --features=asm):

test sha1_10      ... bench:          66 ns/iter (+/- 0) = 151 MB/s
test sha1_100     ... bench:         497 ns/iter (+/- 1) = 201 MB/s
test sha1_1000    ... bench:       4,799 ns/iter (+/- 16) = 208 MB/s
test sha1_10000   ... bench:      47,646 ns/iter (+/- 134) = 209 MB/s
test sha256_10    ... bench:          98 ns/iter (+/- 0) = 102 MB/s
test sha256_100   ... bench:         815 ns/iter (+/- 4) = 122 MB/s
test sha256_1000  ... bench:       7,980 ns/iter (+/- 76) = 125 MB/s
test sha256_10000 ... bench:      79,373 ns/iter (+/- 793) = 125 MB/s

With this patch:

test sha1_10      ... bench:          32 ns/iter (+/- 1) = 312 MB/s
test sha1_100     ... bench:         165 ns/iter (+/- 1) = 606 MB/s
test sha1_1000    ... bench:       1,480 ns/iter (+/- 5) = 675 MB/s
test sha1_10000   ... bench:      14,246 ns/iter (+/- 26) = 701 MB/s
test sha256_10    ... bench:          30 ns/iter (+/- 0) = 333 MB/s
test sha256_100   ... bench:         167 ns/iter (+/- 0) = 598 MB/s
test sha256_1000  ... bench:       1,554 ns/iter (+/- 7) = 643 MB/s
test sha256_10000 ... bench:      15,231 ns/iter (+/- 88) = 656 MB/s

Only the SHA-256 function is implemented, my console doesn’t support the sha512 extension.

There are probably scheduling fixes to be made, I haven’t paid any attention to latency.

I have also compared to OpenSSL, this implementation is up to 20% slower, probably because it does a function call every 64 bytes, instead of a tight loop.

tarcieri commented 4 years ago

Curious if you could wire up Travis's aarch64 support so we could CI this

linkmauve commented 4 years ago

Do you know which flags their CPUs have? I’ll have a look at doing this CI thing after I’m done with sha256.

tarcieri commented 4 years ago

I've never actually used it, so I don't offhand

tarcieri commented 4 years ago

@linkmauve doesn't look like a224dd2 did the trick... perhaps move the arch key to the toplevel?

tarcieri commented 4 years ago

@linkmauve ok, unrelated build failure this time but at least it attempted an ARM build.

That said, it seems like you'll probably need to go back to explicit build matrix entries for arm64, but run tests specifically for sha1 and sha2. Something like:

cargo test --package sha1 --package sha2
tarcieri commented 4 years ago

Whoops, my bad, --package sha-1

linkmauve commented 4 years ago

Hmm, it seems like that’s not the correct way to do it. And there is only one build now instead of three.

tarcieri commented 4 years ago

Whoops sorry, those are still the wrong package names. Try:

cargo test --package sha1-asm --package sha2-asm

That passes locally for me.

linkmauve commented 4 years ago

How do you test locally btw?

tarcieri commented 4 years ago

I was just making sure the command worked. It didn't actually test on ARM (and it seems there aren't specifically tests for these crates, which is unfortunate)

I can try to add them retroactively if we can get the build working.

linkmauve commented 4 years ago

Yay, it works!

tarcieri commented 4 years ago

Nice! Let me do another pass on this and I can merge it.

Not sure if I have access to release the associated crates though, unfortunately.

tarcieri commented 4 years ago

My main worry with merging this as-is is the lack of test vectors. You can probably reuse the ones from the sha-1 and sha2 crates.

linkmauve commented 4 years ago

I’ve already tested it with those, it was how I found most of the bugs while writing this code. ^^

See https://github.com/RustCrypto/hashes/pull/96

tarcieri commented 4 years ago

That's good, but it'd also be good if they ran in CI too.

linkmauve commented 4 years ago

This would add a test dependency on digest, do you really want that? Or maybe I could just copy the implementations, but meh.

tarcieri commented 4 years ago

This would add a test dependency on digest, do you really want that?

I can merge this as/is and we can circle back on that.