apple / swift-crypto

Open-source implementation of a substantial portion of the API of Apple CryptoKit suitable for use on Linux platforms.
https://apple.github.io/swift-crypto
Apache License 2.0
1.43k stars 151 forks source link

Reduce the number of heap allocations when computing hash digests #238

Open baarde opened 3 weeks ago

baarde commented 3 weeks ago

This PR reduces the number of heap allocations needed to compute a hash digest from 7 to 1.

Checklist

Motivation:

Currently, computing a hash digest requires 7 heap allocations:

  1. A first DigestContext object is created during init.
  2. Its EVP_MD_CTX struct is allocated by EVP_MD_CTX_new.
  3. md_data is allocated when initializing the context using EVP_DigestInit.
  4. A second DigestContext object is created during finalize
  5. Its EVP_MD_CTX struct is allocated by EVP_MD_CTX_new.
  6. md_data is allocated when copying the context using EVP_MD_CTX_copy.
  7. A temporary array is created to store the result of EVP_DigestFinalize.

This means that for small messages the HashFunction.hash method spends most of its time allocating and deallocating memory.

Generally, users will only compute a few hash digests and aren't really concerned by the performance. But sometimes, it may be necessary to compute a lot of hash digests as fast as possible. Possible use cases include:

Modifications:

Result:

On its own, this change makes hashing about 2x faster and computing HMAC authentication codes about 1.5x faster.

On my machine (MacBook Pro 14-inch 2021) with suggested changes (@exclusivity(unchecked) between parentheses) MD5: 19.1 → 10.5 seconds (9.0 seconds) SHA1: 15.8 → 7.5 seconds (6.2 seconds) SHA256: 16.1 → 7.7 seconds (6.3 seconds) SHA384: 20.4 → 10.6 seconds (9.2 seconds) SHA512: 19.8 → 10.6 seconds (9.1 seconds) HMAC-MD5: 50.4 → 31.0 seconds (28.6 seconds) HMAC-SHA1: 42.9 → 25.1 seconds (22.9 seconds) HMAC-SHA256: 43.5 → 25.7 seconds (23.1 seconds) HMAC-SHA384: 52.0 → 31.4 seconds (29.2 seconds) HMAC-SHA512: 50.5 → 32.0 seconds (29.3 seconds)
Test code ```swift import Crypto import Foundation func testDigest(_ type: H.Type) { let start = Date() let iterations = 50_000_000 var hasher = H() let original = hasher var digest = hasher.finalize() for _ in 1 ..< iterations { hasher = original digest.withUnsafeBytes { bytes in hasher.update(data: bytes) } digest = hasher.finalize() } let duration = -start.timeIntervalSinceNow print("\(H.self): \(String(format: "%.1f", duration)) seconds") } func testHMAC(_ type: H.Type) { let start = Date() let iterations = 50_000_000 let key = SymmetricKey(data: Data()) var hmac = HMAC(key: key) let original = hmac var digest = hmac.finalize() for _ in 1 ..< iterations { hmac = original digest.withUnsafeBytes { bytes in hmac.update(data: bytes) } digest = hmac.finalize() } let duration = -start.timeIntervalSinceNow print("HMAC-\(H.self): \(String(format: "%.1f", duration)) seconds") } testDigest(Crypto.Insecure.MD5.self) testDigest(Crypto.Insecure.SHA1.self) testDigest(Crypto.SHA256.self) testDigest(Crypto.SHA384.self) testDigest(Crypto.SHA512.self) testHMAC(Crypto.Insecure.MD5.self) testHMAC(Crypto.Insecure.SHA1.self) testHMAC(Crypto.SHA256.self) testHMAC(Crypto.SHA384.self) testHMAC(Crypto.SHA512.self) ```
baarde commented 3 weeks ago

@Lukasa Thanks for your feedback. I've applied your suggestions. I agree: it is better.

baarde commented 1 week ago

I've rewritten the code to use the specialized MD5_*/SHA*_* APIs instead of the generic EVP_MD_* APIs.

I've also made sure that context and digest buffers get zeroized after use (EVP_DigestFinal and EVP_MD_CTX_free already did that for context buffers).

One noticeable "hack" is the use of the @exclusivity attribute to get a +15% performance improvement. This attribute was proposed in SE-0176 and implemented in Swift 5.7, but is currently undocumented.

Lukasa commented 1 week ago

@swift-server-bot add to allowlist