Open randombit opened 6 years ago
Hi Jack, thank you so much for your detailed look into the implementation. That is exactly what I was hoping for when I open sourced SwiftTLS. I am aware of the possibility of timing attacks in the abstract, but I did not dare to implement mitigations until someone weighed in with some experience doing so.
How would you fix the modular_pow implementation? I was thinking of something like this (or is there a way that does not make all calculations take the maximum time possible?):
for i in 0..<numBits
{
// Do this calculation even when we don't need the result to prevent timing attacks
let tmp = (result * r) % mod
if (exponent.isBitSet(i)) {
result = tmp
}
r = (r * r) % mod
}
Is there a better way to implement the modular reduction?
I will give the other two issues a try.
Thanks again, this helps a lot!
What you propose for modular exponentiation mostly works. This technique is sometimes called "square-and-always-multiply". It hides the timing channel, but someone doing a cache based side channel can still detect exponent bits. The basic intuition is, an attacker uses an instruction like clflush
to flush the bit of code representing "result = tmp" from memory. Then they sleep a short while, and then try to access that cache line. If the access is slow, then they know it was uncached, ie that the signing process did not access that bit of code in the preceding period of time. So they can conclude the exponent bit was 0. If the access was fast, they know it was a 1. There are many different variations on this attack for example http://palms.ee.princeton.edu/system/files/SP_vfinal.pdf and https://eprint.iacr.org/2013/448.pdf
The only general fix to these is to avoid any conditional branches or memory lookups which depend on secret information. Two common ways to do this are a Montgomery ladder, or fixed window exponentiation using a const-time table lookup. Slide 18 of https://cryptojedi.org/peter/data/shmoocon-20150118.pdf has an example of such a table lookup.
Is there a better way to implement the modular reduction?
Yes! Look into Montgomery reduction. There are many papers on making Montgomery reduction fast and constant-time. http://www.people.vcu.edu/~jwang3/CMSC691/j34monex.pdf starts with a good intro to the basic idea.
or is there a way that does not make all calculations take the maximum time possible
In general no, because if there is some circumstance where the calculations take less time than the maximum, then that provides a detectable side channel. The exception is when the information being used to is already public. For instance during RSA verification, there is no secret involved, so therefore no problem with using variable-time algorithms.
I have split this issue into two, #10 and #11.
Apologies if these are already known to you, but I didn't see anything in the docs or comments about this.
In (https://github.com/nsc/SwiftTLS/blob/f5010aa3b0901ab3f5f26db322ac8652e64f0c5c/SwiftTLS/Sources/TLS/math.swift#L15) you implement modular exponentiation using square-and-multiply. However this leaks the bits of the exponent (which in RSA and DH are secrets) to a timing or side channel attack. Also the reductions are implemented using Knuth's Algorithm D, which probably leak information about the inputs.
In ECDSA signature https://github.com/nsc/SwiftTLS/blob/master/SwiftTLS/Sources/Crypto/ECDSA.swift#L72 you must invert the k nonce modulo the group order during ECDSA signature. This is done using extended Euclidean algorithm (https://github.com/nsc/SwiftTLS/blob/f5010aa3b0901ab3f5f26db322ac8652e64f0c5c/SwiftTLS/Sources/TLS/math.swift#L73) which is known to leak information due to input dependent branches. The most common solution to this is to use Fermat's little theorem to compute the inverse (for any prime p and any x < p, x^{p-2} mod p == x^-1 mod p)
Decoding for PKCS1v1.5 ciphertexts exits early if the first bytes of padding are incorrect. https://github.com/nsc/SwiftTLS/blob/f5010aa3b0901ab3f5f26db322ac8652e64f0c5c/SwiftTLS/Sources/Crypto/RSA-PKCS1.swift#L100 which creates a padding oracle.
Similarly, when the PKCS1v1.5 ciphertext is decrypted, the server returns an error immediately https://github.com/nsc/SwiftTLS/blob/f5010aa3b0901ab3f5f26db322ac8652e64f0c5c/SwiftTLS/Sources/TLS/Protocol/TLS%201.2/TLS1_2.ServerProtocol.swift#L285 which creates a padding oracle that doesn't even require a side channel to execute since an attacker knows the PKCSv1.5 ciphertext padding was valid iff the server does not abort the connection. The correct way to handle this is to first generate a random 48 byte premaster, then RSA decrypt, then if the RSA ciphertext was invalid carry on using the randomized premaster. Then there is no oracle for an attacker to use.
HTH