mirage / mirage-crypto

Cryptographic primitives for OCaml, in OCaml (also used in MirageOS)
ISC License
77 stars 43 forks source link

P-521 sign not constant-time #228

Closed ansiwen closed 4 months ago

ansiwen commented 4 months ago

While doing performance measurements I noticed, that P-521 signatures are not constant-time, which I assume they should be. P-256 is not affected. I see the same behavior with and without the recent performance improvements for elliptic curves. However, I'm not using the latest mirage-crypto, but 0.10.7. Here the result of a simple test script, always signing the same message but generating a new key every two measurements:

Generating new key
requests:  1402 ( 4 conn) time: 4s rps:  350.56 latency: 10.78ms/11.38ms/19.22ms
requests:  1347 ( 4 conn) time: 4s rps:  336.88 latency: 10.79ms/11.84ms/29.74ms
Generating new key
requests:  1583 ( 4 conn) time: 4s rps:  395.94 latency: 8.77ms/10.07ms/16.48ms
requests:  1591 ( 4 conn) time: 4s rps:  397.87 latency: 8.42ms/10.02ms/17.3ms
Generating new key
requests:  1658 ( 4 conn) time: 4s rps:  414.71 latency: 9.02ms/9.62ms/18.06ms
requests:  1667 ( 4 conn) time: 4s rps:  416.97 latency: 8.95ms/9.56ms/19.09ms
Generating new key
requests:   956 ( 4 conn) time: 4s rps:  239.14 latency: 10.52ms/16.68ms/23.01ms
requests:   951 ( 4 conn) time: 4s rps:  237.89 latency: 16.05ms/16.76ms/30.08ms
Generating new key
requests:  1150 ( 4 conn) time: 4s rps:  287.71 latency: 12.86ms/13.86ms/34.07ms
requests:  1192 ( 4 conn) time: 4s rps:  298.21 latency: 9.87ms/13.37ms/20.07ms
Generating new key
requests:  1240 ( 4 conn) time: 4s rps:  310.17 latency: 11.95ms/12.86ms/30.99ms
requests:  1276 ( 4 conn) time: 4s rps:  319.10 latency: 11.97ms/12.5ms/31.4ms
Generating new key
requests:   695 ( 4 conn) time: 4s rps:  173.82 latency: 12.77ms/22.94ms/29.86ms
requests:   703 ( 4 conn) time: 4s rps:  175.90 latency: 11.16ms/22.67ms/32.57ms
Generating new key
requests:  1261 ( 4 conn) time: 4s rps:  315.45 latency: 10.32ms/12.64ms/29.33ms
requests:  1256 ( 4 conn) time: 4s rps:  314.01 latency: 11.76ms/12.7ms/21.3ms

@Firobe could reproduce the effect with a modified mirage-crypto benchmark:

* [ecdsa-timing-key1]
    :  566.780 ops per second (5712 iters in 10.078)

* [ecdsa-timing-key2]
    :  480.910 ops per second (4830 iters in 10.043)

* [ecdsa-timing-key3]
    :  338.418 ops per second (3396 iters in 10.035)

* [ecdsa-timing-key4]
    :  681.658 ops per second (6811 iters in 9.992)

* [ecdsa-timing-key5]
    :  554.708 ops per second (5564 iters in 10.031)
Firobe commented 4 months ago

I believe the non-constant part is entirely due to the computation of k when not provided, which is mentioned by the documentation of sign as a risk.

On master, on a constant message and 5 random keys for P521, with k pre-computed with the same method as when not provided:

ops/s Verify Sign with pre-computed k Sign without k
Key 1 175.7 1189.9 573.6
Key 2 175.8 1188.1 487.1
Key 3 175.4 1190.3 344.3
Key 4 174.7 1186.4 695.0
Key 5 173.8 1179.7 563.0
Std. dev. 0.74 3.84 115.33

The computation of k seems to take a huge part of the total spent time (compared to other curves, see below), though I don't have an intuition yet for why . I'll note that the low-level primitives for P521 are generated from a different fiat-crypto algorithm (unsaturated solinas) than all other P-curves (word-by-word-montgomery). Maybe it's also worth to look into Digestif.SHA512.

For comparison, here are the same measurements on P256 (excluding verify):

ops/s Sign with pre-computed k Sign without k
Key 1 8586.3 8388.4
Key 2 8593.2 8467.0
Key 3 8612.1 8377.6
Key 4 8604.4 8363.0
Key 5 8626.3 8392.4
Std. dev. 14.13 36.13
hannesm commented 4 months ago

I'll note that the low-level primitives for P521 are generated from a different fiat-crypto algorithm (unsaturated solinas) than all other P-curves (word-by-word-montgomery).

That is not the case, all of them are using word-by-word-montgomery. The unsaturated salinas is used by ed25519.

Maybe worth to measure how often do_sign is called (which computes k and is recursive)? Maybe worth to investigate #105 - maybe the K_gen gen () function (as well recursive) is called too often?

@ansiwen for what it is worth, please note that old mirage-crypto releases are unsupported. If you like to report an issue in the future, please test with the latest release.

Firobe commented 4 months ago

That is not the case, all of them are using word-by-word-montgomery. The unsaturated salinas is used by ed25519.

Ah my bad, I looked into fiat-crypto directly assuming we'd have reused these.

Maybe worth to measure how often do_sign is called (which computes k and is recursive)?

It will be useful to measure it just in case, but it probably is called once in all cases, or else it would fail when I pre-compute and pass k (since again stops after the first iteration if k was passed).

Maybe worth to investigate https://github.com/mirage/mirage-crypto/issues/105 - maybe the K_gen gen () function (as well recursive) is called too often?

I'll try to continue to investigate, thanks for the ideas :)

ansiwen commented 4 months ago

@ansiwen for what it is worth, please note that old mirage-crypto releases are unsupported. If you like to report an issue in the future, please test with the latest release.

We did. That's why I added the reproduction of @Firobe, which was on master.

Firobe commented 4 months ago

@hannesm Your idea was spot on :) See #230

With this, here's the distribution over the same sample of keys

ops/s Sign without k
Key 1 1161
Key 2 1170
Key 3 1162
Key 4 1164
Key 5 1159
Std. dev. 6.05
ansiwen commented 4 months ago

@Firobe Well done! Great work! Just for my understanding: the k is calculated deterministically from the key, or why was it constant over identical keys?

Firobe commented 4 months ago

@ansiwen In classical ECDSA, k is chosen randomly from a good, uniform, unbiased random source. These characteristics are not always possible to have, and hard to test, so the RFC proposes a process to compute k deterministically from the key and the message instead (which is supposed to be indistiguishable from random from the PoV of an attacker).

This does mean the k is the same for each similar message, so it can be used to recover the key if there are other vulnerabilities (https://www.hertzbleed.com/2h2b.pdf). In practice to be truly secure, a k should be generated randomly if a good random source is available, and passed explicitly.