marshallpierce commented 5 years ago

The recent (d224f85) addition of Nonce.kt and the associated code causes rapid exhaustion of the system entropy pool on Linux, which causes effectively unbounded blocking. This makes ktor effectively unusable on Linux.

Test case

Test case at https://bitbucket.org/marshallpierce/ktor-entropy-repro/src/master/ exhibits the issue by vending cookies repeatedly, which consumes nonces.

As an example on Linux, there is currently lots of entropy available (this is about as high as it ever gets). The system has a 4.14 kernel, so well past the entropy improvements in 4.8.

% cat /proc/sys/kernel/random/entropy_avail
3994

In that state, running the test in the above repro shows that it can serve 98 cookies before blocking due to entropy exhaustion.

Background

The problem stems from this:

val secureInstance = SecureRandom.getInstanceStrong()

That uses the providers specified by securerandom.strongAlgorithms property, which on macOS and Linux is this:

jshell> java.security.Security.getProperty( "securerandom.strongAlgorithms" )
$1 ==> "NativePRNGBlocking:SUN,DRBG:SUN"

This string comes from conf/security/java.security in the JDK:

securerandom.strongAlgorithms=NativePRNGBlocking:SUN,DRBG:SUN

NativePRNGBlocking is populated in sun.security.provider.SunEntries:

        if (NativePRNG.Blocking.isAvailable()) {
            map.put("SecureRandom.NativePRNGBlocking",
                "sun.security.provider.NativePRNG$Blocking");
            map.put("SecureRandom.NativePRNGBlocking ThreadSafe", "true");
        }

(The isAvailable() call will succeed if /dev/random exists.) sun.security.provider.NativePRNG$Blocking is pretty much what it sounds like: a PRNG that always reads from /dev/random.

Historically, /dev/random has blocked until the system gathered enough entropy from its entropy sources to populate the requested file read, and /dev/urandom has not blocked. However, as PRNGs have gotten better, that's no longer always the case. On macOS, for instance, neither will block, but on Linux, /dev/random will block, and on FreeBSD both will block only until initial seeding is completed.

The upshot is that with modern PRNGs, it's no longer suitable to require the blocking behavior of /dev/random once a system is booted and its PRNG has been seeded -- once a few hundred bits of state have been acquired, that's enough to provide random numbers effectively forever. See https://www.2uo.de/myths-about-urandom/ or https://tersesystems.com/blog/2015/12/17/the-right-way-to-use-securerandom/ or this thread for more on this.

Potential solutions

Use SecureRandom(), not SecureRandom.getInstanceStrong(). On Linux and macOS, this will result in using this entry:

            map.put("SecureRandom.NativePRNG",
                "sun.security.provider.NativePRNG");

That maps to this singleton in NativePRNG:

    private static final RandomIO INSTANCE = initIO(Variant.MIXED);

and Variant.MIXED is at least better: it sees from /dev/random, but uses /dev/urandom for nextBytes(), etc. On Linux, this lessens the risk of blocking since that will only read 64 bytes every 30s from /dev/random, and on macOS, the behavior is identical to SecureRandom.getInstanceStrong(). On Windows, DRBG will be used as per JEP 273.

Even better, let users configure the SecureRandom instances used. Then, Linux users can use NativePRNGNonBlocking for the "secure instance", and the "weak instance" can use something better than SHA1PRNG.

marshallpierce commented 5 years ago

Also, the operations performed to generate the nonce strings themselves are somewhat curious. Each nonce only has an effective strength of 64 bits, which is too low for many purposes, and the way the strings are generated is rather allocation-heavy, and the large channel size means most of the channel's contents will be outside cache. Also, I'm not really following what the point of mixing the "strong" source of randomness with SHA1PRNG is with an ad-hoc sprinkling of one "strong" byte every four "weak" bytes is. Why not just use one SecureRandom? Modern PRNGs can generate huge amounts of random bytes cheaply... That's all beside the point I guess, but it just looked odd to me looking over the code. Oh, and also System.currentTimeMillis() is probably not a good way to do the sort of timekeeping that the reseed logic does. It's fairly expensive to call, and it tracks "wall clock" time, not monotonic time, so it can go backwards as NTP slews the clock around. System.nanoTime() is a better way to measure elapsed time.

cy6erGn0m commented 5 years ago

The reason why we use SHA1PRNG is that it is relatively fast and doesn't require system calls while reading from a device is expensive. On the other side both PRNGSs are pseudo-secure. So mixing "secure" and "insecure" bytes is a concession to produce more bytes.

Regarding currentTimeMillis: it is faster that nanoTime() and backward-forward jumps is not a big problem. I guess your concern is backward leap. In this case nanoTime will simply freeze for long so there is no big difference. Perhaps we can check abs(delta) to handle negatives as well.

Also note that we do time checks only sometimes so for sure it's not a performance issue. Putting values to a channel could be even more expensive as we do it for every nonce.

marshallpierce commented 5 years ago

The reason why we use SHA1PRNG is that it is relatively fast and doesn't require system calls while reading from a device is expensive.

But the current logic calls secureInstance.nextBytes(secureBytes) every iteration so you're paying the system call overhead no matter what. At that point, why have a weaker PRNG? If you want tp prioritize speed, something like http://antirez.com/news/99 (you could use a ludicrous amount of state, like 256 bytes, with 8 bytes reserved for a long counter, and SHA512 for more vague safety margin) would require many fewer syscalls, and if you want to prioritize security, I'm not sure what mixing in SHA1PRNG accomplishes... if secureInstance is compromised, SHA1PRNG isn't going to help, and if it's not compromised, you don't need SHA1PRNG. Anyway, I don't mean to turn this into a referendum on algorithm design because the bigger problem is that it doesn't work on Linux, but this ad-hoc mixing is uncomfortably novel to me. Did a cryptographer review this design? Security properties don't compose like most other things: it's perilously easy to take two secure concepts and mix them together in an insecure way.

On the other side both PRNGSs are pseudo-secure. So mixing "secure" and "insecure" bytes is a concession to produce more bytes.

A PRNG is not "pseudo secure", it is "pseudo random" in that it is not hooked up to a quantum effect or other source of "true randomness". The PRNGs used by the OS for /dev/urandom, etc, are "CSPRNG"s: cryptographically secure pseudo random number generators.

Regarding currentTimeMillis: it is faster that nanoTime() and backward-forward jumps is not a big problem. I guess your concern is backward leap. In this case nanoTime will simply freeze for long so there is no big difference. Perhaps we can check abs(delta) to handle negatives as well.

I was curious about this, so I made a simple JMH benchmark, and while performance is similar for both, nanoTime is if anything both slightly faster and with lower variance. YMMV, of course. Anyway, I'm not sure what you mean about nanoTime; it shouldn't freeze when a DST transition occurs, as it measures relative elapsed time.

Also note that we do time checks only sometimes so for sure it's not a performance issue. Putting values to a channel could be even more expensive as we do it for every nonce.

Agreed, the timestamps are of marginal interest. It would probably be fine to simply reseed every million nonces or something like that.

cy6erGn0m commented 5 years ago

Workarounded in 1.0.0-beta-3 by switching to /dev/urandom for now

ktorio / ktor

Nonce generation consumes system entropy too aggressively #675

Test case

Background

Potential solutions