digitalbazaar / forge

A native implementation of TLS in Javascript and tools to write crypto-based and network-heavy webapps
https://digitalbazaar.com/
Other
5.03k stars 776 forks source link

Some serious Fortuna implementation issues #520

Open pwnbreak opened 7 years ago

pwnbreak commented 7 years ago

First of all, I have to say that this library has been very useful for me, and I appreciate the great work that you have done to implement all the algorithms. I used it to provide a JS only implementation of some subset of WebCrypto that we use in our company, for a code that was ported to run in a React-Native environment. Since WebCrypto or nodejs crypto APIs are not available, I was concerned about the crypto random number generator, wanted to make sure it was properly seeded, etc.

So I did a quick review of the PRNG implementation (Fortuna) and I found some deviations from the paper (https://www.schneier.com/academic/paperfiles/fortuna.pdf) which I don't understand (as far as I know these are mistakes). I think some are quite obvious, which makes me think that no serious review/audit of the PRNG implementation has been done. I'm not saying that the implemented PRNG is insecure, but more that the intended algorithm (Fortuna) is not implemented properly, and its security relies A LOT on crypto.getRandomValues or the equivalent NodeJS API.

My concerns:

  1. A very concrete one: in function _seedin prng.js, there is the comment: // digest the entropy of other pools whose index k meet the // condition '2^k mod n == 0' where n is the number of reseeds Please, check the paper, it's actually the other way around: 'n mod 2^k == 0'. It's not just the comment, the code is wrong. Note that this also works for bucket k = 0, since n mod 2^0 is always 0, so bucket 0 is always used.

  2. If pool 0 does not have enough collected data, or not enough time since last reseed, there should not be a reseed. But in your implementation you just force a reseed for every call to getBytesSync and if pool 0 does not have the minimum entropy you just get the remaining from the 'seed file', which I believe it's not the intended usage for this. AFAIK the seed file should only be used as the initial seed. In fact, IMHO probably it would be better if this concept was replaced by just an 'initial seed', since there is no actual file here (are you writing to it?). But instead you use this "seed file" as a magical stream of random data, which seems a bit like cheating.

  3. According to the paper reseed should be done like newkey = sha256(sha256(oldkey ||| entropy)), where entropy is the accumulated pools/buckeys entropy. This is good because the security of the new key will never decrease if the entropy is low (or even inexistent), as long as the original seed was good and previous state has not been compromised. Instead (please correct me otherwise), you just ignore the current key and generate the new key based only on the collected entropy (and the magical "seed file", if you don't have enough). This is bad, because if you don't have additional collected entropy (or magical seed file) then the security of the new key (and next generated random numbers) could be much worse than the one you had. Actually, this is what made me read the Fortuna paper, because I could not believe it was specified like this: IMHO additional entropy inputs should never decrease current key security, which means that the PRNG if properly seeded (once) should be safe even if no additional entropy is provided, as long as no adversary gets to know the current PRNG internal state (the additional entropy AFAIK is to recover from that attack, which is probably not that important for the scope of the library).

  4. What you call seed (ctx.seed) is actually the counter in Fortuna paper, right? Why do you change it on every reseed, by making it the hash of the new key? AFAIK in the paper they just keep incrementing it... Probably it's still safe, but why this arbitrary deviation from the paper?

Independently of all of this, I'm not sure about the rationale behind the decision of implementing Fortuna for the PRNG. AFAIK Fortuna does not solve the problem where you don't have good enough initial entropy (seed file). Assuming you have a good initial seed, then AFAIK the additional thing that Fortuna provides over a simpler crypto secure PRNG is that if someone compromises (steals) the current internal state, then eventually the PRNG will recover thanks to the additional streams of entropy, which will make it progressively more difficult for this adversary to guess future output. As I said before, I'm not sure if protecting against these kind of attacks is important for the scope of the library.

Not sure this is the right place for these kind of "reviews", but in any case I hope that these concerns make sense to you. If I missed something, or I'm just plain WRONG, please let me know :)

dlongley commented 7 years ago

@nizkp,

First, thanks so very much for this review! It has been sorely needed.

Regarding item 1, this was clearly a bug. PR #521 has been created to address it.

Regarding item 2, more research is needed into why a reseed happens on each call to generate bytes (which is triggered by clearing the key). I remember this being added intentionally -- based on a cryptographer's input (likely paranoia for protecting old generated random bytes should a key be discovered). We should have included more comments on this part of the implementation. Sadly, this implementation is somewhere around 8 years old at this point and the memory for why is lost. So, we need to explore this more before making a change.

Regarding item 3, a change has been made to incorporate the bytes of the previous key when reseeding.

With respect to the seed file, it is assumed that it provides immediate (and good) entropy when necessary. Fortuna provides "cheap" entropy but, when it can't (like on init), the seed file can provide (typically) more expensive entropy to reseed it so it can carry on. This is an extension point -- and the default implementation uses the native CSPRNG of node.js or the browser.

This is typically used on initialization but can be used later based on the assumptions stated. The only "magic" is simply an assumption that the user (or the default implementation) will provide good random bytes when Fortuna's pools need more entropy. This "seed file" can even be an asynchronous call out to https://random.org. But, perhaps it would be best to only use it for the initial seeding. I can't recall how or why the implementation evolved in the direction it did here, but there were various needs for these extension points over time. There were also various battles between whether avoiding exhausting the PRNG or forcing a reseed more often was a better approach given what that the default implementation now does. It does seem that there is some over conflation of a "seed file" and a backup, expensive source of entropy. So I could get behind the renaming or refactoring of this feature in the future.

Regarding item 4, the deviation from the paper should have been documented at some point. More research should be done into why the change was made in the implementation or we should switch it to match the paper. Maybe this had to do with getting a more truly random looking distribution, though I'd expect it to have to happen outside of reseeding to accomplish that. There have been various recommendations made over the 8 years since this implementation was written -- and they have been poorly documented in the code itself.

Independently of all of this, I'm not sure about the rationale behind the decision of implementing Fortuna for the PRNG. AFAIK Fortuna does not solve the problem where you don't have good enough initial entropy (seed file). Assuming you have a good initial seed, then AFAIK the additional thing that Fortuna provides over a simpler crypto secure PRNG is that if someone compromises (steals) the current internal state, then eventually the PRNG will recover thanks to the additional streams of entropy, which will make it progressively more difficult for this adversary to guess future output. As I said before, I'm not sure if protecting against these kind of attacks is important for the scope of the library.

What alternative would you have suggested for ~2009? As I recall, a few were tried back then and they failed to produce a decent distribution or could not produce an indefinite amount of data (above implementation tweaks notwithstanding). Avoiding side channel attacks, the ability for an attacker to exhaust the PRNG, and entropy estimations were considerations. We don't know how people use the library -- so we try to stick to implementing popular or fairly well vetted solutions to particular problems. Fortuna seemed to fit all this nicely.

acatarineu commented 7 years ago

Thanks for the fast and PR, and sorry for the late reply.

The PR looks good to me for items 1 and 3 (which were the ones concerning me most, especially 3).

I'm not sure what you mean by exhausting the PRNG. AFAIK an adversary that only sees the output of the PRNG would only be able to predict the next numbers if the state of the PRNG has already occurred (a cycle). This is almost impossible to happen given the size of the key.

AFAIK, if Fortuna has a good initial seed, then now with item 3 fixed the only way for an attacker to predict the numbers would be to find out the internal state of the PRNG, which I think can only happen if the attacker has access to the machine/browser/etc where the code is running. In that case, and assuming the attacker had only limited access (once) to the state, Fortuna would eventually recover from that attack by mixing the different entropy sources (that the attacker didn't see) to make the next numbers unpredictable. But if we assume that this attack is infeasible, then even if we do not collect more entropy (or all the collected entropy is just 0, 0, 0, 0...) the PRNG should still be unpredictable (because the initial seed was good).

What I said at the end was a bit like asking whether crypto.getRandomValues, or the equivalent nodejs API would be good enough (instead of Fortuna), as a good security/performance tradeoff. I didn't know the library was 8 years old, and I guess in 2009 (with no crypto.getRandomValues, etc...) a Fortuna js implementation would be more useful, but still I don't see how it would solve the the problem of having a good initial seed: if there was not enough good entropy at the beginning the first random numbers might be insecure. Seeding with Math.random seems to be the best thing when no other source of entropy is available, but since an attacker just needs to observe a couple of Math.random() numbers in the same context to be able to predict all the next ones, it's not hard for a library user to screw up. Of course with different random sources of entropy (mouse, keystrokes, times, Math.random()) Fortuna would eventually reach a safe state (good enough entropy) and be unpredictable from that moment on, but the question is what would happen before reaching that state.

davidlehn commented 7 years ago

It would be good to have some test coverage of that code. I'm not sure how to do so. One test would be to output a bunch of data and feed it into a random number analyzer and compare with other Fortuna results. What else could be tested?

dlongley commented 7 years ago

@alex-cliqz,

I'm pretty busy this week so don't have time to digest and respond to everything in your last comment, but I'll quickly say this:

What I said at the end was a bit like asking whether crypto.getRandomValues, or the equivalent nodejs API would be good enough (instead of Fortuna), as a good security/performance tradeoff...

Using crypto.getRandomValues directly is problematic on a number of browsers -- as their random entropy source gets exhausted quickly and the API starts throwing QuotaExceeded errors. Fortuna gets around this by providing an unlimited source of random bytes and only hitting crypto.getRandomValues when its first entropy pool has insufficient bytes during a reseed.

acatarineu commented 7 years ago

BTW, @alex-cliqz is my company account, it's still the same guy :) (actually I should have written first with this one).

It would be good to have some test coverage of that code. I'm not sure how to do so. One test would be to output a bunch of data and feed it into a random number analyzer and compare with other Fortuna results. What else could be tested?

I would try to find tests to make it fail for item 1 and 3. A simple test that makes 3 fail before PR is:

var random = RANDOM.createInstance();
random.seedFileSync = function(needed) {
  return UTIL.fillString('a', needed);
};

var N = 1000;
var seen = {};
for (var i = 0; i < N; i += 1) {
  var s = random.getBytes(32);
  ASSERT.equal(seen[s], undefined);
  seen[s] = true;
}

For 1 I'm not sure sure, would have to make sure the entropy pools are used correctly, perhaps checking directly the pools[i] size as we request more data, etc.