nodejs / node

Node.js JavaScript runtime ✨🐢🚀✨
https://nodejs.org

Use system random generator in crypto #5798

Closed speakeasypuncture closed 6 years ago

speakeasypuncture commented 8 years ago

randomBytes uses OpenSSL as its random number generator. It would be wiser and less error-prone to use a system RNG: /dev/urandom on Unix platforms, the getrandom syscall on Linux, and CryptGenRandom on Windows.
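A minimal sketch of the difference being proposed, assuming a Unix-like system with /dev/urandom (the Windows CryptGenRandom path is not shown; illustration only, not a proposed patch):

'use strict';
const crypto = require('crypto');
const fs = require('fs');

// Today: bytes come from OpenSSL's userspace CSPRNG (RAND_bytes under the hood).
const fromOpenSSL = crypto.randomBytes(32);

// What the proposal asks for: bytes taken straight from the kernel CSPRNG.
const fromKernel = Buffer.alloc(32);
const fd = fs.openSync('/dev/urandom', 'r');
fs.readSync(fd, fromKernel, 0, fromKernel.length, null);
fs.closeSync(fd);

console.log('OpenSSL RNG:', fromOpenSSL.toString('hex'));
console.log('kernel RNG: ', fromKernel.toString('hex'));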

MylesBorins commented 8 years ago

I am particularly enjoying the idea that this issue is bubbling all the way down to making (u) random faster in the Linux kernel

ircmaxell commented 8 years ago

@bnoordhuis

That's not true. /dev/urandom is not the only source of entropy, especially when you have egd installed. Even when you don't, there's still a few extra bits of entropy gained by mixing in the results from getpid(), gettimeofday(), etc.

First off, getpid() and gettimeofday() do not provide sufficient randomness for CS purposes. And they only provide one-time entropy for seeding purposes (and like 4 bits of entropy at that).

As far as EGD, that gets its entropy from the kernel. And yes, /dev/urandom is not the only device to access kernel managed entropy. But the fact that other methods exist does not mean that relying on a user-space CSPRNG is a good thing. Especially when you consider that a compromise of the kernel is by definition a compromise of the child processes.

So while there are benefits to a kernel-space CSPRNG (no userland compromise will leak its state), there is no corresponding benefit to a userspace CSPRNG: any compromise it could conceivably defend against is, by definition, one it is also exposed to, so it cannot actually defend against anything.

If OpenSSL switches, we switch - what's so difficult to understand here?

If that's the position of the project, that's your prerogative (all we can do is recommend and try to show you why we believe that stance is wrong). However, based on this thread, it seems like that's not really the position of the project but just your personal view (unless I'm missing something).

ChALkeR commented 8 years ago

Ok, so, one of the possible drawbacks that has been mentioned in https://github.com/openssl/openssl/issues/898#issuecomment-199355257 is /dev/urandom being about 50x slower than AES-128-CTR_DRBG (numbers mentioned there are ~160 Mbps and ~8 Gbps). Not sure how it performs compared to whatever OpenSSL uses internally, though.

Is that critical for us? If yes, how would this proposal deal with it?

paragonie-scott commented 8 years ago

Is that critical for us?

No.

how would this proposal deal with it?

If OpenSSL 1.1 (or later) adopts a RAND_sys_bytes() function as described in the other thread, use that instead of RAND_bytes() and call this closed.

ChALkeR commented 8 years ago

Update: actually, crypto.randomBytes() gives 175 Mbps on my PC, while /dev/urandom gives 157 Mbps.

Those are pretty close, so perhaps switching to /dev/urandom internally won't have a significant performance impact. This has to be tested when implemented, though, and on various platforms. If we care for crypto.randomBytes() throughput, that is.

paragonie-scott commented 8 years ago

If we care for crypto.randomBytes() throughput, that is.

Within reason, you should. If you only need to grab 32 bytes in under a millisecond, an upper limit on bandwidth measured in MB/s vs GB/s isn't likely to make a difference. Conversely, a difference that causes an app to take an extra 0.5s per execution would be a dealbreaker for most people.
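A rough sketch of the two usage patterns being contrasted (the sizes are arbitrary and the timings machine-dependent):

'use strict';
const crypto = require('crypto');

// One-off: a token or key. Microseconds either way, so source speed barely matters.
let t = process.hrtime();
crypto.randomBytes(32);
let d = process.hrtime(t);
console.log('32 bytes:', (d[0] * 1e3 + d[1] / 1e6).toFixed(3), 'ms');

// Bulk: hundreds of megabytes, where MB/s vs GB/s becomes user-visible.
const SIZE = 256 * 1024 * 1024;
t = process.hrtime();
crypto.randomBytes(SIZE);
d = process.hrtime(t);
console.log('256 MiB:', ((SIZE / 1024 / 1024) * 8 / (d[0] + d[1] / 1e9)).toFixed(1), 'Mbps');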

trevnorris commented 8 years ago

I anticipate it will be swiftly closed as WONTFIX.

You must feel silly.

Update: actually, crypto.randomBytes() gives 175 Mbps on my PC, while /dev/urandom gives 157 Mbps.

Can you share how you performed this test? e.g. dd vs node. As I note below, my throughput numbers are very different.

If you only need to grab 32 bytes in under a millisecond, an upper limit on bandwidth measured in MB/s vs GB/s isn't likely to make a difference. Conversely, a difference that causes an app to take an extra 0.5s per execution would be a dealbreaker for most people.

If you need a one-off then grabbing 32 bytes every now and again is fine. My machine can do that in a matter of microseconds using randomBytes(), but as soon as I ask it to grab 512 MB in consecutive 32-byte chunks, performance drops to over 4 ms for every 32-byte packet. I've profiled it to make sure node wasn't the cause; sha1_block_data_order_ssse3 was eating up 25% of the time. Haven't taken the time to dig deep and fully assess where every tick is being spent.

Point here is that throughput is important, and that getting consistently high throughput is more complex than it appears. Relying solely on /dev/urandom would cause a huge performance regression on my machine. Currently I can do ~300 Mbps using randomBytes() gathering 1 MB chunks, but running:

$ time sudo < /dev/urandom dd bs=1048576 count=10 > /dev/null

and I get a mere 96 Mbps (yes I attempted variations and this was the max throughput I could get).

So can we provide a solid argument for switching implementations if the performance impact would be this great (as in, can anyone empirically prove that security will improve with the new implementation, or is it mostly hypothetical at this point)? Or should this discussion be tabled until /dev/urandom performance has been improved?


Here's the quick benchmark I threw together that has several variations:

'use strict';

const crypto = require('crypto');
const print = process._rawDebug;
const SIZE = 1024 * 1024 * 512;

// Variant 1: one synchronous 512 MiB request.
var t = process.hrtime();
crypto.randomBytes(SIZE);
printTime();

// Variant 2 (commented out): 512 MiB as asynchronous 32-byte requests.
//const CHUNK = 32;
//var t = process.hrtime();
//(function runner(n) {
  //if ((n + CHUNK) >= SIZE)
    //return printTime();
  //crypto.randomBytes(CHUNK, () => runner(n + CHUNK));
//}(0));

// Variant 3 (commented out): average latency of a synchronous 32-byte call.
//var t = process.hrtime();
//for (var i = 0; i < 16777216; i++) {
  //crypto.randomBytes(32);
//}
//t = process.hrtime(t);
//print(((t[0] * 1e3 + t[1] / 16777216) / 1e4).toFixed(3) + ' ms/32 bytes');

function printTime() {
  t = process.hrtime(t);
  const mbps = (SIZE / 1024 / 1024) * 8 / (t[0] + t[1] / 1e9);
  print(mbps + ' Mbps');
}

ChALkeR commented 8 years ago

@trevnorris

  1. Just use dd if=/dev/urandom of=/dev/null status=progress. No sudo, no time. It prints the results in MB/s, along with bytes and seconds.
  2. Your test shows 155 Mbps here. I used a similar one, but callback-based, 100 chunks, each 1 MiB in size, directly in repl:

    var start = process.hrtime();
    var next = (i) => crypto.randomBytes(1024 * 1024, () => {
      if (i <= 0) {
        var end = process.hrtime(start);
        console.log(8 * 100 / (end[0] + end[1] * 1e-9));
      } else {
        next(i - 1);
      }
    });
    next(100);
alfiepates commented 8 years ago

(Note: A previous partial version of this was sent out by email, please ignore that. Fuckin' GitHub.)

Hey, I'm gonna chime in for the last time on this thread:

I'm not going to name any names, but the tone of discourse in this thread (and in some cases externally, I've seen some... interesting tweets) risks becoming less than civil. While I'm not trying to derail the conversation towards "how2discuss", I think it's important we keep in mind that we (in my opinion) should have the same common goal: ensuring node is as secure as possible.

Security is in the fairly unique position that compromises are rarely acceptable; while mediocre UI decisions will just lead to a bad user experience, and sloppy optimisation can cause a process to run longer or peg some cores, bad security leads to real-life risk. To be frank: shit security gets people killed. Node.js is deployed in situations where security is important, and therefore Node.js developers have a responsibility to ensure that security is up to scratch.

This isn't about egos, this isn't about "I don't like how you're doing XYZ", this isn't about any of that. This is about risks that come with real-life consequences. There are people commenting in this thread who do security as their primary full-time employment, and who are damn good at it. I'd encourage all of you to look into the backgrounds of those in this thread making suggestions, not as an excuse to discredit their suggestions, but to put a human being behind these words.

I'm gonna duck out of this thread since I don't have all too much to add implementation-wise, but I am more than willing (at a later date) to talk about how to make sure these discussions stay on track in the future.

I appreciate what all of you are doing.

bnoordhuis commented 8 years ago

Update: actually, crypto.randomBytes() gives 175 Mbps on my PC, while /dev/urandom gives 157 Mbps.

@ChALkeR Try it with 3 or 4 instances running concurrently. A big issue with /dev/urandom is that it's a shared resource; it can produce output at a certain rate but that rate gets divided by the number of consumers.

Also, /dev/urandom has seen some scalability improvements over the years but we support kernels all the way back to 2.6.18.
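A rough way to reproduce that experiment from Node itself. This is a sketch only; it assumes a Unix-like system, and it forks N workers that each read from /dev/urandom and report their own throughput:

'use strict';
const { fork } = require('child_process');
const fs = require('fs');

if (process.env.WORKER) {
  // Worker: read 64 MiB from /dev/urandom and report throughput in Mbps.
  const CHUNK = 1024 * 1024;
  const TOTAL = 64 * CHUNK;
  const buf = Buffer.alloc(CHUNK);
  const fd = fs.openSync('/dev/urandom', 'r');
  const t = process.hrtime();
  for (let read = 0; read < TOTAL; read += CHUNK)
    fs.readSync(fd, buf, 0, CHUNK, null);
  const d = process.hrtime(t);
  fs.closeSync(fd);
  console.log('worker', process.pid + ':',
              ((TOTAL / 1024 / 1024) * 8 / (d[0] + d[1] / 1e9)).toFixed(1), 'Mbps');
} else {
  // Parent: launch N concurrent consumers (default 4) and let them race.
  const N = parseInt(process.argv[2], 10) || 4;
  for (let i = 0; i < N; i++)
    fork(__filename, [], { env: Object.assign({}, process.env, { WORKER: '1' }) });
}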

trevnorris commented 8 years ago

@ChALkeR

Just use dd if=/dev/urandom of=/dev/null status=progress

Unfortunately status=progress isn't available. What kernel/dd version are you running?

Your test shows 155 Mbps here. I used a similar one, but callback-based, 100 chunks, each 1 MiB in size

I'm not doubting your numbers, but I'm also letting you know that on my box I'm able to generate at ~300-400 Mbps. This demonstrates that producing reliable throughput benchmarks is more complex than it appears, and that we may run the risk of making it slow enough that it won't be used.

When users read blog posts stating Math.random()'s PRNG has been fixed, they may instead start doing things like the following:

const buf = Buffer(SIZE);
var idx = 0;
while ((idx = buf.writeUInt32LE(Math.random() * (-1 >>> 0), idx)) < SIZE);

Which, by the way, generates data at 1300 Mbps.

trevnorris commented 8 years ago

Ironically, the above while loop tests well for randomness. I ran dieharder on a 512 MB file generated from the while loop above and on a file of /dev/urandom output. Following is a table with both results: the first set is from the while loop, the second from /dev/urandom.

#=============================================================================#
#            dieharder version 3.31.1 Copyright 2003 Robert G. Brown          #
#=============================================================================#
   rng_name         ||     filename                                  ||           filename
 file_input_raw     ||      /tmp/random_data.bin                     ||            /tmp/urandom_data.bin
#==============================================================================================================================#
        test_name   ||ntup| tsamples |psamples|  p-value |Assessment ||ntup| tsamples |psamples|  p-value |Assessment
#==============================================================================================================================#
     dab_bytedistrib||   0|  51200000|       1|0.09731279|  PASSED   ||   0|  51200000|       1|0.19394655|  PASSED
             dab_dct|| 256|     50000|       1|0.76785422|  PASSED   || 256|     50000|       1|0.70144164|  PASSED
       dab_filltree2||   0|   5000000|       1|0.21736330|  PASSED   ||   0|   5000000|       1|0.58799138|  PASSED
       dab_filltree2||   1|   5000000|       1|0.94496905|  PASSED   ||   1|   5000000|       1|0.97227096|  PASSED
        dab_filltree||  32|  15000000|       1|0.49546955|  PASSED   ||  32|  15000000|       1|0.62787597|  PASSED
        dab_filltree||  32|  15000000|       1|0.91901131|  PASSED   ||  32|  15000000|       1|0.80891168|  PASSED
        dab_monobit2||  12|  65000000|       1|0.69102119|  PASSED   ||  12|  65000000|       1|0.05164031|  PASSED
    diehard_2dsphere||   2|      8000|     100|0.45060252|  PASSED   ||   2|      8000|     100|0.45652347|  PASSED
    diehard_3dsphere||   3|      4000|     100|0.27961199|  PASSED   ||   3|      4000|     100|0.49351845|  PASSED
   diehard_birthdays||   0|       100|     100|0.48110599|  PASSED   ||   0|       100|     100|0.19910323|  PASSED
   diehard_bitstream||   0|   2097152|     100|0.99536763|   WEAK    ||   0|   2097152|     100|0.79115717|  PASSED
diehard_count_1s_byt||   0|    256000|     100|0.20484695|  PASSED   ||   0|    256000|     100|0.78195702|  PASSED
diehard_count_1s_str||   0|    256000|     100|0.75812248|  PASSED   ||   0|    256000|     100|0.54848822|  PASSED
       diehard_craps||   0|    200000|     100|0.23976205|  PASSED   ||   0|    200000|     100|0.24913827|  PASSED
       diehard_craps||   0|    200000|     100|0.97521323|  PASSED   ||   0|    200000|     100|0.99211351|  PASSED
      diehard_operm5||   0|   1000000|     100|0.75459685|  PASSED   ||   0|   1000000|     100|0.87415332|  PASSED
 diehard_parking_lot||   0|     12000|     100|0.72516127|  PASSED   ||   0|     12000|     100|0.51744758|  PASSED
  diehard_rank_32x32||   0|     40000|     100|0.04089185|  PASSED   ||   0|     40000|     100|0.55049357|  PASSED
    diehard_rank_6x8||   0|    100000|     100|0.76009470|  PASSED   ||   0|    100000|     100|0.86409689|  PASSED
        diehard_runs||   0|    100000|     100|0.83761343|  PASSED   ||   0|    100000|     100|0.08393408|  PASSED
        diehard_runs||   0|    100000|     100|0.96494601|  PASSED   ||   0|    100000|     100|0.79795717|  PASSED
     diehard_squeeze||   0|    100000|     100|0.21902504|  PASSED   ||   0|    100000|     100|0.45034347|  PASSED
 marsaglia_tsang_gcd||   0|  10000000|     100|0.00061302|   WEAK    ||   0|  10000000|     100|0.00000000|  FAILED
 marsaglia_tsang_gcd||   0|  10000000|     100|0.74541958|  PASSED   ||   0|  10000000|     100|0.91843104|  PASSED
     rgb_kstest_test||   0|     10000|    1000|0.55923481|  PASSED   ||   0|     10000|    1000|0.01354922|  PASSED
      rgb_lagged_sum||   0|   1000000|     100|0.36751921|  PASSED   ||   0|   1000000|     100|0.93757630|  PASSED
rgb_minimum_distance||   0|     10000|    1000|0.00000000|  FAILED   ||   0|     10000|    1000|0.00000000|  FAILED
    rgb_permutations||   5|    100000|     100|0.30443228|  PASSED   ||   5|    100000|     100|0.64178140|  PASSED
         sts_monobit||   1|    100000|     100|0.85916197|  PASSED   ||   1|    100000|     100|0.98321093|  PASSED
            sts_runs||   2|    100000|     100|0.96656242|  PASSED   ||   2|    100000|     100|0.37000645|  PASSED
          sts_serial||  10|    100000|     100|0.17108872|  PASSED   ||  10|    100000|     100|0.78291929|  PASSED
          sts_serial||  10|    100000|     100|0.43352488|  PASSED   ||  10|    100000|     100|0.80215761|  PASSED
          sts_serial||   1|    100000|     100|0.85916197|  PASSED   ||   1|    100000|     100|0.98321093|  PASSED
          sts_serial||  11|    100000|     100|0.56250044|  PASSED   ||  11|    100000|     100|0.46751929|  PASSED
          sts_serial||  11|    100000|     100|0.96681711|  PASSED   ||  11|    100000|     100|0.72513734|  PASSED
          sts_serial||  12|    100000|     100|0.80290859|  PASSED   ||  12|    100000|     100|0.71128922|  PASSED
          sts_serial||  12|    100000|     100|0.97395930|  PASSED   ||  12|    100000|     100|0.84596100|  PASSED
          sts_serial||  13|    100000|     100|0.05460794|  PASSED   ||  13|    100000|     100|0.61265757|  PASSED
          sts_serial||  13|    100000|     100|0.06000085|  PASSED   ||  13|    100000|     100|0.68475462|  PASSED
          sts_serial||  14|    100000|     100|0.24213451|  PASSED   ||  14|    100000|     100|0.22612266|  PASSED
          sts_serial||  14|    100000|     100|0.43303295|  PASSED   ||  14|    100000|     100|0.34819106|  PASSED
          sts_serial||  15|    100000|     100|0.70772349|  PASSED   ||  15|    100000|     100|0.38347196|  PASSED
          sts_serial||  15|    100000|     100|0.94036403|  PASSED   ||  15|    100000|     100|0.79931573|  PASSED
          sts_serial||  16|    100000|     100|0.18413544|  PASSED   ||  16|    100000|     100|0.21397317|  PASSED
          sts_serial||  16|    100000|     100|0.51409929|  PASSED   ||  16|    100000|     100|0.28934397|  PASSED
          sts_serial||   2|    100000|     100|0.86549774|  PASSED   ||   2|    100000|     100|0.34830172|  PASSED
          sts_serial||   3|    100000|     100|0.61116626|  PASSED   ||   3|    100000|     100|0.55936667|  PASSED
          sts_serial||   3|    100000|     100|0.81261243|  PASSED   ||   3|    100000|     100|0.91339662|  PASSED
          sts_serial||   4|    100000|     100|0.04777919|  PASSED   ||   4|    100000|     100|0.10701052|  PASSED
          sts_serial||   4|    100000|     100|0.67123232|  PASSED   ||   4|    100000|     100|0.14444634|  PASSED
          sts_serial||   5|    100000|     100|0.17241503|  PASSED   ||   5|    100000|     100|0.03073699|  PASSED
          sts_serial||   5|    100000|     100|0.73980051|  PASSED   ||   5|    100000|     100|0.86069674|  PASSED
          sts_serial||   6|    100000|     100|0.61426714|  PASSED   ||   6|    100000|     100|0.73662496|  PASSED
          sts_serial||   6|    100000|     100|0.61541617|  PASSED   ||   6|    100000|     100|0.74861872|  PASSED
          sts_serial||   7|    100000|     100|0.37869368|  PASSED   ||   7|    100000|     100|0.18601122|  PASSED
          sts_serial||   7|    100000|     100|0.47181049|  PASSED   ||   7|    100000|     100|0.86052350|  PASSED
          sts_serial||   8|    100000|     100|0.00960323|  PASSED   ||   8|    100000|     100|0.55658027|  PASSED
          sts_serial||   8|    100000|     100|0.34149403|  PASSED   ||   8|    100000|     100|0.99697626|   WEAK
          sts_serial||   9|    100000|     100|0.03105746|  PASSED   ||   9|    100000|     100|0.04533251|  PASSED
          sts_serial||   9|    100000|     100|0.93545391|  PASSED   ||   9|    100000|     100|0.12295836|  PASSED

Even tested well on cacert.org's rng results (search for "v8 4.9.385 Math.random()"). Maybe we should just switch to that. :-P

ChALkeR commented 8 years ago

@trevnorris That would be just another userspace PRNG, which would be an extra single point of failure on top of the system PRNG…

I have not analyzed your dieharder results yet.

dd --version says «dd (coreutils) 8.25». If you don't have status=progress, just launch watch killall -USR1 dd in a separate tab — it will send USR1 signals to dd every 2 seconds, and dd prints progress when it receives a USR1 signal.

trevnorris commented 8 years ago

@ChALkeR I'm using 8.23, and nice trick.

That would be just another userspace PRNG, which would be an extra single point of failure on top of the system PRNG…

Sorry. :-P was meant to imply /s or :trollface:, and that I was surprised Math.random() would have scored such high marks on a rigorous randomness analysis test, and to suggest people may see results like this and feel justified choosing the above while loop over randomBytes() because our implementation is so much slower.

joepie91 commented 8 years ago

It's not uncommon for non-CS PRNGs to score well on randomness analysis. That doesn't make them suitable for crypto however - for that, they don't just have to be indistinguishable from randomness, but also unpredictable and not subvertible.

paragonie-scott commented 8 years ago

Humor is good, but unless the calendar reads April 1, please let's keep the discussion of "let's use Math.random() for crypto" to a minimum.

trevnorris commented 8 years ago

Humor is good, but unless the calendar reads April 1, please let's keep the discussion of "let's use Math.random() for crypto" to a minimum.

Loosen up. There was no serious conversation about using it in core. My point is that if our implementation is so slow it's unusable then users will start resorting to methods like this. That I'm serious about, and history has proven users will take this type of action if our API doesn't work as needed and there's no alternative. Don't rebut that it's on them, because it's our responsibility to make sure our API is acceptable.

It's not uncommon for non-CS PRNGs to score well on randomness analysis. That doesn't make them suitable for crypto however - for that, they don't just have to be indistinguishable from randomness, but also unpredictable and not subvertible.

Then please share how randomness can be properly analyzed to see if it's suitable for cryptographic use.

technion commented 8 years ago

It's worth referring to LibreSSL's rewrite of OpenSSL's functionality. The first thing their getentropy() function tries (under Linux) is to pull from /dev/urandom:

https://github.com/libressl-portable/openbsd/blob/37759485f52993eb812898ccd70fc42cd92cdad5/src/lib/libcrypto/crypto/getentropy_linux.c#L114

paragonie-scott commented 8 years ago

Apparently OpenSSL bit Android too.

joepie91 commented 8 years ago

Looks like a new paper was released regarding OpenSSL's PRNG, a few days ago:

In this work we demonstrate various weaknesses of the random number generator (RNG) in the OpenSSL cryptographic library. We show how OpenSSL's RNG, knowingly in a low entropy state, potentially leaks low entropy secrets in its output, which were never intentionally fed to the RNG by client code, thus posing vulnerabilities even when in the given usage scenario the low entropy state is respected by the client application. Turning to the core cryptographic functionality of the RNG, we show how OpenSSL's functionality for adding entropy to the RNG state fails to be effectively a mixing function. [...]

bnoordhuis commented 8 years ago

@joepie91 I've scanned through the paper. The main premise seems to be that starting from a low entropy state reduces the efficacy of the PRNG? That seems self-evidently true. Node.js calls RAND_poll() at start-up so I think we're immune from that (provided system entropy is reliable.)

I don't really understand his other finding on reducing the entropy from 256 to 240 bits. It relies on the attacker being able to observe the PRNG's intermediate states?

EDIT: Or is the issue that RAND_bytes() uses the pre-existing contents of the output buffer as input for the entropy pool? We disable that in our builds.

paragonie-scott commented 8 years ago

I see this didn't make it into Node 6. Maybe we could push for a true CSPRNG in Node 7?

ChALkeR commented 8 years ago

@paragonie-scott

Btw, this comment by @trevnorris:

My point is that if our implementation is so slow it's unusable then users will start resorting to methods like this.

is similar to the reasons for using /dev/urandom instead of /dev/random; see http://www.2uo.de/myths-about-urandom/, «What's wrong with blocking?».

The reasons not to make it slower here are pretty similar. While a 20% difference would likely be acceptable, making it several times slower would probably be viewed as a regression, and users would resort to various hacks.

ChALkeR commented 8 years ago

@paragonie-scott

I see this didn't make it into Node 6. Maybe we could push for a true CSPRNG in Node 7?

I don't think doing platform detection for the randomness source in core will be agreed on, so this should probably first be pushed up into one of the deps that handles platform abstraction for similar things — e.g. OpenSSL or libuv.

trevnorris commented 8 years ago

I'm still waiting on a response on how to determine whether data is "suitable for crypto", since passing marks on randomness tests apparently aren't good enough.

alfiepates commented 8 years ago

@trevnorris NIST has published a paper on this very topic, and maintains a somewhat easy-to-digest webpage covering CSPRNGs.

But, in my opinion, this should not concern you, since (sensibly implemented) system randomness is designed specifically to abstract this away.

However yes, I would somewhat agree with @ChALkeR in that the platform abstraction should probably be handled by a separate dependency, but that's also something I'm not entirely qualified to comment on so this is where my input ceases.
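As a toy illustration of the kind of statistical check those suites run (massively simplified; as the following comments point out, passing such a check says nothing about cryptographic suitability):

'use strict';
const crypto = require('crypto');

// Toy monobit-style check: the fraction of 1 bits in a large sample should
// hover around 0.5. Real suites (NIST SP 800-22, dieharder) run dozens of far
// more sensitive tests, and even passing all of them proves nothing about
// unpredictability.
const sample = crypto.randomBytes(1024 * 1024);
let ones = 0;
for (const byte of sample) {
  let b = byte;
  while (b) { ones += b & 1; b >>>= 1; }
}
console.log('fraction of 1 bits:', (ones / (sample.length * 8)).toFixed(5));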

trevnorris commented 8 years ago

@alfiepates Thanks for the links. The point is that I'm genuinely curious how Math.random() scores. Nothing actually important. :)

indutny commented 8 years ago

@trevnorris Math.random() is absolutely predictable after observing several values.

trevnorris commented 8 years ago

@indutny sure, i'm just interested to see that for myself after it scored so high on the dieharder tests.

ircmaxell commented 8 years ago

There is no test that will prove if a given function is random. There are only tests that prove it is not random.

Meaning, no test will tell you if you are OK. That's important to keep in mind when talking about crypto and random numbers (specifically CSPRNG).

joepie91 commented 8 years ago

@trevnorris It's important to realize that a CSPRNG shouldn't just be resistant to analysis, but also to targeted attempts at manipulation (eg. by 'poisoning' source entropy). No randomness test will show this. Determining whether random data is 'good enough' is largely a matter of theory.

@bnoordhuis I haven't actually read the paper yet due to time constraints, rather I just figured I'd post it in this thread for its relevance.
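To make that concrete: a keystream produced by AES-256-CTR under a fixed, publicly known key is statistically indistinguishable from random (it would pass dieharder much like the /dev/urandom column above), yet it is completely predictable to anyone who knows the key. A minimal sketch:

'use strict';
const crypto = require('crypto');

// AES-256-CTR keystream under an all-zero key and IV: statistically "random",
// cryptographically worthless, because anyone can regenerate it.
const key = Buffer.alloc(32);   // publicly known
const iv = Buffer.alloc(16);    // publicly known
const cipher = crypto.createCipheriv('aes-256-ctr', key, iv);
const bytes = cipher.update(Buffer.alloc(1024 * 1024)); // 1 MiB of "randomness"
console.log(bytes.slice(0, 16).toString('hex')); // identical on every machine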

carlos8f commented 8 years ago

Just a thought, if someone wanted to be malicious and had publish access to someone's dependency, they could quietly monkey patch:

require('crypto').randomBytes = () => Buffer('pawnd')

and ruin everyone's crypto in one line. maybe there should be a protection against this?

joepie91 commented 8 years ago

@carlos8f Hmm. If that works, that'd work with any dependency, and you wouldn't even need to compromise a third-party one (due to the require cache).

indutny commented 8 years ago

@carlos8f what about running require('child_process').spawn from a dependency? That seems like quite a serious security compromise too. IMO, there is a certain trust that people put into the dependencies they install; this is not something that core is responsible for.

carlos8f commented 8 years ago

@indutny true, but that would be a more noticeable attack. monkey patching the core crypto module shouldn't be allowed imo. at the very least a warning should be spit out to the console.

indutny commented 8 years ago

@carlos8f well, having access to require('fs') is enough to patch node.js binary...

carlos8f commented 8 years ago

@indutny yep, I tried to release a crypto node program the other day and just got laughed at. no one takes node security seriously, especially coming from the security community.

PS i'm trying to revive the conversation on package signing, but seeing crap like this i quietly walk away....

indutny commented 8 years ago

@carlos8f It's not just about node, it is about java, go, python, ruby and almost anything that has a package manager. Any code that wasn't written by the company's own members should be considered insecure, unless proven otherwise.

carlos8f commented 8 years ago

A safer bet would be to adopt what PHP 7's random_bytes() does:

LOL that we're in the year 2016, and PHP is what we're aspiring to be like. And at least in PHP, you can't randomly redefine mission-critical parts of the system like random_bytes() without getting a proper exception in production. node could have better security, but it chooses not to, because security is always someone else's problem. sadly companies don't audit their code, but that's why we hackers stay in business muahaha.

bnoordhuis commented 8 years ago

monkey patching core crypto module shouldn't be allowed imo.

There are legitimate use cases, like instrumentation and performance monitoring.

I tried to release a crypto node program the other day and just got laughed at. no one takes node security seriously, especially coming from the security community.

I wouldn't call /r/crypto a security community. Maybe "a bunch of idiots that know just enough to be dangerous."

In fairness, that applies to most technology-related subreddits with more than a few hundred subscribers. The larger they get, the more they regress to the mean.

trevnorris commented 8 years ago

No randomness test will show this. Determining whether random data is 'good enough' is largely a matter of theory.

This nags at me. Having a proper random generator is so important, yet I've not been able to find a way to solidly, empirically test an implementation, which then makes the "good enough" line subjective and never-ending (e.g. this thread).

llafuente commented 8 years ago

monkey patching core crypto module shouldn't be allowed imo.

There are legitimate use cases, like instrumentation and performance monitoring.

Anyone can freeze that object, so like @bnoordhuis I consider this a job for the user.

I think node could have something to freeze all its security-concern objects/modules/etc., and prevent unwanted modifications... process.freeze()?
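A minimal sketch of the "user freezes it" approach: run it at the very top of the application, before any dependency loads. As noted further down, this only stops naive in-process patching:

'use strict';
// Hypothetical first lines of an application's entry point.
const crypto = require('crypto');
Object.freeze(crypto);

// A dependency loaded later tries to monkey patch it:
try {
  require('crypto').randomBytes = () => Buffer.from('pwned');
} catch (e) {
  // Throws in strict mode; silently a no-op otherwise.
}
console.log(typeof crypto.randomBytes); // still the original function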

alfiepates commented 8 years ago

@trevnorris

This nags at me. Having a proper random generator is so important, yet I've not been able to find a way to solidly, empirically test an implementation, which then makes the "good enough" line subjective and never-ending (e.g. this thread).

Which is why I linked the NIST paper. It's quite hard to tell if a set of numbers is random if all you have is that set of numbers, but if you're given the method by which the set of numbers is generated, you can start to decide whether you're getting randomness or not.

Sure, there's no "magic randomness test box", but the NIST paper does define a set of rules to test CSPRNG implementations.

joepie91 commented 8 years ago

@trevnorris: This nags at me. Having a proper random generator is so important, yet I've not been able to find a way to solidly, empirically test an implementation, which then makes the "good enough" line subjective and never-ending (e.g. this thread).

Aside from what @alfiepates said, this is really not any different from security in general, however much it sucks (and yes, I agree that this is irritating). In real-world code, there are no proofs that code is secure, definitely not through empirical testing - you can only keep trying to find weaknesses, and fix them as soon as you find them, even if they seem small or insignificant. "Vulnerability scanners" are famously useless for this reason.

It's the same reason why for both CSPRNGs and other tooling, the general advice is "use the battle-tested thing that isn't known to be broken", as it is more likely to have approached 'perfect security' than the non-battle-tested options, assuming no explicitly contradictory information is available. It's all you can do, really, and it has to be done rigorously.

joepie91 commented 8 years ago

@llafuente: Anyone can freeze that object, so like @bnoordhuis I consider this a job for the user.

How do you ensure that this happens before anything else gets a chance to tamper with it, though? You'd have to expect the user to explicitly do this at the top level of their application code, and that is going to inevitably be forgotten.

I'd rather see it frozen by default (at least to the point of not allowing to modify existing properties, since eg. promisifying might be necessary), with a runtime flag to allow certain files or modules to bypass this. That would cover instrumentation and performance monitoring, both of which are things that are typically explicitly enabled on runtime in a specific environment.

bnoordhuis commented 8 years ago

That's fake security. One obvious way to circumvent it is to spawn gdb or open /proc/self/mem and flip the protection bits. I'm sure you can think of more.

We're getting off-topic, though.

azet commented 8 years ago

FYI: similar discussion in Ruby - https://bugs.ruby-lang.org/issues/9569

Userspace RNGs are dangerous. Let me reiterate what a couple of people already stated: on Linux you want to seed from /dev/urandom! Ideally you'd go the way e.g. libsodium handles RNGs on different platforms, but I'm not going to dictate anything; reading this thread I wasn't sure if I should laugh or cry.

One core maintainer even suggested "adding entropy" via egd, something I can only describe as "voodoo security". I actually had to Google egd, because I didn't remember:

A userspace substitute for /dev/random, written in perl. [...] egd-0.6 had a major security problem that caused it to only use a tiny fraction of the entropy it gathered. All users should upgrade immediately.

Then I remembered why it exists: a long time ago -- before single-threaded frontend languages were hip to write security-sensitive "microservices" in -- SunOS (i.e. Solaris) lacked a kernel-supplied random number generator. I remember I couldn't compile OpenSSH on this thing because of it and had to resort to some hacked-up rng/entropy gathering daemon back then. I decided to use telnet(1) instead, which in retrospect was a wise choice, because I would have only added security by obscurity. And I didn't connect the machine to the Internet. I was ~14 back then and I liked SPARC a lot. There's one in Rexx as well if you still run OS/2 or a Mainframe to develop useless node dependencies: http://r6.ca/RexxEGD/

As several people have already told you: the RNG in OpenSSL isn't fork safe; since this language depends on forking for performance, it's rather unwise to use a userspace RNG that has known weaknesses. The recent paper on RNG weaknesses in OpenSSL was linked to in the thread already, but again: https://eprint.iacr.org/2016/367.pdf as it seems @bnoordhuis didn't read beyond the abstract or just didn't grasp its meaning. Yes, that'll affect node security. I think it's been fixed (which doesn't mean this release won't be around for ages in legacy systems,..) - yet it demonstrates how a userspace RNG can fail to provide proper security for a language as a whole. Suggesting user-land entropy gathering daemons is just pure insanity. There're so many attack vectors. So you either believe the Kernel you're running on is working properly (otherwise why would you run language X on it anyway - remember Ken Thompson's Turing Award Lecture? If you develop on e.g. a garbage collected, dynamically interpreted language - the same holds true for the OS you're running on. If you compile, in this case via multiple steps, you don't only have to trust your compilers and linkers, you have to trust the operating system they're running on and its libc as well [btw also the klibc]) and may think about using additional security fallbacks like those implemented by libsodium, or you're relying on voodoo-security for your entire ecosystem.

To be honest: I don't really care about node.js and every time I need to install some service that depends on five different module/package managers and installs more than 200 dependencies I'm asking myself why I'm even bothering. But I know people run node in production, so it may be wise to think about security sensitive issues in the core language and library.

bnoordhuis commented 8 years ago

A userspace substitute for /dev/random, written in perl.

I think you are mistaking the program for the protocol.

the RNG in OpenSSL isn't fork safe, since this language depends on forking for performance

What on earth makes you say that? Please get your facts straight. The only time node forks, it's to call execve() immediately afterwards.

The recent paper on RNG weaknesses in OpenSSL was linked to in the thread already, but again: https://eprint.iacr.org/2016/367.pdf as it seems @bnoordhuis didn't read beyond the abstract or just didn't grasp it's meaning.

How about you explain it instead of making disparaging comments? Besides the tone of your post, the factual errors make me reluctant to take it at face value.

azet commented 8 years ago

I think you are mistaking the program for the protocol.

Which protocol? Are you referring to https://www.openssl.org/docs/manmaster/crypto/RAND_egd.html? If so, again; why would you want to add entropy from a userspace source? What exactly are the use-cases?

What on earth makes you say that? Please get your facts straight. The only time node forks, it's to call execve() immediately afterwards.

Well. It's not a traditional fork. child_process has the following methods: exec(), execFile(), fork(), and spawn().

These seem to be tracked via PIDs (are these internal to node.js or actual OS PIDs?). I'd assume the latter, given a subshell is opened from another process, but I'm certainly not as familiar with node.js internals as you are. libuv uses OS PIDs if I remember correctly.

The documentation also contains a waitpid method.

Further down the line, I read a bit of nodejs core code: https://github.com/nodejs/node/blob/master/src/node_crypto.cc#L234-265

// Ensure that OpenSSL has enough entropy (at least 256 bits) for its PRNG.
// The entropy pool starts out empty and needs to fill up before the PRNG
// can be used securely.  Once the pool is filled, it never dries up again;
// its contents is stirred and reused when necessary.
//
// OpenSSL normally fills the pool automatically but not when someone starts
// generating random numbers before the pool is full: in that case OpenSSL
// keeps lowering the entropy estimate to thwart attackers trying to guess
// the initial state of the PRNG.
//
// When that happens, we will have to wait until enough entropy is available.
// That should normally never take longer than a few milliseconds.
//
// OpenSSL draws from /dev/random and /dev/urandom.  While /dev/random may
// block pending "true" randomness, /dev/urandom is a CSPRNG that doesn't
// block under normal circumstances.
//
// The only time when /dev/urandom may conceivably block is right after boot,
// when the whole system is still low on entropy.  That's not something we can
// do anything about.
inline void CheckEntropy() {
  for (;;) {
    int status = RAND_status();
    CHECK_GE(status, 0);  // Cannot fail.
    if (status != 0)
      break;

    // Give up, RAND_poll() not supported.
    if (RAND_poll() == 0)
      break;
  }
}

If you remember the abstract of the paper referenced earlier, that's exactly the issue at hand:

Abstract: In this work we demonstrate various weaknesses of the random number generator (RNG) in the OpenSSL cryptographic library. We show how OpenSSL's RNG, knowingly in a low entropy state, potentially leaks low entropy secrets in its output, which were never intentionally fed to the RNG by client code, thus posing vulnerabilities even when in the given usage scenario the low entropy state is respected by the client application. Turning to the core cryptographic functionality of the RNG, we show how OpenSSL's functionality for adding entropy to the RNG state fails to be effectively a mixing function. If an initial low entropy state of the RNG was falsely presumed to have 256 bits of entropy based on wrong entropy estimations, this causes attempts to recover from this state to succeed only in long term but to fail in short term. As a result, the entropy level of generated cryptographic keys can be limited to 80 bits, even though thousands of bits of entropy might have been fed to the RNG state previously. [...]

https://github.com/nodejs/node/blob/master/src/node_crypto.cc#L268-275:


bool EntropySource(unsigned char* buffer, size_t length) {
  // Ensure that OpenSSL's PRNG is properly seeded.
  CheckEntropy();
  // RAND_bytes() can return 0 to indicate that the entropy data is not truly
  // random. That's okay, it's still better than V8's stock source of entropy,
  // which is /dev/urandom on UNIX platforms and the current time on Windows.
  return RAND_bytes(buffer, length) != -1;
}

https://www.openssl.org/docs/manmaster/crypto/RAND_bytes.html says:

RAND_bytes() returns 1 on success, 0 otherwise. The error code can be obtained by ERR_get_error. RAND_pseudo_bytes() returns 1 if the bytes generated are cryptographically strong, 0 otherwise. Both functions return -1 if they are not supported by the current RAND method.

So I think the check should be return RAND_bytes(buffer, length) == 0; - /dev/urandom is certainly a better option than a broken OpenSSL PRNG, don't you think so? I don't know if node cares about Windows but RAND_poll is incredibly slow due to a heap walk bug: https://rt.openssl.org/Ticket/Display.html?id=2100&user=guest&pass=guest

Let me conclude by saying that every cryptographer and security software engineer I've talked to, and most languages, prefer to seed from urandom directly. Possibly making use of safe and portable constructs like that of libsodium: https://github.com/jedisct1/libsodium/blob/master/src/libsodium/randombytes/sysrandom/randombytes_sysrandom.c

In future versions of the Linux Kernel /dev/urandom will feature a ChaCha20-based DRBG construct similar to that of OpenBSD's arc4random. But these patches are just a few weeks old and there are a couple of people working on further improvements. Anyway, it's safe to use and seed from. May I ask why you're specifically using OpenSSL's RNG? Even some of their maintainers told me "do not use it". Its RNG is also not general-purpose; it's intended to be used for SSL/TLS-specific protocol internals.

Recommended reading material: https://emboss.github.io/blog/2013/08/21/openssl-prng-is-not-really-fork-safe/

bnoordhuis commented 8 years ago

Which protocol? Are you referring to https://www.openssl.org/docs/manmaster/crypto/RAND_egd.html? If so, again; why would you want to add entropy from a userspace source? What exactly are the use-cases?

EGD the protocol is implemented by several programs that collect or distribute entropy, e.g., low-entropy VMs that get their initial state from a hardware RNG on the network.

Well. It's not a traditional fork.

Indeed it is not. OpenSSL not being fork-safe is not relevant because node doesn't fork.

If you remember the abstract of the paper referenced earlier, that exactly the issue at hand:

See https://github.com/nodejs/node/issues/5798#issuecomment-209326548 - if you think my understanding of the paper and why I think node is unaffected is wrong, please explain so in detail.

/dev/urandom is certainly a better option than a broken OpenSSL PRNG, don't you think so?

Context is everything. The code you pasted seeds V8's Math.random() and the hash table seed. If you thought it had anything to do with the crypto module - it does not.

I assume you are aware of this but Math.random() is not cryptographically strong, it's just a simple PRNG. The hash seed is to mitigate hash table collision attacks, primarily for string key collisions in JS objects in dictionary mode.

If node didn't override the defaults, V8 would try to open /dev/urandom but happily continue with almost zero entropy if that fails - which, per the contract of Math.random(), is perfectly acceptable.

The situation on Windows was even worse at the time of writing - processes spawned within one second of each other would generate the same random numbers - and is still not great: it uses rand_s / RtlGenRandom, which does not claim to be strongly random. Again, perfectly acceptable but not exactly great.

Knowing what you know now, wouldn't you agree that OpenSSL's RNG is better?

Let me conclude by saying that every cryptographer and security software engineer I've talked to, and most languages, prefer to seed from urandom directly.

Seeding from /dev/urandom (and /dev/random and /dev/srandom) is exactly what OpenSSL does. Why then do you consider it so broken?

In future versions of the Linux Kernel /dev/urandom will feature a ChaCha20-based DRBG construct similar to that of OpenBSD's arc4random.

I've commented on that before but to reiterate: that doesn't help while we still support 2.6.18 and 2.6.32. Performance regressions on older kernels are not acceptable.

may I ask why you're specifically using OpenSSL's RNG? Even some of their maintainers told me "do not use it".

I'd appreciate it if you could back up that claim with names and places. I've never heard any of the OpenSSL developers publicly state that.