tc39 / proposal-uuid

UUID proposal for ECMAScript (Stage 1)
463 stars 7 forks source link

Math.urandom() in ECMAScript #31

Closed bcoe closed 4 years ago

bcoe commented 4 years ago

It came up both at committee, and in https://github.com/tc39/proposal-uuid/issues/25, that UUID could potentially rely on the introduction of Math.urandom() into ECMA-262.

Conceivably, this API would provide a non-blocking secure source of randomness, similar to urandom.

Some open questions to get this conversation started:

Is urandom suitable for UUIDs?

As a general rule, /dev/urandom should be used for everything except long-lived GPG/SSL/SSH keys.

Where does UUID fall on this spectrum? We would want to make sure we avoid pitfalls like those described in TIFU by using Math.random().

To answer this question with confidence, we should likely be trying to find someone in our networks who is a cryptography expert.

Should we extend this proposal to encompass Math.urandom()?

Does it make sense to flesh out this proposal to include Math.urandom(), or should we write up this proposal separately.

Given that UUID relies on having access to a source of randomness with a long cycle length, I have a preference for fleshing out both UUID and Math.urandom in this proposal.

CC: @bakkot, @sffc

ctavan commented 4 years ago

In this discussion we should also consider whether Math.urandom is a suitable term. When drawing the analogy to linux I would expect Math.random to provide better entropy than Math.urandom. However, so far it appears that Math.urandom is being proposed to overcome the shortcomings of Math.random so the opposite would be true.

bakkot commented 4 years ago

One possibility would be to copy the crypto.getRandomValues API on the web platform, name and all, to make Math.getRandomValues (or whatever).

bcoe commented 4 years ago

@bakkot I have a preference for Math.getRandomValues as well, it's an API surface that works well for this problem space ... folks can then use UUID() for many use-cases, and avoid the lower level API.

ljharb commented 4 years ago

I wonder if we could/should even change Math.random itself to call into this api?

bakkot commented 4 years ago

Historically engines have cared a lot about the performance of Math.random, and I doubt they would be willing to slow it down by making it call this API. It's possible they could be convinced to make it a CSPRNG if we didn't require that it call another API, which would also be very valuable (and would to some extent obviate the need for a new API for randomness), but even just that compromise would be a pretty big ask.

ljharb commented 4 years ago

if the builtins were unmodified, I’d expect perf to be unaffected.

bakkot commented 4 years ago

CSPRNGs are generally slower than non-CS PRNGs, even assuming engines completely optimized out the property access.

bcoe commented 4 years ago

@ljharb @bakkot does urandom have a significant performance hit to a simple PRNG? Perhaps we should just define that Math.random use a similar approach to urandom (use entropy if available, don't ever block).


Having a way to fetch randomness that's not a float is nicer for algorithms like UUIDs though, making getRandomValues a more elegant API for some applications, mind you.

bakkot commented 4 years ago

does urandom have a significant performance hit to a simple PRNG?

I believe so, yes. /dev/urandom uses a CSPRNG under the hood. (Edit: also, it depends on what you mean by "significant". For almost all practical purposes the difference is almost certainly irrelevant, but in certain benchmarks the difference might be large.)

This is perhaps a question for implementors, though.

ljharb commented 4 years ago

Then perhaps this new api could support Math.random’s current mode as well as the new ones, allowing the new method to become the sole source of non-temporal entropy?

bakkot commented 4 years ago

That seems a lot more complicated than just adding Math.getRandomValues. If you care about your randomness being cryptographically good, you almost certainly want bytes, not floats. If you don't, Math.random is fine (and faster).

ljharb commented 4 years ago

I’m thinking of ses use cases; if Math.random can be made to also depend on getRandomValues, then that becomes the only thing that needs to be virtualized.

bakkot commented 4 years ago

Eh. Virtualizing both isn't hard, as long as future APIs which need randomness (like uuid) ultimately derive it from one of the two.

ljharb commented 4 years ago

Totally true :-) just wondering if it would be possible/desirable to go the extra mile.

sffc commented 4 years ago

+1 on including a secure random bytes generator to the same proposal.

It would be good to have implementer feedback on whether making Math.random() depend on Math.getRandomValues() would come at a performance cost.

bcoe commented 4 years ago

@sffc @bakkot given that crypto.getRandomValues is a W3C proposal, what would the process look like for also defining it in ECMA-262?

Has a precedent been set for doing this in the past, with other features? e.g., something really valuable is defined in W3C, and we pull it in to ECMA-262?

broofa commented 4 years ago

I have a few concerns...

First, I think we're in "I got 99 problems" territory with this. Isn't this pretty much the exact opposite of #25, "Avoid introducing a new source of randomness"? This is going to force implementers of testing frameworks to stub out yet-another method.

Sure, this is a more standard, more "core" API but... it's still a different method.

One alternative would be to push the issues raised in #25 onto implementers by modifying the verbiage in this proposal to say something like:

All random values in UUIDs produced by this API must be generated from a cryptographically secure source. If feasible, implementations should call into existing, public RNG sources in a way that allows users to inject their own RNG implementation. (E.g. by calling WebCrypto crypto.getRandomBytes())

Second, I'm concerned about the bureaucratic impact. If Math.urandom() is the best approach (?) then, for selfish reasons, I'd prefer to see it spec'ed w/in the scope of this proposal - it will be simpler and easier to get approved, I expect. But the use cases for Math.urandom() are far broader than just UUIDs. As such, it should really go through an independent process where it's subjected to more comprehensive scrutiny. But if it goes down that track, who knows how long that will take.

Lastly, what exactly does "non-blocking source of randomness" mean? Do we mean "async" (only resolves when the system has sufficient entropy to generate secure values), or do we mean "returns immediately, regardless of available entropy"? It's an important difference. For example, the urandom man page has this caveat:

if there is not sufficient entropy in the entropy pool, the returned values are theoretically vulnerable to a cryptographic attack

This raises an issue that's not specific to Math.urandom(): Can we realistically mandate a cryptographic quality source of randomness that is non-blocking and synchronous? Or should we make the API async to allow for cases where the underlying system needs to await sufficient entropy? This isn't just an academic problem, btw - we've seen real-world issues. (Kind of feeling like we need to create a new issue for discussing this, btw. See #31)

Sorry for dumping all this at once. 'Been feeling a little uneasy about this, and this is the first chance I've had to marshall my thoughts on why this is.

bakkot commented 4 years ago

Isn't this pretty much the exact opposite of #25, "Avoid introducing a new source of randomness"?

Yes and no. I think everyone is fine with introducing one new source of randomness as long as that is explicitly what it's for, and is designed such that other APIs which require randomness (like this one) can be built on top of it.

If feasible, implementations should call into existing, public RNG sources in a way that allows users to inject their own RNG implementation. (E.g. by calling WebCrypto crypto.getRandomBytes())

I'm fairly strongly opposed to this sort of "normative optional" behavior. I think it is very important that the specification define precisely the observable behavior of implementations.

do we mean "returns immediately, regardless of available entropy"?

We mean "returns immediately, regardless of available entropy", I am pretty sure, just as crypto.getRandomValues does. Though in practice JavaScript is rarely run early enough in a device's lifecycle that there might not yet be enough entropy.

For example, the urandom man page has this caveat:

if there is not sufficient entropy in the entropy pool, the returned values are theoretically vulnerable to a cryptographic attack

That misleading caveat in the man pages has been annoying cryptographers for years.

The updated man pages have fixed this by removing that caveat and saying explicitly that "The /dev/random interface is considered a legacy interface, and /dev/urandom is preferred and sufficient in all use cases, with the exception of applications which require randomness during early boot time".

bcoe commented 4 years ago

I think we've addressed this topic well with #33 👍 closing this for now, we can reopen in the future if there's a good reason.