Should `generate_token` not be using `secrets.SystemRandom` instead of `random.SystemRandom`?

amn commented 10 months ago

Looking at https://github.com/lepture/authlib/blob/a6e89f8e6cf6f6bebd63dcdc2665b7d22cf0fde3/authlib/common/security.py#L9, it's apparent random.SystemRandom is used with generation of values of security-sensitive nature. random.SystemRandom is a pseudorandom number generator which must not be used for things like generation of authorization tokens, in the very least. Even the documentation on the random module says as much:

Warning: The pseudo-random generators of this module should not be used for security purposes. For security or cryptographic uses, see the secrets module.

Instead and just like the Python documentation suggests, generate_token at least, should use secrets.SystemRandom which is a so-called cryptographically secure RNG and as such is fit for purpose of token generation.

Otherwise, the tokens (and nonces which are also generated by generate_token) are forgeable.

bjmc commented 10 months ago

random.SystemRandom() is not a pseudorandom generator. The "pseudo-random generators" in question refers to the Mersenne Twister used by default if you call, e.g. random.randint()

In contrast, random.SystemRandom() is a wrapper around os.urandom() that should be a cryptographically-secure source of entropy provided by the operating system.

os.urandom(size, /)

Return a bytestring of size random bytes suitable for cryptographic use.

The short answer to the title question is, "No": what this library is doing is secure. Maybe there's a case to be made that it would be more obvious to call secrets.SystemRandom() than random.SystemRandom() but the secrets module just imports random.SystemRandom() the underlying implementation is identical.

amn commented 10 months ago

~~I was indeed wrong about random.SystemRandom being PRNG, my bad.~~ /dev/urandom is a bit of an oddity in that it is a PRNG, seeded with true random data, quoting man urandom:

When read, the /dev/urandom device returns random bytes using a pseudorandom number generator seeded from the entropy pool.

It's debatable, of course, whether the quality of randomness it produces is to be deemed sufficient for authlib, and it's another debate altogether, I would say.

To come back to this issue, regardless of quality of /dev/urandom in this context, having been well aware that secrets.SystemRandom uses random.SystemRandom under the hood currently, I still think that's long-term a dangerous assumption to rely on, also for the reason that [importing of] modules like random or secrets should communicate intent. Either one of these is free, as part of developing Python, to change how it works under the hood, and which modules it imports in turn, if at all, to continue abiding by the fundamental software contract documented for it and the service it advertises through objects its exports. Is authlib effectively deferring the issue of the module it imports, to when the aforementioned assumptions start breaking? Is it not better to change the code now, at least for readability's sake but also to avoid the issue in the future altogether? And is it not slightly alarming that despite the clear discouragement by documentation of random, authlib, a popular offering dealing in security, chooses to rely on a mechanism which does not, strictly speaking, offer the software contract authlib should demand?

To try and put it plainly, random.SystemRandom does not have to effectively be CSRNG and is free to cease to be one, while authlib by nature should require a CSRNG, which secrets.SystemRandom is bound to offer.

bjmc commented 10 months ago

The security of /dev/urandom is almost certainly beyond the scope of this library.

As to the question of secrets.SystemRandom vs random.SystemRandom I think it's purely a matter of opinion. The secrets module has existed only since Python 3.6, so to me, as a question of readability random.SystemRandom is more familiar and obvious, but I don't feel that strongly about it, and others might disagree.

I think your concerns about possible future changes to the Python standard library are probably unfounded. The Python authors are surely well aware that random.SystemRandom() is used for cryptographic applications, and they would not change it to a less-secure implementation.

lepture commented 10 months ago

I'll close this issue for now, as secrects.SystemRandom is the same with random.SystemRandom. However, we may use secrets.token_urlsafe directly later in state generator or other things.

lepture / authlib

Should `generate_token` not be using `secrets.SystemRandom` instead of `random.SystemRandom`? #588