Open joboet opened 2 months ago
Disclaimer: I am one of the getrandom
developers.
I think it's important for RandomSource
methods to properly return potential errors. Getting randomness is an IO operation and it may fail. In some context it's important to process such errors instead of panicking. The error may be either io::Error
or something like getrandom::Error
(i.e. a thin wrapper around NonZeroU32
).
It may be worth to add the following methods to RandomSource
:
fill_bytes
which works with uninitialized buffers, e.g. based on BorrowedBuf
. Yes, zeroization of buffers is usually a very small cost compared to a syscall, but it still goes against the zero-cost spirit.u32
and u64
. Some platforms support direct generation of such values (e.g. RDRAND, WASI, etc.). Going through fill_bytes
will be a bit less efficient in such cases.HashMap
seeding is the most obvious use-case for this.It's also not clear whether it's allowed to overwrite the default RandomSource
supplied with std
similarly to GlobalAlloc
.
Would you consider https://github.com/rust-lang/libs-team/issues/159 to be a better solution? That one used the Read
trait to fulfil everything you mention.
No, I don't think it's an appropriate solution. Firstly, it relies on io::Error
, while IIUC you intend for this API to be available in core
. Secondly, it does not provide methods for generation of u32
and u64
. As I wrote, going through the byte interface is not always efficient. Finally, most of io::Read
methods are not relevant here.
For the last point I guess we could define a separate DefaultInsecureRandomSource
type.
Firstly, it relies on io::Error, while IIUC you intend for this API to be available in core.
I don't think this needs to be a blocker. IMO a lot of std::io
should be moved into core
--not the OS-specific implementations obviously, but all the cross-platform things like type definitions, same as what happened with core::net
.
fn random(source: &mut (impl RandomSource + ?Sized)) -> Self;
Would it be better to make this explicit with generics? Like fn random<R: RandomSource + ?Sized>(source: &mut R) -> Self;
. This gives user the ability to specify the type name when needed.
I think it's important for RandomSource methods to properly return potential errors.
I (mostly) disagree. CSPRNGs should almost never fail. When they do, users are almost never not qualified to diagnose the problem.
For example: https://github.com/golang/go/issues/66821
A compromise is something like this:
trait RandomSource {
type Error;
fn fill_bytes(...) {
self.try_fill_bytes(...).unwrap();
}
fn try_fill_bytes(...) -> Result<..., Self::Error>
}
This allows most CSPRNGs to use Error = Infallible
, but still has support for weird HSMs, etc.
@ericlagergren
I've assumed this trait is for a "system" RNG, which will work together with a #[global_allocator]
-like way to register implementation. I don't think that we need a general RNG trait in std
/core
as I wrote in this comment.
As for design of fallible RNG traits, see the new rand_core
crate.
pub trait Random { fn random(source: &mut (impl RandomSource + ?Sized)) -> Self; }
This trait (and the topic of random value generation) should be removed from this discussion entirely in my opinion, focussing only on "secure random data generation" as in the title. Why: because (1) provision of secure random data is an important topic by itself (with many users only wanting a byte slice and with methods like
from_ne_bytes
already providing safe conversion) and (2) because random value generation is a whole other topic (including uniform-ranged samples and much more).
Disclaimer: I am one of the rand
developers. rand
originally had a similar trait which got removed; the closest surviving equivalent is StandardUniform
.
@newpavlov
It's also not clear whether it's allowed to overwrite the default
RandomSource
supplied withstd
similarly toGlobalAlloc
.
Overriding the default source in an application that already has one from linking std
seems questionable. It's not that I can't imagine any use case for it, but the established pattern for such overrides allows any crate in the dependency graph to do it (it's only an error if you link two such crates), instead of putting the leaf binary/cdylib/staticlib artifact in charge. As you articulated in the context of getrandom
, that's a security risk for applications. So a RandomSource
equivalent should probably be more restrictive in who can override it, but that's not the existing pattern. It also doesn't seem to fit with the proposed generalization of that pattern via "externally implementable functions" (https://github.com/rust-lang/rfcs/pull/3632) -- if that RFC is accepted, any new API surface should use it instead of adding new one-off mechanisms.
If overriding the std
source isn't supported, then it could work the same way as #[panic_handler]
: you must supply an implementation if you don't link std
, but if you do link std
then supplying your own is an error. This would still be extremely useful. Currently, every crate that's (optionally) no_std
and needs some randomness, most commonly for seeding Hasher
s, has to cobble together some sub-par ad-hoc solution to try and get some entropy from somewhere. There's a bunch of partial solutions that are better than nothing (const-random, taking addresses of global/local variables and praying that there's some ASLR, a global counter when atomics are available, cfg
-gated access to target-specific sources like CPU cycle counters or x86 RDRAND
) but:
wasm32-unknown-unknown
and wasm32v1-none
targets .This wouldn't be a problem if the entire ecosystem could agree to always delegate this problem to on one specific crate (version) with appropriate hooks, like getrandom
, but evidently that's not happening. Putting this capability into core
(or a new no_std
sysroot crate, comparable to alloc
) has a better chance of solving this coordination problem. Well, at least eventually, once everyone's MSRV has caught up.
Edit: almost forgot that even std::collections::Hash{Map,Set}
depend on having a source of random seeds. A way to supply such a source without linking std
could help with moving those types to alloc
, although as https://github.com/rust-lang/rust/issues/27242#issuecomment-1549577935 points out, it's not backwards compatible to make such a source mandatory for no_std
+ alloc
applications.
@hanna-kruppe
Overriding the default source in an application that already has one from linking std seems questionable.
There is a number of reasons to allow overriding: 1) An alternative interface may be more efficient than the default one (e.g. reading the RNDR register vs doing syscall) 2) It may help reduce binary size and eliminate potentially problematic fallback paths (e.g. if you know that you do not need the file fallback on Linux) 3) In some cases it's useful to eliminate non-deterministic inputs (testing, fuzzing)
So a RandomSource equivalent should probably be more restrictive in who can override it, but that's not the existing pattern.
Yes. How about following the getrandom
path and allow override only when a special configuration flag is passed to the compiler?
Either way, overriding is probably can be left for later. I think we both agree that we need a way to expose "system" entropy source in std
and a way to define this source for std
-less targets.
It also doesn't seem to fit with the proposed generalization of that pattern via "externally implementable functions" -- if that RFC is accepted, any new API surface should use it instead of adding new one-off mechanisms.
I agree that ideally we need a unified approach for this kind of problem. I made a similar proposal once upon a time.
But I think it fits fine? Targets with std
could implicitly use std_random_impl
crate for "external implementation" of the getrandom
-like functions and users will be able to override it in application crates if necessary.
How well this works ends up highly platform-specific, in particular none of them work well for wasm32-unknown-unknown and wasm32v1-none targets .
I believe that having std
for wasm32-unknown-unknown
was a big mistake in the first place and the wasm32v1-none
target is a good step in the direction of amending it. So I hope we will not give too much attention to its special circumstances.
This wouldn't be a problem if the entire ecosystem could agree to always delegate this problem to on one specific crate (version) with appropriate hooks, like getrandom, but evidently that's not happening.
Well, it has happened, sort of. getrandom
is reasonably popular in the ecosystem even after excluding rand
users.
The problem is that std
already effectively includes its variant of getrandom
for HashMap
seeding and people reasonably want to get access to that. And I think problem of getting "system" entropy is fundamental enough for having it in std
(well, not in the std
per se, let's say in the sysroot crate set).
A way to supply such a source without linking std could help with moving those types to alloc, although as https://github.com/rust-lang/rust/issues/27242#issuecomment-1549577935 points out, it's not backwards compatible to make such a source mandatory for no_std + alloc applications.
Can we add yet another sysroot crate for HashMap
which will depend on both alloc
and the hypothetical "system entropy" crate?
But I think it fits fine? Targets with
std
could implicitly usestd_random_impl
crate for "external implementation" of thegetrandom
-like functions and users will be able to override it in application crates if necessary.
The RFC (and the competing ones I've looked at) only supports a default implementation in the crate that "declares" the externally-implementable thing. If that crate isn't std
, then an implementation from std
would not count as "default" but conflict with any other definition. So we'd need another special carve-out for std
(the very thing we'd want to avoid by adding a general language feature), or the language feature needs to become much more general to support overrideable default implementations from another source.
I believe that having
std
forwasm32-unknown-unknown
was a big mistake in the first place and thewasm32v1-none
target is a good step in the direction of amending it. So I hope we will not give too much attention to its special circumstances.
I was specifically talking about no_std
libraries, for which the two targets are basically equivalent. Both don't have any source of entropy implied by the target tuple (instruction set, OS, env, etc.), and if you want to add one it'll have to involve whatever application-specific interface the wasm module has with its host.
Well, it has happened, sort of.
getrandom
is reasonably popular in the ecosystem even after excludingrand
users.
Not to point any fingers but a counter example that's fresh on my mind because I looked at its code recently is foldhash. As another example, ahash
only uses getrandom
optionally (though it's on by default). If you're only using ahash
indirectly through another library that disables the feature, then it's not gonna use getrandom
unless you happen to notice this and add a direct dependency to enable the feature. In that case there is a solution, at least, but it's still not discoverable.
Can we add yet another sysroot crate for
HashMap
which will depend on bothalloc
and the hypothetical "system entropy" crate?
Possibly, but people may object to a proliferation of sysroot crates so let's hope there's a better solution.
or the language feature needs to become much more general to support overrideable default implementations from another source.
Yes, it's not as if RFC is a technical specification which must be followed word-by-word. There is a number of cases where the original RFC vision has somewhat changed during implementation stages. If anything, I would say it's an oversight/deficiency of the RFC to not cover cases like this.
foldhash
If a crate aims to minimize its number of dependencies as far as possible even at the cost of code quality and security, it obviously will not depend on getrandom
, despite it being the de facto standard for getting system entropy. I think most people will agree that hacks like this have a strong smell. The same applies to fastrand
(amusingly, it still uses getrandom
for Web WASM) and other "partial solutions" listed by you. As you can notice, both hack their way into using std
to get system entropy and either use a fixed seed or pile even more hacks when std
is not available.
As I said, I have no intention of pointing fingers at any crates. They have to navigate tricky trade-offs and complexities due to Rust's standard library (as a whole, not just std
) not yet providing any entropy access. Let's keep this issue focused on changing that.
Feature gate:
#![feature(random)]
This is a tracking issue for secure random data generation support in
std
.Central to this feature are the
Random
andRandomSource
traits insidecore::random
. TheRandom
trait defines a method to create a new random value of the implementing type from random bytes generated by aRandomSource
.std
also exposes the platform's secure random number generator via theDefaultRandomSource
type which can be conveniently access via therandom::random
function.Public API
Steps / History
Unresolved Questions
gen_bytes
andDefaultRng
, the implementation PR usesfill_bytes
andDefaultRandomSource
(see arguments progen_bytes
and profill_bytes
)