rust-random / getrandom

A small cross-platform library for retrieving random data from (operating) system source
Apache License 2.0
264 stars 166 forks source link

Port from libc to rustix #401

Open notgull opened 4 months ago

notgull commented 4 months ago

rustix is a crate which provides I/O-safe wrappers around raw system calls on Linux and libc on other Unixes. By using it we can use raw syscalls instead of libc calls, which has a number of performance, safety and code size advantages.

I am wiling to write a PR for this; however there are a couple of items of discussions we have to go over first.

newpavlov commented 4 months ago

I considered it (personally, I like rustix and use it instead of libc in my projects), but there are several issues:

So we probably will not be migrating to rustix at the very least until hypothetical getrandom v0.3.

notgull commented 4 months ago

For your last point, getrandom_uninit is exposed in rustix now.

For the rest, I agree. We’ll have to wait.

newpavlov commented 4 months ago

getrandom_uninit is exposed in rustix now.

Ah, good to know!

josephlr commented 2 months ago

@notgull is it the intent of rustix to support all target_family = unix Rust targets? We try to support every rust target, and I would be a pain to deal with both rustix and libc crates.

We could maybe switch just the use_file.rs code to use rustix, which would require Linux, Android, Haiku, Redox, NTO and AIX support.

newpavlov commented 2 months ago

I think it makes sense to use rustix only on Linux and Android for the raw syscall feature. For other targets it's essentially a glorified libc wrapper. We probably can limit its use only to the getrandom path and make it a (disabled by default) crate feature.

notgull commented 2 months ago

@notgull is it the intent of rustix to support all target_family = unix Rust targets? We try to support every rust target, and I would be a pain to deal with both rustix and libc crates.

rustix supports all of the Unix targets, as well as some of the other targets, like Windows (through winsock).

We could maybe switch just the use_file.rs code to use rustix, which would require Linux, Android, Haiku, Redox, NTO and AIX support.

Yes, I think this would work.

I think it makes sense to use rustix only on Linux and Android for the raw syscall feature. For other targets it's essentially a glorified libc wrapper. We probably can limit its use only to the getrandom path and make it a (disabled by default) crate feature.

This sounds like a pain to manage, especially when rustix is present enough in the ecosystem that you're probably going to import it anyways.

newpavlov commented 2 months ago

This sounds like a pain to manage, especially when rustix is present enough in the ecosystem that you're probably going to import it anyways.

We are unlikely to use rustix by default and, as noted above, I don't see reasons to use it on non-Linux platforms in our case. So rustix-based code is likely to be behind a feature flag. Enabling the hypothetical linux-raw feature will be no more painful than enabling optional rustix dependency.

briansmith commented 1 month ago

I looked more into rustix:

notgull commented 4 weeks ago

It doesn't matter for us as we don't usegetauxval() for anything, but it means that we need to be careful about

Any particular reason for this?

  • Perhaps we should file a bug (against rustix? and/or against linux-raw-sys?) for them to add it.

There is already a common function for initializing a written-to buffer. So hopefully we can just add the "unpoison" call there.

  • The first part seems pretty reasonable. I am not sure why 1.63 would be a problematic MSRV for getrandom but I do see the concern raised above.

IIRC The main reason is because Debian Stable is considered the "low water mark" for what we reasonably need to support. I'm not sure what the lowest version this crate needs to support it.

It should also be mentioned that the MSRV is bumped to v1.64 if the "std" feature on Rustix is disabled.

josephlr commented 4 weeks ago

@notgull thanks for demoing how an rustix implementation might work (#470). From what I understand, this would still use libc on non aarch64/arm/x86_64/x86/riscv targets. This is understandable, and is related to https://github.com/rust-random/getrandom/issues/424#issuecomment-2153782146 where I noted we couldn't (yet) give up our use of libc::syscall.

Looking at that implementation, I think a lot of the complexity comes from the rustix dependency being optional. If we were going to use rustix, it wouldn't be an optional dependancy; it would entirely replace our use of libc on Linux. As for the use_file implementation, I think a better approach is to use the standard library to provide such an implementation (see #453).

However, I don't think its very likely we would use the rustix backend (at least for now). Given that we would still need to depend on libc for almost all targets (due to libc::syscall), it seems easier for us and our users to depend on libc directly.

notgull commented 4 weeks ago

However, I don't think its very likely we would use the rustix backend (at least for now). Given that we would still need to depend on libc for almost all targets

I've modified #470 to completely remove libc from the Linux backend. It also makes rustix non-optional on Linux and Linux only.

briansmith commented 3 weeks ago

See also https://github.com/bytecodealliance/rustix/blob/3346526711e885611c770ad09764b8ff8d2924ce/build.rs#L22-L27, which gives some ideas about what targets are not supported by Rustix but which are (perhaps accidentally) already supported by getrandom.

briansmith commented 3 weeks ago

As for the use_file implementation, I think a better approach is to use the standard library to provide such an implementation (see https://github.com/rust-random/getrandom/issues/453).As for the use_file implementation, I think a better approach is to use the standard library to provide such an implementation (see https://github.com/rust-random/getrandom/issues/453).

We really should stop forcing users to depend on a pthreads mutex implementation. @notgull did add a vendored implementation from libstd of a futex-based Mutex in their PR, rewritten to use Rustix, which is awsome. OTOH, IDK that getrandom wants to maintain a fork/vendoring of that Mutex.

I think there is tension between getrandom, which wants to be as narrowly-scoped as possible (basically one syscall or equivalent) across every OS, and Rustix, which wants to have a very broad scope (the entire kernel<->userspace API, basically), across a small range of targets.

I think the intersection of those interests that isn't currently being met is in supporting --linux-none and similar targets without a libc. We already have a PR in #461 that addresses that for x86_64-*-linux-none, which is the only such target we know about. That PR is easy to "scale" across target architectures, even ones that rustix doesn't support, because we basically just need to gather all the architecture -> syscall number mappings from the libc crate. OTOH, it seems difficult (for us) to port Rustix to every architecture that Rust supports for Linux.

notgull commented 3 weeks ago

OTOH, IDK that getrandom wants to maintain a fork/vendoring of that Mutex.

That implementation is mostly taken from rustix-futex-sync. Unfortunately that crate has dependencies other than Rustix at the moment, which would make me hesitate to include it here. See sunfishcode/rustix-futex-sync#10.

I think there is tension between getrandom, which wants to be as narrowly-scoped as possible (basically one syscall or equivalent) across every OS, and Rustix, which wants to have a very broad scope (the entire kernel<->userspace API, basically), across a small range of targets.

This tension exists in libc as well, I'd say.

We already have a PR in #461 that addresses that for x86_64-*-linux-none, which is the only such target we know about. That PR is easy to "scale" across target architectures

There's a lot of subtlety in the details when it comes to syscalls, especially across multiple architectures. For instance, there's different conventions on x86 depending on the host CPU, not to mention the whole vDSO thing. It gets tricky once you get down in the weeds.

As mentioned here, it is tricky to get system calls right, especially across architectures. While @rust-random might have the resources to maintain this, other crates will not. It would be a bad precedent to set, especially since getrandom is one of the most downloaded crates and therefore has high visibility.

OTOH, it seems difficult (for us) to port Rustix to every architecture that Rust supports for Linux.

rustix is in use by many different parties, so it's not like @rust-random will be solely on the hook for porting rustix to new architectures. I'm involved in rustix's maintenance as @smol-rs and @rust-windowing depend upon it. Many other parties depend on rustix as well.

newpavlov commented 3 weeks ago

This tension exists in libc as well, I'd say.

libc provides only raw bindings to the functions without any wrapping on top. Effectively, it's just a glorified header file. And even then, on many targets we do our own extern definitions without relying on "platform" crates.

There's a lot of subtlety in the details when it comes to syscalls, especially across multiple architectures.

I wouldn't say it's "a lot". The x86 case is the main stumbling block. And if we are fine with the subpar performance of int 0x80, then even it becomes relatively straightforward.

Also, this is why I strongly argue that raw syscall functions should be exposed by linux-raw-sys or a similar crate. If it will not be done by rustix developers, we can do it either as part of getrandom or in our own crate.

I think a good way forward may be to unconditionally use raw syscalls on Linux for the getrandom syscall, while the fallback code will continue to use libc. By enabling the linux-disable-fallback crate feature getrandom will be libc-free. Technically, the crate itself will still depend on libc, but resulting binary code will not be linked to it.

notgull commented 3 weeks ago

I think a good way forward may be to unconditionally use raw syscalls on Linux for the getrandom syscall, while the fallback code will continue to use libc. By enabling the linux-disable-fallback crate feature getrandom will be libc-free. Technically, the crate itself will still depend on libc, but resulting binary code will not be linked to it.

As I've demonstrated, rustix is usable in this position as well.

I wouldn't say it's "a lot". The x86 case is the main stumbling block. And if we are fine with the subpar performance of int 0x80, then even it becomes relatively straightforward.

My main issue here, as I've stated above, is that this sets an awful precedent. @rust-random may be able to properly maintain syscall bindings, but other, smaller crates won't be able to.

notgull commented 3 weeks ago

That being said, pragmatically I would prefer to have a "raw syscall" backend for getrandom than being locked into libc. So I would be reluctantly willing to investigate a "raw syscall" crate for this reason.