bytecodealliance / rustix

Safe Rust bindings to POSIX-ish APIs

Move syscall functions and macro from `linux_raw/arch` to the `linux-raw-sys` crate? #1055

Open newpavlov opened 7 months ago

newpavlov commented 7 months ago

In some cases it would be nice to be able to call syscalls directly using the linux-raw-sys crate without relying on wrappers in rustix.

sunfishcode commented 7 months ago

I've so far resisted this idea on the theory that for any syscall you'd use this with, it'd be better to add a wrapper to rustix and use that.

There are a lot of syscalls with non-trivial considerations. Some syscalls have y2038 bugs or other anachronisms. Some syscalls, like rt_sigaction, seem like they should be usable from regular Rust code, but break assumptions made by popular libc implementations. And some syscalls, like vfork, maybe can't be used by Rust code at all. Part of what rustix is doing is adding a level of diligence around identifying safety invariants and conflicts with Rust semantics, std assumptions, and libc assumptions that arise with many Linux syscalls.

And, while it's more work to add a wrapper to rustix just to be able to invoke the one syscall you need in a given situation, it helps the ecosystem when it turns out that multiple people want to invoke the same syscall.

That said, I'm open to discussing other approaches.

newpavlov commented 7 months ago

> There are a lot of syscalls with non-trivial considerations.

You are right, but all this stuff is "handled" by marking the syscall functions unsafe. It's worth directing users towards rustix instead of raw syscalls, but that does not mean it's not worth exposing the raw syscall functions, similarly to how we often have "raw" *-sys crates and safe wrapper crates on top of them.

The main reason why I would like to have the raw syscall functions is for adding a "raw syscall" backend to the getrandom crate (see https://github.com/rust-random/getrandom/issues/401). We could use getrandom_uninit exposed by rustix, but rustix is a pretty sizable dependency which pulls several other dependencies, pulling it for 1 simple syscall looks a bit excessive.

Exposing syscall functions can also be useful for other people who would like to experiment with unsafe low-level stuff or with different approaches to wrapping the Linux syscall API.

sunfishcode commented 6 months ago

If rustix is seen as excessive, then it sounds to me like we just shouldn't do https://github.com/rust-random/getrandom/issues/401 at this time. Using a raw syscall API in getrandom would be less safe than just using libc.

I'm not eager to be a maintainer of a public general-purpose raw-syscall API. It's not just unsafe, it's extra-unsafe. It's implausible to even imagine a # Safety comment for it, because it can do so many different things.

If Rust wants to work towards using raw syscalls for more things in general, my preference would be to focus on improving rustix, such as by adding more cfgs to further reduce its compile time, rather than spreading bits of raw syscall knowledge into general-purpose crates around the ecosystem.

If people want to experiment with alternate ways of wrapping the Linux syscall API, that's great. I don't expect that forking the code in rustix (if that's what they want) would be the hard part of such experiments. If people come up with something that works well enough that they want to use it in practice, then great, let's talk about how to migrate things off of rustix, or how best to factor rustix to best share code, or whatever the situation calls for.

newpavlov commented 6 months ago

> Using a raw syscall API in getrandom would be less safe than just using libc.

Assuming the syscall functions are implemented correctly, how so? We already use libc::syscall in the module used by default on Linux (we do it because libc::getrandom may not be available on older versions of glibc).

> I'm not eager to be a maintainer of a public general-purpose raw-syscall API. It's not just unsafe, it's extra-unsafe. It's implausible to even imagine a # Safety comment for it, because it can do so many different things.

But this fully applies to libc::syscall as well. And no one argues it should be removed from the libc crate.

Feel free to close this issue and the linux-raw-sys PR, though I think it's a real shame that we have no practical choice but to use libc for raw syscalls, rendering many use cases for raw syscalls immediately meaningless...
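
For context, the `libc::syscall` route mentioned above can be sketched without any crate dependency by declaring libc's variadic `syscall` function by hand. This is a minimal sketch, not getrandom's actual code: the name `getrandom_via_libc_syscall` is hypothetical, the syscall numbers are hard-coded for x86_64 and aarch64 Linux only, and the `unsafe extern` block assumes Rust 1.82 or later.

```rust
use std::ffi::{c_long, c_uint, c_void};
use std::io;

// Hand-written declaration of libc's variadic `syscall(2)` wrapper, so the
// sketch needs no external crate. The `libc` crate exposes the same function
// as `libc::syscall`.
unsafe extern "C" {
    fn syscall(num: c_long, ...) -> c_long;
}

// Syscall numbers differ per architecture. Only two are listed here
// (assumption: the target is x86_64 or aarch64 Linux).
#[cfg(all(target_os = "linux", target_arch = "x86_64"))]
const SYS_GETRANDOM: c_long = 318;
#[cfg(all(target_os = "linux", target_arch = "aarch64"))]
const SYS_GETRANDOM: c_long = 278;

/// Hypothetical helper: fill `buf` using getrandom(2) through libc's
/// generic `syscall` entry point.
pub fn getrandom_via_libc_syscall(buf: &mut [u8], flags: c_uint) -> io::Result<usize> {
    // SAFETY: the pointer/length pair describes a live, writable buffer,
    // which is what getrandom(2) requires of its first two arguments.
    let ret = unsafe {
        syscall(
            SYS_GETRANDOM,
            buf.as_mut_ptr() as *mut c_void,
            buf.len(),
            flags,
        )
    };
    if ret < 0 {
        // libc's wrapper returns -1 and stores the error code in errno.
        Err(io::Error::last_os_error())
    } else {
        Ok(ret as usize)
    }
}
```

With a 32-byte buffer and `flags` set to 0, the call is expected to fill the whole buffer once the kernel's entropy pool is initialized. The safety comment only makes sense because the syscall number is fixed; the generic `syscall` function itself cannot document one contract for every number it accepts, which is the point both sides of the thread agree on.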

sunfishcode commented 6 months ago

I didn't realize getrandom was already using libc::syscall. That's an unfortunate situation, though I can see the practical reasons for it now.

The way I look at it, if this was just about getrandom, maybe it'd be ok (ignoring the 32-bit x86 issue for the moment). But if every crate that just needs 1 or 3 or 5 syscalls does its own raw syscalls, then we end up in an ecosystem in which a set of concerns that previously almost no one in the Rust universe needed to know about becomes spread out around many different crates and harder to audit. And we get an ecosystem where it's harder for users to opt out of raw syscalls and use libc for everything if they wish to (and there are several reasons people do this in practice). And it's harder to port Rust code to new CPU architectures (getrandom may not be affected by this, but other syscalls are).

That's not an ecosystem I want to design for.

newpavlov commented 6 months ago

It's not only about getrandom. I mentioned it in the getrandom issue, but the io-uring crate also does its own raw syscalls. And its solution is... subpar. I will not be surprised if we can find a bunch of other examples in the ecosystem.

I am sympathetic to your concerns, but maybe let's leave this decision to crate developers? If raw syscalls are not exposed by linux-raw-sys, developers will either use alternative crates like sc (as io-uring does), or roll their own inline asm. I think this would be a worse situation to be in than the one you describe.

Hopefully, with development of *-linux-none targets, the need for separate "raw syscall" crate features will diminish with time.
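
For comparison, a hand-rolled inline-asm syscall of the kind mentioned above might look roughly like this on x86_64 Linux. It is a sketch under stated assumptions: `syscall3` and `getrandom_via_asm` are hypothetical names, the register assignments follow the x86_64 Linux syscall convention, and nothing here is taken from rustix, linux-raw-sys, sc, or io-uring.

```rust
use std::arch::asm;

/// Issue a three-argument Linux syscall (x86_64 only).
///
/// # Safety
/// The caller must uphold whatever contract syscall number `nr` places on
/// its arguments. Nothing more specific can be said at this level.
#[cfg(all(target_os = "linux", target_arch = "x86_64"))]
pub unsafe fn syscall3(nr: usize, a0: usize, a1: usize, a2: usize) -> isize {
    // rax carries the syscall number in and the result out.
    let mut rax = nr as isize;
    unsafe {
        asm!(
            "syscall",
            inlateout("rax") rax,
            in("rdi") a0,
            in("rsi") a1,
            in("rdx") a2,
            // The `syscall` instruction overwrites rcx and r11.
            lateout("rcx") _,
            lateout("r11") _,
            options(nostack),
        );
    }
    rax
}

// getrandom's number on x86_64; other architectures use other numbers.
#[cfg(all(target_os = "linux", target_arch = "x86_64"))]
const SYS_GETRANDOM: usize = 318;

/// Hypothetical typed wrapper: returns the byte count, or the raw errno
/// value on failure.
#[cfg(all(target_os = "linux", target_arch = "x86_64"))]
pub fn getrandom_via_asm(buf: &mut [u8], flags: u32) -> Result<usize, i32> {
    // SAFETY: the pointer/length pair describes a live, writable buffer.
    let ret = unsafe {
        syscall3(SYS_GETRANDOM, buf.as_mut_ptr() as usize, buf.len(), flags as usize)
    };
    if ret < 0 {
        // Without libc there is no errno variable; the kernel returns the
        // negated error code directly.
        Err(-ret as i32)
    } else {
        Ok(ret as usize)
    }
}
```

Every other architecture would need its own asm block, clobber list, and syscall number, which is the porting and auditing cost raised earlier in the thread. The typed wrapper is also where a meaningful safety argument becomes possible, since it pins down one syscall and one argument shape.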

sunfishcode commented 6 months ago

I guess I had imagined the x86_64-unknown-linux-none conversation going in a different direction. In the original PR creating the x86_64-unknown-linux-none target, @morr0ne talked about using rustix for most or even all I/O, such that most applications would already have rustix in their dependency tree, and the cost (compile time, auditing) of each new crate that starts using it would be low. I imagined we'd be doing this just for linux-none, and not changing any other target (unless it makes sense to anyway). I think a lot of the topics being discussed make more sense when that's the big picture.

newpavlov commented 6 months ago

> In the original PR creating the x86_64-unknown-linux-none target, @morr0ne talked about using rustix for most or even all I/O, such that most applications would already have rustix in their dependency tree

I hope that eventually the none targets will have a proper std. It may be implemented using rustix, but rustix will not be explicitly in the dependency tree for most users. In this picture, you would use rustix sparingly, to do specific low-level stuff which is not exposed by std.

> I imagined we'd be doing this just for linux-none, and not changing any other target

In some rare cases it makes sense to use raw syscalls on non-none targets. For example, with the aforementioned io-uring crate, most of the application's work is done by 3 syscalls, but you may want to retain the ability to link to shared C libraries.

To reiterate my point: you either trust developers to use sharp tools responsibly (note that rustix itself is not the dullest tool), or you try to promote your vision using crates under your control. I understand if you choose the latter, but then expect developers to act accordingly in response.

joshtriplett commented 5 months ago

@newpavlov wrote:

> We could use getrandom_uninit exposed by rustix, but rustix is a pretty sizable dependency which pulls several other dependencies, pulling it for 1 simple syscall looks a bit excessive.

Would it help if you had feature flags for rustix that minimize how much of it you need to pull in to make this call?

newpavlov commented 5 months ago

I don't think it's possible to do it in a reasonable way; you effectively would need to cut almost everything. And even after that, rustix will still be a big dependency, which has to be properly reviewed.

notgull commented 5 months ago

I think a raw syscall crate would be reasonable, under the following conditions:

newpavlov commented 5 months ago

> There is some static analysis component.

I don't think it should be done in the same crate which defines the raw syscall functions/macros. But it could be a good feature for linux-raw-sys or for a crate which depends on both linux-raw-sys and the raw syscall crate. Also, defining wrapper functions for each known syscall with "proper calling conventions" is very close to what is already done by rustix.