sunfishcode / origin

Program startup and thread support written in Rust

Why is origin relocation so unsafe? #26

Open morr0ne opened 1 year ago

morr0ne commented 1 year ago

Following up on #23, it's still not clear to me exactly why relocation is so hard to implement safely. When building a normal Rust binary that links to libc, I never had to worry about such problems. The obvious conclusion is that libc is doing some magic behind the scenes to relocate when necessary, unless that itself has questionable safety? Does Rust just assume that the libc implementation is good enough? Does the relocation come from somewhere else? Couldn't origin "port" an existing proven implementation? And even then, how much of that is the job of the compiler vs. the dynamic linker vs. libc/crt0? Perhaps I'm asking too many questions, but I'd love to get a better understanding of what's happening at the lower level and the safety implications of that. Is it even possible to make such code safe? Is Rust just blindly trusting that the code it's linking is safe? Maybe such questions are much bigger than origin itself, but I can't really think of another case like this one for something as modern as Rust. I am not sure if questions like this should be asked somewhere else, but searching elsewhere always leads to ancient C code (at least compared to Rust) that essentially doesn't worry about any safety and just trusts the compiler and standard library implementations.

On a final note, this is probably way too long for an issue, and it doesn't even feel like one. I think this would be a better fit for GitHub Discussions, or perhaps some other place like Matrix or Discord, to allow for easier discussion.

sunfishcode commented 1 year ago

Yes, I believe libc implementations are doing roughly the same thing that origin's new experimental-relocate code is. So I think there are a few things going on here.

One is that the existing popular libc implementations have been some of the most widely-used software in the world for many years now, and origin is new and not at all widely used. In comparison, this code in origin is something I just wrote, and so far I have no reason to believe anyone other than me and GitHub Actions has tried to run it, and I only ran a few simple testcases. And relocation processing involves a decent chunk of raw-pointer code, and it unavoidably makes some assumptions about data structures provided by the OS and the ELF producer toolchain that are difficult to comprehensively validate. So odds are, it has bugs. Perhaps over time, if more people use and/or look at this code, we could gain some level of confidence.
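As a rough illustration of what "relocation processing" means here, the following is a minimal sketch of applying relative relocations. It is not origin's code. It assumes the ELF64 `Elf64_Rela` entry layout and the x86-64 type `R_X86_64_RELATIVE` (value 8), and it patches a plain byte slice standing in for the loaded image so that the sketch stays in safe Rust. The names `Rela`, `RelocError`, and `apply_relative` are hypothetical. Real startup code has to do the same arithmetic through raw pointers into its own address space, with offsets it cannot bounds-check this way.

```rust
/// Hypothetical mirror of the ELF64 `Elf64_Rela` entry layout.
#[derive(Clone, Copy, Debug)]
pub struct Rela {
    pub r_offset: u64, // where to patch, relative to the image base
    pub r_info: u64,   // symbol index (high 32 bits) and type (low 32 bits)
    pub r_addend: i64, // constant added to the load bias
}

/// Relocation type meaning "store load bias + addend" on x86-64.
pub const R_X86_64_RELATIVE: u32 = 8;

#[derive(Debug, PartialEq)]
pub enum RelocError {
    /// The 8-byte patch at this offset would fall outside the image.
    OutOfBounds(u64),
    /// A relocation type this sketch does not handle.
    Unsupported(u32),
}

/// Apply relative relocations to `image`, a byte slice standing in for
/// the loaded program (an assumption made so the sketch needs no `unsafe`).
/// Entries are applied in order; processing stops at the first error.
pub fn apply_relative(
    image: &mut [u8],
    load_bias: u64,
    relas: &[Rela],
) -> Result<(), RelocError> {
    for rela in relas {
        // ELF64 keeps the relocation type in the low 32 bits of `r_info`.
        let r_type = (rela.r_info & 0xffff_ffff) as u32;
        if r_type != R_X86_64_RELATIVE {
            return Err(RelocError::Unsupported(r_type));
        }
        // Validate the patch location before touching the image.
        let start = rela.r_offset as usize;
        let end = start
            .checked_add(8)
            .filter(|&end| end <= image.len())
            .ok_or(RelocError::OutOfBounds(rela.r_offset))?;
        // A relative relocation stores the load bias plus the addend.
        let value = load_bias.wrapping_add(rela.r_addend as u64);
        image[start..end].copy_from_slice(&value.to_le_bytes());
    }
    Ok(())
}
```

With a 16-byte image, a load bias of `0x5000`, and one entry with `r_offset: 8` and `r_addend: 0x10`, the last eight bytes of the image end up holding `0x5010` in little-endian order. The bounds check is the part a real implementation cannot easily reproduce, which is one reason the raw-pointer version is hard to validate.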

Another consideration is that libc authors will sometimes rely on the fact that they (or distro maintainers) ship precompiled versions of their libraries, in which case they control what compiler it's compiled with and what options are used, and the precompiled form acts as an optimization boundary. But in Rust, origin is always distributed as source and could get pulled in as a dependency and compiled by arbitrary future Rust versions with arbitrary options for an arbitrary architecture. It could even get LTO enabled, eliminating the optimization boundary. Users might even enable profiling or other interesting debugging instrumentation that injects calls, which may need to be relocated. We have very little insight into the full set of possible things an optimizer might do or code the compiler might generate.

One of my goals for origin is to really avoid the temptation to say "this is Low-Level Code(tm)". I want to write it in rule-abiding Rust all the way down. But, this relocation code breaks rules. It mutates memory that Rust thinks is immutable. It's running in a kind of incompletely-compiled state, where in a sense it's not really even compiled Rust code yet.
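To make "memory that Rust thinks is immutable" concrete, here is a small sketch with hypothetical names. In a position-independent executable, the value of `POINTER` is an absolute address that is only known once the program has been loaded, so a toolchain will typically emit a relative relocation for it and leave the patching to startup code. Exactly which section it lands in and which relocation is emitted depends on the target and linker, so treat those details as assumptions.

```rust
/// An ordinary immutable static.
pub static TARGET: u32 = 5;

/// An immutable static whose *value* is the address of `TARGET`.
/// From Rust's point of view this never changes, yet in a PIE the
/// stored address usually has to be written by relocation code at
/// startup, before anything reads it.
pub static POINTER: &u32 = &TARGET;

/// Reads `TARGET` through `POINTER`. This only works if the
/// relocation for `POINTER` has already been applied.
pub fn read_through_pointer() -> u32 {
    *POINTER
}
```

Calling `read_through_pointer()` in a normally started program returns `5`, because the dynamic linker or startup code has already patched `POINTER`. Code that performs that patching itself must avoid touching anything like `POINTER` until it is done.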

So I'm not comfortable with it yet. But it does work in simple testcases, and it seems reasonable to put it out there as an option for anyone who wants to use it, and maybe over time my understanding of it will evolve.

(I use issue-tracker threads for discussions like this all the time and am entirely comfortable with it. Really long threads do get unwieldy sometimes, but long multi-pronged GitHub discussion threads can get unwieldy too. If you'd be more comfortable with GitHub Discussions, I'd be willing to try them here.)

morr0ne commented 1 year ago

> One is that the existing popular libc implementations have been some of the most widely-used software in the world for many years now, and origin is new and not at all widely used. In comparison, this code in origin is something I just wrote, and so far I have no reason to believe anyone other than me and GitHub Actions has tried to run it, and I only ran a few simple testcases. And relocation processing involves a decent chunk of raw-pointer code, and it unavoidably makes some assumptions about data structures provided by the OS and the ELF producer toolchain that are difficult to comprehensively validate. So odds are, it has bugs. Perhaps over time, if more people use and/or look at this code, we could gain some level of confidence.

I guess if this code gained more traction it might actually be validated further. Right now I think this is a rather niche application, because most users would just assume they have a working dynamic linker and not worry about dynamic relocations. What can be done, however, is adding as many test cases as possible and some kind of fuzzing. Ideally we could adapt test cases from other implementations, which I assume is the case for popular libc implementations. Maybe looking at how redox-os deals with relocations in relibc might prove useful.

> Another consideration is that libc authors will sometimes rely on the fact that they (or distro maintainers) ship precompiled versions of their libraries, in which case they control what compiler it's compiled with and what options are used, and the precompiled form acts as an optimization boundary. But in Rust, origin is always distributed as source and could get pulled in as a dependency and compiled by arbitrary future Rust versions with arbitrary options for an arbitrary architecture. It could even get LTO enabled, eliminating the optimization boundary. Users might even enable profiling or other interesting debugging instrumentation that injects calls, which may need to be relocated. We have very little insight into the full set of possible things an optimizer might do or code the compiler might generate.

I think the best way to deal with all these cases is to check how other projects handle this sort of stuff and test against it. Maybe it might be possible to instruct Rust to avoid certain optimizations? If the solution turns out to be shipping a precompiled version, it might be worth looking into that. One reason I'd like this to be resolved is that I have been experimenting with a Rust-only target, which would need some kind of replacement for the startup normally handled by libc. origin seems like the only project, or at least the only one I am aware of, that would fit the bill. To reach feature parity, dynamic relocation is pretty much required, especially in certain environments where only PIE executables are allowed for security reasons.

> One of my goals for origin is to really avoid the temptation to say "this is Low-Level Code(tm)". I want to write it in rule-abiding Rust all the way down. But, this relocation code breaks rules. It mutates memory that Rust thinks is immutable. It's running in a kind of incompletely-compiled state, where in a sense it's not really even compiled Rust code yet.

I am not sure that is an achievable goal, at least not with that interpretation of "Low-Level Code". Technically speaking, this is the lowest possible code; there's essentially nothing below it besides rustix, which in most cases maps directly to Linux syscalls. Maybe the better solution is to have the relocation reside in an outside crate that can be depended upon? After all, if the goal is to ditch libc, we do need a dynamic linker written in Rust, even if it resides outside of the executable itself.

> So I'm not comfortable with it yet. But it does work in simple testcases, and it seems reasonable to put it out there as an option for anyone who wants to use it, and maybe over time my understanding of it will evolve.

That is entirely fair. In fact, I am extremely glad you decided to dedicate your time to such an endeavor, considering that so far most of what I requested would essentially only apply to a pet project of mine.

> (I use issue-tracker threads for discussions like this all the time and am entirely comfortable with it. Really long threads do get unwieldy sometimes, but long multi-pronged GitHub discussion threads can get unwieldy too. If you'd be more comfortable with GitHub Discussions, I'd be willing to try them here.)

I am quite happy with GitHub issues, especially since they automatically turn into an easy-to-digest email chain. I proposed another means of communication simply because I felt that perhaps the issues I opened recently don't really fit a "bug tracker"-like format but are more of an open-ended discussion about software implementations.

sunfishcode commented 1 year ago

I'm a little unclear about what your concern here is. The code is there now. If you try it out, please report back on how it went. If anyone wants to write tests or fuzzers or read the code or anything else, I'd welcome the help.

> Maybe it might be possible to instruct Rust to avoid certain optimizations? If the solution turns out to be shipping a precompiled version, it might be worth looking into that.

One of my goals for this project is to avoid doing either of these things, because they mean it's not a program written in a language anymore, but just a collection of relatively maintainable sequences of bytes that we're reasonably sure can be turned into functioning executables.

> I am not sure that is an achievable goal, at least not with that interpretation of "Low-Level Code".

Except for this new relocation option, to the best of my knowledge, everything else in origin and rustix does follow the rules, all the way down to the asms, and the asms are minimal.

> Maybe the better solution is to have the relocation reside in an outside crate that can be depended upon?

A crate boundary just shifts the problem somewhere else.

I'm actually starting to wonder if perhaps the best longer-term approach is to write the entire relocate sequence in assembly, that we could run right from the _start entrypoint, so it'd be done before any Rust code runs. It might be a few hundred lines of code or so, per architecture, so it's not something to be done lightly. But, it would let us say that all the Rust code is just plain Rust code.

morr0ne commented 1 year ago

> I'm a little unclear about what your concern here is. The code is there now. If you try it out, please report back on how it went. If anyone wants to write tests or fuzzers or read the code or anything else, I'd welcome the help.

I am gonna take a look at how other projects implemented such things and try to come up with some kind of fuzzing. Also, I'm kind of in a weird spot where I don't think I'm triggering any relocation, so I would need to do so on purpose. For context, I'm experimenting with a very simple init implementation, and so far I'm essentially only checking for pid 1 and trying to create some safe wrappers around fork and around reaping zombie processes.

> One of my goals for this project is to avoid doing either of these things, because they mean it's not a program written in a language anymore, but just a collection of relatively maintainable sequences of bytes that we're reasonably sure can be turned into functioning executables.

I guess that is true, but the only way around it is probably to just use assembly code.

> A crate boundary just shifts the problem somewhere else.

That's kinda what I meant. In a certain sense, this is so experimental that it might benefit from being more isolated than the rest of the code.

> I'm actually starting to wonder if perhaps the best longer-term approach is to write the entire relocate sequence in assembly, that we could run right from the _start entrypoint, so it'd be done before any Rust code runs. It might be a few hundred lines of code or so, per architecture, so it's not something to be done lightly. But, it would let us say that all the Rust code is just plain Rust code.

I'm assuming that's technically possible, but unfortunately I only have basic knowledge of assembly, so I don't think I could provide any input there. To be honest, I'm already having a hard time wrapping my head around the current implementation.

sunfishcode commented 1 year ago

I think https://github.com/sunfishcode/origin/pull/75 is a step toward fixing this. It's not necessary to write the entire relocate function in asm; we can do just the loads and stores that don't follow the Rust memory model in inline asm.
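A minimal sketch of that idea, assuming x86-64 and using hypothetical function names (this is not the code from the pull request): each load and store that touches relocation targets goes through `core::arch::asm!`, so the access happens inside the asm block instead of as an ordinary Rust memory operation. The non-x86-64 fallback uses volatile accesses only so the sketch compiles elsewhere; it does not have the same property.

```rust
/// Store `value` to `ptr` with an explicit machine instruction.
///
/// Safety: `ptr` must be valid for an 8-byte write.
#[cfg(target_arch = "x86_64")]
pub unsafe fn relocation_store(ptr: *mut usize, value: usize) {
    unsafe {
        core::arch::asm!(
            "mov qword ptr [{ptr}], {value}",
            ptr = in(reg) ptr,
            value = in(reg) value,
            // No `nomem`: the compiler must assume this writes memory.
            options(nostack, preserves_flags),
        );
    }
}

/// Load a word from `ptr` with an explicit machine instruction.
///
/// Safety: `ptr` must be valid for an 8-byte read.
#[cfg(target_arch = "x86_64")]
pub unsafe fn relocation_load(ptr: *const usize) -> usize {
    let value: usize;
    unsafe {
        core::arch::asm!(
            "mov {value}, qword ptr [{ptr}]",
            value = out(reg) value,
            ptr = in(reg) ptr,
            // `readonly`: the asm reads memory but does not write it.
            options(nostack, preserves_flags, readonly),
        );
    }
    value
}

/// Placeholder for other architectures so the sketch still compiles.
#[cfg(not(target_arch = "x86_64"))]
pub unsafe fn relocation_store(ptr: *mut usize, value: usize) {
    unsafe { core::ptr::write_volatile(ptr, value) }
}

/// Placeholder for other architectures so the sketch still compiles.
#[cfg(not(target_arch = "x86_64"))]
pub unsafe fn relocation_load(ptr: *const usize) -> usize {
    unsafe { core::ptr::read_volatile(ptr) }
}
```

A relocation loop written in ordinary Rust could then call `relocation_load` and `relocation_store` for each patched word, keeping the amount of per-architecture assembly to a couple of instructions.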