rust-lang / rust


Tracking Issue for strict_provenance #95228

Open Gankra opened 2 years ago

Gankra commented 2 years ago

Feature gate: #![feature(strict_provenance)]

read the docs

get the stable polyfill

subtasks

This is a tracking issue for the strict_provenance feature. This is a standard library feature that governs the following APIs:

IMPORTANT: This is purely a set of library APIs to make your code more clear/reliable, so that we can better understand what Rust code is actually trying to do and what it actually needs help with. It is overwhelmingly framed as a memory model because we are doing a bit of Roleplay here. We are roleplaying that this is a real memory model and seeing what code doesn't conform to it already. Then we are seeing how trivial it is to make that code "conform".

This cannot and will not "break your code" because the lang and compiler teams are wholly uninvolved with this. Your code cannot be "run under strict provenance" because there isn't a compiler flag for "enabling" it, although it would be nice to have a lint to make it easier to quickly migrate code that wants to play along.

This is an unofficial experiment to see How Bad it would be if Rust had extremely strict pointer provenance rules that require you to always dynamically preserve provenance information. Which is to say if you ever want to treat something as a Real Pointer that can be Offset and Dereferenced, there must be an unbroken chain of custody from that pointer to the original allocation you are trying to access using only pointer->pointer operations. If at any point you turn a pointer into an integer, that integer cannot be turned back into a pointer. This includes usize as ptr, transmute, type punning with raw pointer reads/writes, whatever. Just assume the memory "knows" it contains a pointer and that writing to it as a non-pointer makes it forget (because this is quite literally true on CHERI and miri, which are immediate beneficiaries of doing this).

A secondary goal of this project is to try to disambiguate the many meanings of ptr as usize, in the hopes that it might make it plausible/tolerable to allow usize to be redefined to be an address-sized integer instead of a pointer-sized integer. This would allow for Rust to more natively support platforms where sizeof(size_t) < sizeof(intptr_t), and effectively redefine usize from intptr_t to size_t/ptrdiff_t/ptraddr_t (it would still generally conflate those concepts, absent a motivation to do otherwise). To the best of my knowledge this would not have a practical effect on any currently supported platforms, and just allow for more platforms to be supported (certainly true for our tier 1 platforms).

A tertiary goal of this project is to more clearly answer the question "hey what's the deal with Rust on architectures that are pretty harvard-y like AVR and WASM (platforms which treat function pointers and data pointers non-uniformly)". There is... weirdness in the language because it's difficult to talk about "some" function pointer generically/opaquely and that encourages you to turn them into data pointers and then maybe that does Wrong Things.

The mission statement of this experiment is: assume it will and must work, try to make code conform to it, smash face-first into really nasty problems that need special consideration, and try to actually figure out how to handle those situations. We want the evil shit you do with pointers to work but the current situation leads to incredibly broken results, so something has to give.

Public API

This design is roughly based on the article Rust's Unsafe Pointer Types Need An Overhaul, which is itself based on the APIs that CHERI exposes for dynamically maintaining provenance information even under Fun Bit Tricks.

The core piece that makes this at all plausible is pointer::with_addr(self, usize) -> Self which dynamically re-establishes the provenance chain of custody. Everything else introduced is sugar or alternatives to as casts that better express intent.

More APIs may be introduced as we explore the feature space.

// core::ptr
pub fn invalid<T>(addr: usize) -> *const T;
pub fn invalid_mut<T>(addr: usize) -> *mut T;

// core::pointer
pub fn addr(self) -> usize;
pub fn with_addr(self, addr: usize) -> Self;
pub fn map_addr(self, f: impl FnOnce(usize) -> usize) -> Self;
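As a rough illustration of how these methods are meant to replace ptr-int-ptr roundtrips, here is a minimal sketch of a tagged pointer. The free functions `addr`, `with_addr`, and `map_addr` below are hypothetical stand-ins for the proposed methods so that the sketch does not need the feature gate; building `with_addr` on `wrapping_offset` is an assumption about one way to express it with pointer->pointer operations only, and `TAG`, `tag`, `untag`, `is_tagged`, and `roundtrip` are names invented for the example.

```rust
// Hypothetical stand-ins for the proposed pointer methods, written as free
// functions so the sketch does not depend on the unstable feature gate.

/// Address of the pointer as a plain integer (no provenance attached).
fn addr<T>(p: *mut T) -> usize {
    p as usize
}

/// New pointer with address `new_addr`, derived from `p` by a
/// pointer-to-pointer operation so the chain of custody is never broken.
fn with_addr<T>(p: *mut T, new_addr: usize) -> *mut T {
    let offset = (new_addr as isize).wrapping_sub(addr(p) as isize);
    p.cast::<u8>().wrapping_offset(offset).cast::<T>()
}

/// Sugar: compute the new address from the old one.
fn map_addr<T>(p: *mut T, f: impl FnOnce(usize) -> usize) -> *mut T {
    with_addr(p, f(addr(p)))
}

// The lowest address bit is free for any type with alignment >= 2.
const TAG: usize = 0b1;

fn tag<T>(p: *mut T) -> *mut T {
    map_addr(p, |a| a | TAG)
}

fn untag<T>(p: *mut T) -> *mut T {
    map_addr(p, |a| a & !TAG)
}

fn is_tagged<T>(p: *mut T) -> bool {
    addr(p) & TAG != 0
}

/// Stores a tag in a pointer, strips it again, and reads through the result.
fn roundtrip(value: u32) -> (bool, u32) {
    let mut slot = value;
    let p: *mut u32 = &mut slot;
    let tagged = tag(p);
    let flagged = is_tagged(tagged);
    // SAFETY: clearing the tag restores the original aligned address, and the
    // pointer is still derived from `p`, which points at the live `slot`.
    let read = unsafe { *untag(tagged) };
    (flagged, read)
}
```

The point of the sketch is that the integer only ever carries the address bits; the pointer that is finally dereferenced is always derived from the original pointer, never conjured from a `usize`.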

Steps / History

Unresolved Questions

fu5ha commented 2 years ago

(I'd love an answer to the latter question as well that's more satisfying than "here's this weird C example involving pointer comparisons and out-of-bounds accesses that doesn't seem comparable to anything real-world code should ever actually do", which is how the int y, x, *p = &x + 1 example always seems to me.)

I think that's basically this post, which is linked in the OP, though admittedly not prominently https://www.ralfj.de/blog/2020/12/14/provenance.html

pcwalton commented 2 years ago

Well, to be honest, the problem is that pointer provenance bugs are empirically very rare. Which is why it hasn't been properly solved in C and C++ to begin with.

My worry is very practical: we have a lot (I'm not permitted to say how much, but a lot) of unsafe Rust code that does not conform to Stacked Borrows, much less to strict provenance, that we are not going to be able to rewrite. I can't imagine that we are the only ones in that situation. Regardless of what the 1.0 compatibility promise says, if you break unsafe code too much you are effectively creating a Rust 2.

RalfJung commented 2 years ago

(I'd love an answer to the latter question as well that's more satisfying than "here's this weird C example involving pointer comparisons and out-of-bounds accesses that doesn't seem comparable to anything real-world code should ever actually do", which is how the int y, x, *p = &x + 1 example always seems to me.)

Part of the problem is that real-world code does a lot of the pieces of this (one-past-the-end pointers are fairly common), and if we want a compiler that's always correct (modulo bugs) we can't just discount such counterexamples merely on the basis that real-world code doesn't look like this. Real-world streets also don't look like the Moose test and yet we better make sure our everyday cars pass it.

I wish I knew how many of the weird segfaults in C/C++ where people just shrug and move on, doing random code mutation until it works, are caused by issues like this... if a provenance bug affects real-world code, I have serious doubts anyone would be able to tell. Compiler devs would just assume the code has UB, and developers would assume the compiler has a bug, and everyone moves on.

pcwalton commented 2 years ago

I guess what I'm arguing here is that a lot of thought needs to be given to how this can become opt-in, because it seems to me that strict provenance (and probably stacked borrows) needs to be opt-in, with full compatibility with opt-out code. Much like strict-aliasing effectively is in C. If this is a purely opt-in semantics, then I don't have any objections to this work.

Gankra commented 2 years ago

Patrick, as someone who actually teaches the language to people I know for a fact that MANY "normal" programmers are constantly frustrated with unsafe Rust code because it is literally impossible for us to tell them how to write it correctly. I tried! I wrote a book on it! Two books! And documented the shit out of the standard library! And wrote tons of articles explaining the design of the language! And people are still struggling to deal with unsafe!

The raison-d'etre of Rust is that it helps you write correct and safe code. The ability to drop down to unsafe code and extend the language is a HUGE part of that! But if you go into it with a desire to write correct code, you know, the core mindset of Rust... we just send you a shrug emoji and chill vibes.

To be blunt, the design of Rust 1.0 completely ignored unsafe Rust and left it in a complete disaster state. I have repeatedly tried to make it more ergonomic and coherent at the language or compiler level and every time I do I have gotten rebuffed with "no we should do something magic instead" and then that never materializes because the lang team has no interest in unsafe rust.

So honestly I am just deeply frustrated by of course, a former lang and compiler dev responding with "no we should just do something magic instead" even when I am actively avoiding specifying any language or compiler changes. I am literally just adding new stdlib APIs that make it easier to write unsafe rust code that expresses intent and helps ensure your code is obviously correct. Literally! Optional APIs! That are unstable! With tons of documentation explaining how they help! And I am still! Being told! To fuck off! And wait for magic! That you will not work on!

fu5ha commented 2 years ago

I tried! I wrote a book on it! Two books! And documented the shit out of the standard library! And wrote tons of articles explaining the design of the language! And people are still struggling to deal with unsafe!

The raison-d'etre of Rust is that it helps you write correct and safe code. The ability to drop down to unsafe code and extend the language is a HUGE part of that! But if you go into it with a desire to write correct code, you know, the core mindset of Rust... we just send you a shrug emoji and chill vibes.

Just want to echo this. I consider myself a fairly competent Rust programmer, and until recently have used unsafe moderately competently at a surface level. As I got into the depths of ""real"" unsafe code recently with the intent of verifying and testing safety assumptions in our biggest Rust project at Embark, I've run into several of the roadblocks described above, and a fair amount of the confusion is root-caused by the stuff this initiative is trying to address. As of now,

programmers are constantly frustrated with unsafe Rust code because it is literally impossible for us to tell them how to write it correctly

could not be more true, and I echo the feelings of the disembodied 'programmer'. This is quite sad I think, and it's absolutely worth the effort to try to improve this. I now write unsafe code with the resigned feeling that the code I'm writing is best-effort and works, but is very much not "correct" to the same standard that we try to write Rust code in general with.

RalfJung commented 2 years ago

Making it opt-in basically means having two Rust dialects, and having a subset of libraries not compatible with one of the libraries. That's an ecosystem split. It sounds pretty terrible...

That said, I could totally imagine a compromise where code that violates whatever provenance ends up being is treated as technically non-conformant, but there is still some best-effort attempt to keep that code working -- in a dialog between programmer and compiler devs, possibly adjusting the code or the compiler or both. No formal spec will cover it and it'll never be certified, but we don't have to treat all UB equal, and we don't have to close all reports of "code with UB misbehaves" as wont-fix; we can try to work together to find solutions that work for everyone. EDIT: Actually I think a better way to describe the situation is what I wrote here.

I for one would be very curious if you think that your large codebase could have been written to be conformant with strict provenance if the developers would have been aware of it at the time. IOW, is the issue just one of legacy code or is there an expressivity gap?

pcwalton commented 2 years ago

I'm not telling anyone to go away and wait for magic. I'm just saying that you can't break the tons of unsafe code that exists in the wild. There's plenty of empirical evidence that almost nobody is writing unsafe code properly. Our choices are (1) do a Rust 2.0 to fix it; (2) opt-in changes to address things and messy fixes that are not what we would like. There is no in-between. Breaking changes are effectively a Rust 2.0.

The lead-up to Rust 1.0 was fine and wasn't a complete disaster. Not any more than any other software 1.0 project is, anyway. There were things that had to be sacrificed; a complete and coherent memory model was one of them. Delaying Rust 1.0 to the invention of CHERI wasn't realistic.

RalfJung commented 2 years ago

There's plenty of empirical evidence that almost nobody is writing unsafe code properly.

Dunno, I have to say overall Miri results are pretty encouraging to me. I think Rust is in a much better spot than C/C++ to actually tackle this issue, and we should use that chance rather than throw up our hands and give up. Sure, mistakes were made, but if we work together we can find ways to overcome them.

Computers don't have to be terrible, if we don't make them terrible.

pcwalton commented 2 years ago

Making it opt-in basically means having two Rust dialects, and having a subset of libraries not compatible with one of the libraries. That's an ecosystem split. It sounds pretty terrible...

I don't see it as worse than editions. And besides, you already have an ecosystem split, if you introduce changes that break unsafe code.

I for once would be very curious if you think that your large codebase could have been written to be conformant with strict provenance if the developers would have been aware of it at the time. IOW, is the issue just one of legacy code or is there an expressivity gap?

The feedback I have heard is that stacked borrows is too complex for developers to grasp on a large scale. I don't personally have enough information to evaluate that claim, though.

RalfJung commented 2 years ago

The feedback I have heard is that stacked borrows is too complex for developers to grasp on a large scale. I don't personally have enough information to evaluate that claim, though.

Yeah that's why I was asking about strict provenance, which is rather simple (I think).

pcwalton commented 2 years ago

I mean, what I can tell you is that if you break unsafe code, then I'm going to be on the hook for maintaining internal changes to our local fork of rustc that reverts your changes. Which is a thing I can totally do, I don't mind doing it. But I question whether having major organizations maintain their own internal forks of the language is the road that the project wants to go down.

vexx32 commented 2 years ago

What part of this proposes to break existing unsafe code? I'm not seeing anything here that proposes to break code that currently works.

pcwalton commented 2 years ago

Breaking the assumption that usize can be cast to and from pointers breaks unsafe code.

RalfJung commented 2 years ago

I mean, what I can tell you is that if you break unsafe code, then I'm going to be on the hook for maintaining internal changes to our local fork of rustc that reverts your changes. Which is a thing I can totally do, I don't mind doing it. But I question whether this is the road that the project wants to go down.

Not sure if it was a reply to my question, but it doesn't really answer it... we know there is code out there that does ptr-int-ptr roundtrips. We also know a lot of it does that for lack of an alternative API to do the thing it wants to do. So we want to figure out how much of that code could be written with better APIs instead.

It would be really valuable feedback from your organization if we would learn about whether your code can be expressed in terms of the strict provenance APIs or not. This will help define a memory model for Rust that actually is reliable and works for as many people as possible. Feedback from people that excessively use ptr-int-ptr roundtrips is particularly helpful.

Just refusing to even consider whether any alternative API might cover the needs of unsafe Rust code just as well, however, is not very helpful I am afraid.

RalfJung commented 2 years ago

View this as a data-gathering experiment. It is even quite explicitly described as such at the top of this very thread. I am rather surprised that gathering data is met with fierce opposition. No decisions have been made yet! But I hope we all agree that some decision needs to be made at some point, and that if we gather more data we'll probably make a better decision.

It sounds like you do have some interesting data, we'd be happy to hear about it. :)

pcwalton commented 2 years ago

If there's a checking tool via miri or something, I'd be happy to provide feedback as to how the porting process of some of our internal tools goes.

I'm just saying that it's not feasible to port all of it. It's just an unfortunate reality.

comex commented 2 years ago

So honestly I am just deeply frustrated by of course, a former lang and compiler dev responding with "no we should just do something magic instead" even when I am actively avoiding specifying any language or compiler changes. I am literally just adding new stdlib APIs that make it easier to write unsafe rust code that expresses intent and helps ensure your code is obviously correct. Literally! Optional APIs! That are unstable!

You're not specifying any specific language or compiler changes, but you're explicitly opening the door to a future where the compiler optimizes based on some hypothetical model that is at least somewhat stricter than what we have today. To that extent, the APIs are not truly optional.

RalfJung commented 2 years ago

If there's a checking tool via miri or something, I'd be happy to provide feedback as to how the porting process of some of our internal tools goes.

Miri with -Zmiri-tag-raw-pointers will (effectively) implement strict provenance, but also still enforces the rest of Stacked Borrows. I can look into having a flag that enforces strict provenance without Stacked Borrows.

I was thinking maybe it is possible to at least roughly evaluate this without actually doing even a partial port (by considering whether the ptr-int-ptr roundtrips fit the patterns mentioned in the docs, like tagged pointers), but maybe that is not feasible.

thomcc commented 2 years ago

I think the concern some folks (TBH, myself included) have is that the FAQ states:

I am not saying we are going to break the world right now, but we should explore how bad breaking the world is

But the evaluation criteria is based on how hard (or possible) the code is to rewrite to this model, which assumes the code will be rewritten, but in many cases it won't.

eddyb commented 2 years ago

@joshtriplett wrote, in https://github.com/rust-lang/rust/issues/95228#issuecomment-1084008773:

a FAQ entry for "Why don't compiler backends like LLVM just stop doing this provenance thing entirely so we don't have to track it?" I'd expect that to be a common question, right after "what is provenance and why is it a thing?".

As far as I can tell, the answer to the former question is something roughly like "We don't know a good way to do that.

My understanding is that you have something similar to wasm then (flat address space, unoptimizable memory-wise).

(EDIT: I've hid some wasm details in here, it was too confusing before, click to open)
Wasm is even a bit of an extreme, with how deterministic it is (though I do not think you could *soundly* make it optimizable just by e.g. adding ASLR). The way wasm is defined means you can e.g. cryptographically hash the entire address space, heap and stack included, and assert a specific behavior, before and after executing the wasm code itself. (Okay maybe some implementations will do funny things with floats, but at least the integer side should behave like this.) In wasm, you can optimize the SSA values, but not what ends up in memory. All of memory is a deterministic array. The stack of local variables with addresses taken to them? You can't change anything about their layout or what values are written. So if they weren't optimized *before* emitting wasm, they're *frozen forever*.

"Provenance" as a whole isn't some arbitrary choice of a model, it's what we call the concept of having "disjoint memory allocations" and pointers that are too dynamic for us to know that they point into one specific allocation (I suppose "alias analysis" is also brought up but that can confuse matters, esp. with C's TBAA and whatnot).

Lately I've come to consider "provenance" near-synonymous with "pointer". What is a pointer but a dynamic name for a location in memory? Which you must still reason about, abstractly?

"Which allocation could a (dynamic) write touch" is a fundamental question in optimizing memory accesses. If a pointer you got from someone else, when written to, could touch your own stack variables, how could those stack variables ever be optimized?

The UB is simply a matter of the "negative space" viewpoint of the invariants in question: you can only touch a memory allocation if there's a "chain of custody" of pointers to it. This is the simplest way to have meaningfully disjoint "memory allocations". (With the additional stuff like ptr2int2ptr as extensions which can be e.g. inefficiently emulated)

If you start from "disjoint memory allocations", then the fundamental operation of accessing memory takes a "memory allocation" and an offset - a dynamic pointer is then logically (alloc: MemoryAlloc, offset: usize) and the alloc.base + offset flat address merely coincides with the "machine-level" representation on some targets.

Both miri and CHERI do an excellent job at materializing this, and though they differ in certain aspects (CHERI has some granularity that Rust only has with Stacked Borrows, but only miri hides "metadata" etc.), they overlap in one important way: you can't pretend the "memory allocation" part isn't there without emulation (i.e. ptr2int2ptr requires a global map of "leaked as integer" pointers).
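To make the (alloc, offset) picture concrete, here is a toy model of it. This is only a sketch under stated assumptions, not how rustc, Miri, or CHERI are implemented: `Pointer`, `Allocation`, `Memory`, and all method names are invented for the illustration, and the "global map of leaked pointers" is reduced to an `exposed` flag per allocation.

```rust
/// A pointer is logically "which allocation" plus "where inside it".
#[derive(Clone, Copy, Debug, PartialEq)]
struct Pointer {
    alloc: usize,  // index of the only allocation this pointer may access
    offset: usize, // byte offset inside that allocation
}

struct Allocation {
    base: usize,    // flat "machine" address of the first byte
    bytes: Vec<u8>, // contents
    exposed: bool,  // has a pointer into it been leaked as an integer?
}

#[derive(Default)]
struct Memory {
    allocs: Vec<Allocation>,
    next_base: usize,
}

impl Memory {
    fn allocate(&mut self, size: usize) -> Pointer {
        let base = self.next_base + 16; // leave a gap between allocations
        self.next_base = base + size;
        self.allocs.push(Allocation { base, bytes: vec![0; size], exposed: false });
        Pointer { alloc: self.allocs.len() - 1, offset: 0 }
    }

    /// Pointer->pointer operation: the allocation is carried along unchanged.
    fn offset(&self, p: Pointer, by: usize) -> Pointer {
        Pointer { offset: p.offset + by, ..p }
    }

    /// The flat address is derived data, not the identity of the pointer.
    fn addr(&self, p: Pointer) -> usize {
        self.allocs[p.alloc].base + p.offset
    }

    fn write(&mut self, p: Pointer, value: u8) -> Result<(), &'static str> {
        match self.allocs[p.alloc].bytes.get_mut(p.offset) {
            Some(byte) => {
                *byte = value;
                Ok(())
            }
            None => Err("access outside the pointer's own allocation"),
        }
    }

    fn read(&self, p: Pointer) -> Result<u8, &'static str> {
        self.allocs[p.alloc]
            .bytes
            .get(p.offset)
            .copied()
            .ok_or("access outside the pointer's own allocation")
    }

    /// ptr2int: remember that this allocation has been leaked as an integer.
    fn expose(&mut self, p: Pointer) -> usize {
        self.allocs[p.alloc].exposed = true;
        self.addr(p)
    }

    /// int2ptr emulation: only exposed allocations can be recovered.
    fn from_exposed(&self, addr: usize) -> Option<Pointer> {
        self.allocs.iter().enumerate().find_map(|(i, a)| {
            (a.exposed && addr >= a.base && addr < a.base + a.bytes.len())
                .then(|| Pointer { alloc: i, offset: addr - a.base })
        })
    }
}
```

In this model a pointer derived from one allocation can never touch another, no matter what offset is applied, and turning an integer back into a pointer only works by searching the set of allocations that were explicitly exposed.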


That wasn't as compact or smooth as I'd hoped, but those are at least my thoughts anyway, having been exposed to CHERI in the past few months. The main thing I wanted to push back on is "we don't know".

In fewer words: I'm not aware of any room for there to even be anything between "one allocation" (like wasm) and "arbitrarily many allocations", that can describe e.g. stack variables (as opposed to coarser segmentation). "Provenance" is merely the "word that lost a bet" and ended up used as a blanket term for the consequences.

pcwalton commented 2 years ago

Miri with -Zmiri-tag-raw-pointers will (effectively) implement strict provenance, but also still enforces the rest of Stacked Borrows. I can look into having a flag that enforces strict provenance without Stacked Borrows.

Thanks, that's helpful. I'll bring up the possibility of testing some of our code with miri internally and let y'all know.

khionu commented 2 years ago

A lot of this conversation is about Strict Provenance as it may be in a final state. Gankra is being careful to make sure this is an iterative exploration. I'd like to encourage everyone to not get too far ahead of the state of the issue.

RalfJung commented 2 years ago

You're not specifying any specific language or compiler changes, but you're explicitly opening the door to a future where the compiler optimizes based on some hypothetical model that is at least somewhat stricter than what we have today. To that extent, the APIs are not truly optional.

They are definitely 100% optional right now.

The lingering concern here seems to be that they might become mandatory in the future. That's fair. But I think gathering this data is still valuable; even if ptr-int-ptr roundtrips end up still working (which was never officially blessed AFAIK but always understood by everyone -- including myself -- to be okay[1]) we'll end up with better APIs for many cases, I think.

I would totally understand the outrage if someone would make a first move towards actually mandating anything like this. And maybe some people think this is such a move. It is not. I assure you, the intent of this experiment is not to "slowly boil the frog" and sneakily introduce extra rules slowly without anyone noticing.

These APIs will not become mandatory without a huge amount of discussion that y'all will definitely hear about. And I don't think even starting those discussions is anywhere close to being on the table for quite a long time.

[1]: Heck I spent a lot of effort making them work well in Miri!

khionu commented 2 years ago

Another gentle reminder I hope everyone can keep in mind, this is not something that is going to change overnight. We have time to talk about this and ensure all concerns are given twice the consideration they deserve.

pcwalton commented 2 years ago

In fewer words: I'm not aware of any room for there to even be anything between "one allocation" (like wasm) and "arbitrarily many allocations", that can describe e.g. stack variables (as opposed to coarser segmentation).

But there is something in between: what existing C, C++, and Rust compilers do. Yes, that memory model is formally inconsistent. But it has also been extremely successful--so successful that it isn't practically possible to go back to one of the two extremes. I'm much more interested in figuring out how to bring the model that we have to some sort of more correct state, which will undoubtedly be messy and inconsistent, because we're stuck with that model no matter what.

RalfJung commented 2 years ago

After thinking about this some more, I think I found a good way to describe this effort and why it is not a threat to existing unsafe Rust code.

I imagine that we might one day have a "spec for Rust that conforms to strict provenance". That fragment of Rust is a lot easier to specify than "full" Rust with ptr-int-ptr roundtrips. I would argue it is better to have a precise spec for parts of Rust than to have no precise spec at all because it is held back by this one nasty issue. Whether that is true depends on how much Rust "out there" falls into the fragment of "conforms with strict provenance", which is a question this experiment aims at answering.

Like all actual experiments, some of the next steps depend on its results!

In other words, I imagine a future situation where ptr-int-ptr roundtrips are not UB, but they are "outside of the fragment of the language that we understand properly" and the UCG WG will use a lot of hedging if you ask too probing questions about what you can and cannot do with them. That actually means they are literally no worse off than today! However, everyone who can avoid them (e.g. through these new shiny APIs) is much better off than today as they actually have a precise spec.

The goal is not to make things any worse than they currently are for existing unsafe code. The goal is to make things better for the fragment of unsafe code that doesn't need the most wild aspects of unsafe programming, and to bring more and more unsafe code into that fragment. And then who knows maybe one day we'll crack ptr-int-ptr roundtrips and all of unsafe will live in the glorious land of "having a precise spec". :D

I hope this puts people with large existing unsafe Rust codebases at ease. :)

joshtriplett commented 2 years ago

@Gankra

that never materializes because the lang team has no interest in unsafe rust

FWIW, several of us on the lang team care a great deal about unsafe Rust, myself included. This specific issue has been and continues to be a massive challenge to get right. It has come up numerous times, and it's still not clear what the best solution is. Every path has massive tradeoffs.

With my lang team hat on: thank you for trying this experiment, and I'm very interested to see it.

eddyb commented 2 years ago

But there is something in between: what existing C, C++, and Rust compilers do. Yes, that memory model is formally inconsistent. But it has also been extremely successful--so successful that it isn't practically possible to go back to one of the two extremes. I'm much more interested in figuring out how to bring the model that we have to some sort of more correct state, which will undoubtedly be messy and inconsistent, because we're stuck with that model no matter what.

I'm sorry, but I feel like my comment didn't properly get across. I was replying to "are there alternatives to provenance" with "no" (with only minor caveats). Not strict provenance, but any provenance. You're talking about a specific provenance model.

RalfJung commented 2 years ago

I was replying to "are there alternatives to provenance" with "no" (with only minor caveats).

FWIW I disagree. You won't get restrict or anywhere close to Stacked Borrows without provenance (i.e., no fancy aliasing rules, and no competing with Fortran performance), but there is a huge gap between wasm and what you can do by exploiting allocator non-determinism. (However, this does require accepting a fully non-deterministic allocator, which low-level people also find hard to stomach sometimes. In particular those that implement allocators. :joy: )

But that takes us too far for this thread. I created a topic on Zulip.

eddyb commented 2 years ago

I was replying to "are there alternatives to provenance" with "no" (with only minor caveats).

FWIW I disagree. You won't get restrict or anywhere close to Stacked Borrows without provenance (i.e., no fancy aliasing rules, and no competing with Fortran performance), but there is a huge gap between wasm and what you can do by exploiting allocator non-determinism. (However, this does require accepting a fully non-deterministic allocator, which low-level people also find hard to stomach sometimes. In particular those that implement allocators. :joy: )

But that takes us too far for this thread. I created a topic on Zulip.

I replied more on Zulip to that specific concern, but overall, I likely caused two kinds of confusion, sadly:

skade commented 2 years ago

I want to share a couple of observations and views on this discussion:

1) I think it's inappropriate to dismiss features like CHERI as "niche" for the sake of a discussion. Everyone is informed by the space they work in. In the space I am currently in - high assurances - CHERI is a deciding factor for choosing new hardware platforms. Can I show you the documents? No, so it's claim against claim. Also, CHERI is new and we - as still a niche language! - should know how large niches can be and how quickly they can grow.

2) The same goes for vague comments about internal codebases and the complexity to apply certain patterns to them. If you want to discuss them, minimise to an example and extract them for public discussion. This is expected behaviour in many engineering circles. I expect this work of everyone who wants to sit at this table.

3) I'm worried about ascribing intent and interests to certain teams. I'm sure there's a lot of frustration that a team doesn't cover the particular need one cares about. I have a laundry list of those things. But I am aware that "the team cares" and "it currently has different priorities and only limited time" can both be true at the same time and as long as I'm not on the team, I should be very careful to casually throw opinions around.

Finally, I want to remind some in this discussion that both you and your employers are large names in the Rust ecosystem. Casually threatening certain actions is inappropriate and will reflect on either your or your employer's name. We already have the meme "corporations own Rust" out there and anyone casually using their employer's size gives this meme more weight. You earned your name and your jobs, but with that comes great responsibility.

That being said, from the perspective of someone currently qualifying Rust for high assurances and having trained Rust for 7 years now: a stricter pointer model is very much desirable, because the current one is indeed extremely messy to the point where validation by looking at assembly is a recommended chore. The current one is not easy to apply and review. The ease of applying a new model is a concern, but from my perspective a lesser one, because those implementations will be costly anyway. A stricter model also allows for better automated tooling. The maintenance of existing codebases from our perspective is of little concern, as the ecosystem is not yet built - so having such a model sooner rather than later is of interest to us.

CHERI support is indeed something we'd like to see covered, it's a major gap.

programmerjake commented 2 years ago

one thing to maybe add to the list of stuff MMIO pointers should be able to do: use the MMIO address space as backing memory for a memory allocator...e.g. some embedded platforms have special sram blocks at specific addresses...also most modern GPUs can map their video memory into specific regions of cpu-visible address space, e.g. via PCIe BAR. i think there are definitely cases where video memory read/write shouldn't need volatile, e.g. you pass the pointer to the gpu so it's more like passing the pointer to another cpu thread than it is like writing to a MMIO serial port's transmit register.

marmeladema commented 2 years ago

Leaving aside the memory model discussion, I read the PR and I truly think the newly introduced APIs are better in that they are clearer to read, easier to document and less error prone (since you won't silently be shrinking some pointers to a smaller integer etc).

Even if the strict provenance experiment were to fail in some way, I would very much like to see those APIs stabilized anyway (and cast deprecated), since they allow you to do the same thing, just in a better and more controlled way.
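To make the "silently shrinking" point concrete, here is a minimal sketch. It assumes a toolchain where `addr` and `map_addr` are available on raw pointers without a feature gate (at the time of this thread they required `#![feature(strict_provenance)]` or the polyfill). The helper names `low_byte_via_cast`, `low_byte_via_addr`, `tag` and `untag` are made up for illustration.

```rust
/// Old style: `as` accepts any integer type, so the address is
/// truncated to 8 bits without any hint at the call site.
fn low_byte_via_cast<T>(p: *const T) -> u8 {
    p as u8
}

/// New style: `addr()` always yields a `usize`, so any narrowing
/// has to be written out explicitly.
fn low_byte_via_addr<T>(p: *const T) -> u8 {
    (p.addr() & 0xFF) as u8
}

/// Set the lowest address bit as a tag. `map_addr` changes only the
/// address and keeps the provenance of `p`.
fn tag<T>(p: *const T) -> *const T {
    p.map_addr(|a| a | 1)
}

/// Clear the tag bit again, still without leaving "pointer land".
fn untag<T>(p: *const T) -> *const T {
    p.map_addr(|a| a & !1)
}

fn main() {
    let value: u64 = 42;
    let p: *const u64 = &value;

    // Both helpers compute the same byte; only one makes the truncation visible.
    assert_eq!(low_byte_via_cast(p), low_byte_via_addr(p));

    // A u64 is aligned to at least 2 bytes, so the lowest bit is free for a tag.
    let tagged = tag(p);
    assert_eq!(tagged.addr() & 1, 1);

    // After removing the tag the pointer is valid to read through again.
    let restored = untag(tagged);
    assert_eq!(unsafe { *restored }, 42);
}
```

The tagging round trip never produces a pointer from a bare integer, which is the property the strict provenance experiment asks code to preserve.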

Diggsey commented 2 years ago

Crazy idea: what if we defined optimization levels according to the model which the corresponding optimization passes were compatible with?

So for example, the highest optimization level would run all optimization passes which are valid against the strict provenance model presented here. There could be a lower optimization level which simply excluded all passes which are invalid under the weaker model of PNVI-ae-udi. There could be a lower level still which is basically WASM (no provenance at all).

The compiler could limit the optimization level based on the crates in your crate graph (eg. based on editions). This would motivate the adoption of the new model through the performance and validation benefits possible under the new model without breaking old code, and it would still be possible for users to open miscompilation bug issues under the lower optimization level.

bgeron commented 2 years ago

That's a wonderful idea!

To elaborate: can we opt-in to strict provenance per-crate in Cargo.toml?

# default: true = allow = PNVI-whatever
ptr2int2ptr = true

This would be nice for legacy codebases, sloppy FFI, and other use cases for unrestricted ptr2int2ptr, while limiting the damage.

In my opinion, we can regress a little in performance of ptr2int2ptr = true code, if it means we can simplify rustc. But ideally ptr2int2ptr doesn't break 100%.

edit: made the flag a boolean

RalfJung commented 2 years ago

Strict provenance is a memory model question, which means it is a global choice.

Put differently: if any crate opts out (or does not opt in), the entire program has to be compiled without strict provenance.
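The "global choice" constraint can be sketched as taking the weakest model found anywhere in the crate graph. This is purely illustrative: the `MemoryModel` enum, its three levels (taken from the optimization-level idea above) and `effective_model` are hypothetical names, not anything rustc or Cargo implements.

```rust
/// Hypothetical memory models, ordered from the fewest optimizer
/// assumptions to the most. The derived `Ord` follows declaration order.
#[derive(Debug, Clone, Copy, PartialEq, Eq, PartialOrd, Ord)]
enum MemoryModel {
    /// No provenance at all (the "basically WASM" level).
    NoProvenance,
    /// Permissive model that allows pointer-integer-pointer round trips.
    PnviAeUdi,
    /// The strict model discussed in this issue.
    StrictProvenance,
}

/// The whole program must be compiled under the weakest model that any
/// crate in the graph requires. An empty graph imposes no restriction.
fn effective_model(crates: &[(&str, MemoryModel)]) -> MemoryModel {
    crates
        .iter()
        .map(|&(_, model)| model)
        .min()
        .unwrap_or(MemoryModel::StrictProvenance)
}

fn main() {
    // Hypothetical crate graph: one crate has not opted in to the strict model.
    let graph = [
        ("app", MemoryModel::StrictProvenance),
        ("legacy_ffi", MemoryModel::PnviAeUdi),
        ("utils", MemoryModel::StrictProvenance),
    ];
    for (name, model) in &graph {
        println!("{name}: {model:?}");
    }
    println!("effective model: {:?}", effective_model(&graph));
}
```

Under this reading, a per-crate flag cannot make one crate "strict" in isolation; it can only lower the model for the entire program.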

Diggsey commented 2 years ago

Right, I think we'd still want to be clear that (assuming this all goes through) we think the strict provenance model is the future - that it's something that all crates should aspire to support, but it at least gives us a way to guarantee that existing code can continue to work.

Beyond performance and validation, there's another benefit to this approach: there are certain areas of the Rust abstract machine that aren't fully figured out yet (eg. how FFI interaction is modelled in the Rust abstract machine) that mostly relate to the expectations lower-level code can have when we map the abstract machine onto a concrete architecture, and these questions are a lot easier to answer in the weaker memory models, so it would provide a path to begin using Rust for these areas before we've fully answered those questions (at a performance cost) and that usage could actually help us figure out what's important when answering those questions in the stricter memory models (again with the goal to eventually support every use-case in the strict provenance model).

Finally, I think it would be nice to have a more formal definition for what constitutes an optimization level, rather than individually toggling passes as is frequently done in C/C++.

bstrie commented 2 years ago

Please note that tracking issues are not for general discussion. Github issues are infamously, notoriously poor at hosting long and branching discussions. I ask that people please use the t-lang/-unsafe-code-guidelines Zulip channel for further discussion. For ease of reference, here again are the existing topics relevant to this discussion; please make a new one if none of these suffice:

As a favor to all of our future selves I will be presumptuously limiting further discussion here to collaborators; please keep further comments focused on issues directly relevant to the tracking issue. Open new issues here in the issue tracker if you have a specific issue that needs to be addressed.

Gankra commented 2 years ago

I have updated the initial comment to clarify that this is a set of library APIs and that there cannot be any implications for lang/compiler semantics because those teams are wholly uninvolved. To the extent that a memory model is "proposed" it is to provide a Narrative Framing for why you want to use these APIs. The experiment is voluntary "memory model roleplay" to encourage users to make their intent clear, so that we can understand what they are having trouble expressing with "proper" pointer APIs.

Gankra commented 2 years ago

Also I will no longer be helping with this experiment.

I wrote the APIs. I wrote the docs. I made a stable polyfill so everyone can experiment with them right away. I migrated the stdlib/compiler to mostly comply with these "rules" to prove that they're relatively easy to comply with (and that most code does already). I listened to everyone's concerns and facilitated discussion, figured out the biggest issues, and filed subtasks for them.

I have done everything I can, it is now up to the larger rust ecosystem and domain-specific stakeholders to figure out to what extent this is "useful" and what needs to be done to fill in the semantics gaps.

bstrie commented 2 years ago

Relevant to this issue is a new proposed option in Miri for executing Rust code under strict provenance: https://github.com/rust-lang/miri/pull/2045

RalfJung commented 2 years ago

https://github.com/rust-lang/rust/pull/95588 expands the APIs and docs to also better explain what happens with non-Strict-Provenance code that does want to do pointer-usize-pointer roundtrips.

mvtec-bergdolll commented 2 years ago

Initially just reading the documentation I found expose_addr very confusing and it took me some time to spot the subtle difference to addr. Expose in the name made me think somehow it was connected to the return type/value. Some alternatives:

mvtec-bergdolll commented 2 years ago

Thinking a bit more about it,

digama0 commented 2 years ago

Using a different word than "provenance" is not a good idea. This is a term-of-art: the fact that it does not have a common use in English is a strength, since it makes it easier to find references to the technical concept online. Using a synonym will only confuse matters, since once you start digging into the concept you will soon have to consult documentation in other sources like LLVM or the C specification, and the word "provenance" is used there.

The same thing applies to "expose": I first used the term "broadcast" for this operation, but I switched to "expose" to match usage of the word in C. It's a technical term so there is no getting around the fact that users will have to read the documentation to understand the difference, especially since it's not observably different from addr() except in exotic optimization scenarios.
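The difference between the two operations can be sketched as follows. One assumption to flag loudly: the names discussed in this thread (`expose_addr` and its counterpart) were later renamed, and the sketch uses the names stabilized in Rust 1.84, `expose_provenance` and `std::ptr::with_exposed_provenance_mut`. The two wrapper functions are hypothetical.

```rust
use std::ptr;

/// Round trip through an integer using the "exposed" APIs. The call to
/// `expose_provenance` marks the provenance as exposed, which is what
/// allows a pointer to be rebuilt later from the bare address.
fn roundtrip_exposed(x: &mut u32) -> u32 {
    let p: *mut u32 = x;
    let a: usize = p.expose_provenance();
    let q: *mut u32 = ptr::with_exposed_provenance_mut(a);
    // SAFETY: `q` has the address of `x`, and the provenance of `p`
    // was exposed above, so it may be picked up again here.
    unsafe {
        *q += 1;
        *q
    }
}

/// Strict-provenance style: `addr()` returns the same integer but
/// exposes nothing, so the pointer must be rebuilt from the original
/// pointer `p` with `with_addr`.
fn roundtrip_strict(x: &mut u32) -> u32 {
    let p: *mut u32 = x;
    let a: usize = p.addr();
    let q: *mut u32 = p.with_addr(a);
    // SAFETY: `q` carries the provenance of `p` and the same address.
    unsafe {
        *q += 1;
        *q
    }
}

fn main() {
    let mut v = 1u32;
    assert_eq!(roundtrip_exposed(&mut v), 2);
    assert_eq!(roundtrip_strict(&mut v), 3);
}
```

Both functions return the same kind of integer from the pointer, which is why the difference is easy to miss when reading only the signatures; it shows up in which pointer reconstructions are permitted afterwards.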

workingjubilee commented 2 years ago

I think if we come up with new terms of art, that's fine. But I say new because they would have to not also risk semantic collision: leak implies it is the same as other leak operations, like Box::leak, which leak allocations, and it is very not. We cannot afford confusing these matters here, so I think leak is right out.

guess implies the wrong semantics, also, because "guess", in human terms, refers to applying an intelligent heuristic. However, here, provenance-selection can be a very "simple" heuristic: my understanding is that if there is only one exposed provenance, that is the provenance used. It doesn't matter whether that "makes sense".

RalfJung commented 2 years ago

In a sense this "guessing" is actually extremely intelligent -- if there is any correct guess, it will be taken! Here, "correct" is defined as "avoiding UB".

digama0 commented 2 years ago

60% joking: recover_provenance_by_magic(). (I like that this will make people really look askance at any uses of the function in code review.)

ShadowJonathan commented 2 years ago

2c for expose_addr replacement; pluck_addr, "plucking from thin air"