rust-lang / rfcs

RFCs for changes to Rust
https://rust-lang.github.io/rfcs/
Apache License 2.0

Allow (unsafe) static cast from unsafe pointer to static reference #1691

Open alevy opened 8 years ago

alevy commented 8 years ago

There should be some way to (unsafely) cast an unsafe pointer to a 'static reference in static variables.

In scenarios where you want to wrap a set of hardware registers in a struct, it can be very useful to initialize it statically. It's currently almost, but not quite, possible to do this.

For example, if I want to wrap the registers in a struct like:

struct Registers {
  status: VolatileCell<usize>,
  control: VolatileCell<usize>,
  ...
}

It's impossible to initialize the following global variable from a memory mapped I/O address:

static GPIO: &'static Registers;

The two options for casting from a *const T to a &'static T are:

  1. transmute which is not a const-fn and therefore can't be called from a static initializer.
  2. &* which is not allowed in statics because the initial dereference cannot be done at compile time (although the combined operation is clearly possible since it's just a type cast).

There are currently two workarounds I'm aware of. First, you could declare the memory mapped I/O address in an external library (in C, assembly, or maybe a linker script) and import it using an extern block. This has the disadvantage of allowing any crate to import the same symbol, even with different types (it doesn't even require unsafe). The second option is to define a special reference type, e.g.:

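use core::ops::Deref;

// Thin wrapper around a raw pointer that is assumed to be valid for 'static.
struct StaticRef<T>(*const T);

impl<T> StaticRef<T> {
    // Unsafe: the caller must guarantee `ptr` points to a valid `T` for the whole program.
    pub const unsafe fn new(ptr: *const T) -> StaticRef<T> {
        StaticRef(ptr)
    }
}

impl<T> Deref for StaticRef<T> {
    type Target = T;
    fn deref(&self) -> &T { unsafe { &*self.0 } }
}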

(Please don't use the above code uncritically if you come upon this issue; I haven't thought hard about whether this is a safe interface.)
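A minimal sketch of how such a wrapper might be used, under stated assumptions: the address `0x4000_0000`, the `enable` helper, and the use of plain `Cell` fields in place of `VolatileCell` are hypothetical and chosen only for illustration. The handle is declared as a `const` rather than a `static`, because a type holding a raw pointer is not `Sync` and so cannot be placed in a `static` without an additional `unsafe impl Sync`.

```rust
use core::cell::Cell;
use core::ops::Deref;

/// Stand-in for the register block; real code would use volatile cells.
pub struct Registers {
    pub status: Cell<usize>,
    pub control: Cell<usize>,
}

/// Same shape as the wrapper discussed above.
pub struct StaticRef<T>(*const T);

impl<T> StaticRef<T> {
    /// Safety: `ptr` must point to a valid `T` for the whole program.
    pub const unsafe fn new(ptr: *const T) -> StaticRef<T> {
        StaticRef(ptr)
    }
}

impl<T> Deref for StaticRef<T> {
    type Target = T;
    fn deref(&self) -> &T {
        unsafe { &*self.0 }
    }
}

// Hypothetical MMIO base address; only meaningful on real hardware,
// so nothing in this sketch ever dereferences it.
pub const GPIO: StaticRef<Registers> =
    unsafe { StaticRef::new(0x4000_0000 as *const Registers) };

/// Sets bit 0 of `control` through any reference to a register block.
pub fn enable(regs: &Registers) {
    regs.control.set(regs.control.get() | 1);
}
```

On a target where that address is really mapped, `enable(&GPIO)` would go through `Deref`; the single `unsafe` sits at the definition of `GPIO`.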

steveklabnik commented 8 years ago

Do const fns help here?

alevy commented 8 years ago

@steveklabnik unfortunately no, because you run into the same issues. You can't transmute in a const fn and you can't dereference an unsafe pointer in a const fn.

steveklabnik commented 8 years ago

Hm, I guess I was thinking "you define a static and initialize it with const fn, and then create your &'static T by referencing that static, since references can refer to other statics." I am very willing to assume that I'm missing some important details though :)
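A minimal sketch of the pattern described in this comment, assuming a value that really can be built in Rust. The `Config` type, its `baud` field, and the static names are hypothetical.

```rust
/// Hypothetical configuration value that can be constructed at compile time.
pub struct Config {
    pub baud: u32,
}

impl Config {
    pub const fn new(baud: u32) -> Config {
        Config { baud }
    }
}

// A static initialized through a const fn...
pub static UART_CONFIG: Config = Config::new(115_200);

// ...and a second static holding a `&'static` reference to the first.
pub static UART: &'static Config = &UART_CONFIG;
```

This only works because `UART_CONFIG` is ordinary memory that the compiler lays out and initializes itself.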

alevy commented 8 years ago

@steveklabnik oh oh, i think that might work if your static is actually initializable in Rust, but e.g. if it's memory mapped I/O, it's not -- the only thing you can do is cast an address to a reference.

steveklabnik commented 8 years ago

Ah ha! That's the bit I'm missing :+1:

oli-obk commented 8 years ago

cc @mbr

@alevy: What you are trying to do is to replicate the way C does register access by writing to globally available addresses. Have you considered utilizing Rust's ownership to create your register structures at program start and pass them on in owned form, so you get the static guarantee that no one (outside of unsafe code) accesses the same register "simultaneously"?

mbr commented 8 years ago

If I may add, I've looked at this recently, check out chapter 5 of http://embed.rs/specs-of-rust-rc1.pdf

Maybe that is what you are looking for? I have more actual code, though it's unpublished (and not stapled to the PDF).

If desired, I can publish the actual register/volatile module.

phil-opp commented 8 years ago

The lazy_static macro could be another workaround. It has a spin_no_std cargo feature for no_std crates.

mbr commented 8 years ago

@alevy

I've uploaded the actual volatile implementation bit and released it in a single crate called embedded: https://crates.io/crates/embedded

You can view the source at https://github.com/mbr/embedded-rs/blob/v0.2/src/base/volatile.rs. If you pin to version 0.2.0, you'll be guaranteed to get just that bit of code.

The crate was intended to be called embed when I started a week ago, but I did not register the name on crates.io. Unfortunately, since 2 days ago, there is already an unrelated embed crate.

alevy commented 8 years ago

@oli-obk Yes, we've tried that extensively (http://amitlevy.com/papers/tock-plos2015.pdf). That's problematic if you have a system with circular references. A common way of avoiding this is to separate components into threads and transform those references into communication (e.g. sending an event across a channel), which is fine in some cases, but not others.

Notably, it is perfectly safe (in terms of type-safety) for multiple things to reference (and mutate) something of type Registers (hence the use of a shared reference for GPIO and of VolatileCell for the fields, which is basically a Cell with volatile_load and volatile_store operations).
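A minimal sketch of what "a Cell with volatile load and store" could look like. This is not the Zinc or Tock implementation; it is an illustration built on `core::ptr::read_volatile` and `core::ptr::write_volatile`, and the method names `get` and `set` are assumptions.

```rust
use core::cell::UnsafeCell;
use core::ptr;

/// Cell-like wrapper whose reads and writes are always volatile.
#[repr(transparent)]
pub struct VolatileCell<T: Copy> {
    value: UnsafeCell<T>,
}

impl<T: Copy> VolatileCell<T> {
    pub const fn new(value: T) -> Self {
        VolatileCell { value: UnsafeCell::new(value) }
    }

    /// Volatile load through a shared reference.
    pub fn get(&self) -> T {
        unsafe { ptr::read_volatile(self.value.get()) }
    }

    /// Volatile store through a shared reference (interior mutability).
    pub fn set(&self, value: T) {
        unsafe { ptr::write_volatile(self.value.get(), value) }
    }
}
```

Because `set` takes `&self`, several holders of a shared reference can write the same field, which is the property relied on here.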

Access to resources is an orthogonal issue. It's nice when this can be enforced with ownership (because it's elegant), but hiding globals through the module system or within scopes is also sufficient. A virtualized controller will (at some level) need access to registers and other state from multiple virtual devices (unless, again, you're willing to stick each thing in something like a thread and communicate via channels or IPC or something).

It's possible to still use ownership as access control, but I think you have to do something much more nefarious -- you'd have to create multiple owned values that reference the same hardware registers and rely on the semantics of the registers (rather than the borrow checker) to know you're not violating type safety (although in practice I think you'd be fine if it's MMIO in any cases I can think of).

alevy commented 8 years ago

@mbr Thanks, this looks like a nice crate we might be able to use instead of rolling our own VolatileCell (well, copied from Zinc anyway). I'm not sure I understand how it addresses the concern though. I'm probably missing something, sorry. Which part of the chapter?

alevy commented 8 years ago

@phil-opp Indeed, lazy_static also addresses a similar problem, but it's pretty expensive relative to a pointer dereference (which is often elided through inlining). I haven't disassembled it, but call_once seems like it would be at least 10 or so instructions (on ARM compiled using my head, so... maybe not). This might become much cheaper if it didn't have to deal with concurrent accesses, but then I think you end up almost with StaticRef anyway.

Otherwise, I believe it's a similar approach to the same problem.

oli-obk commented 8 years ago

On a side note to the original post:

> &* which is not allowed in statics because the initial dereference cannot be done at compile time (although the combined operation is clearly possible since it's just a type cast).

It would certainly be possible to change const eval to allow derefing any pointer, if it is immediately referenced again.

> It's possible to still use ownership as access control, but I think you have to do something much more nefarious -- you'd have to create multiple owned values that reference the same hardware registers and rely on the semantics of the registers (rather than the borrow checker) to know you're not violating type safety (although in practice I think you'd be fine if it's MMIO in any cases I can think of).

I don't think that's nefarious. We depend on RefCell for runtime upholding of the invariants, even better if Register doesn't even need these runtime checks. A SharedRegister type that is cloneable and can be created by consuming an owned Register would make the sharing obvious. The SharedRegister would then not implement Send, and all ops that transfer objects to or from an interrupt need to require Send for those objects.
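A rough sketch of the shape suggested in this comment, with hypothetical names and signatures throughout (`Register::new`, `share`, `read`, `write`). It assumes a 32-bit register behind a raw pointer. Because `SharedRegister` stores a raw pointer, the compiler does not implement `Send` or `Sync` for it automatically, which gives the "not Send" property without extra code.

```rust
use core::ptr;

/// Uniquely owned handle to one 32-bit register.
pub struct Register {
    ptr: *mut u32,
}

impl Register {
    /// Safety: `ptr` must be valid for the whole program and not be
    /// wrapped by any other `Register`.
    pub unsafe fn new(ptr: *mut u32) -> Register {
        Register { ptr }
    }

    pub fn read(&self) -> u32 {
        unsafe { ptr::read_volatile(self.ptr) }
    }

    pub fn write(&mut self, value: u32) {
        unsafe { ptr::write_volatile(self.ptr, value) }
    }

    /// Gives up unique ownership in exchange for a cloneable handle.
    pub fn share(self) -> SharedRegister {
        SharedRegister { ptr: self.ptr }
    }
}

/// Cloneable handle; the raw pointer field keeps it from being `Send`.
#[derive(Clone)]
pub struct SharedRegister {
    ptr: *mut u32,
}

impl SharedRegister {
    pub fn read(&self) -> u32 {
        unsafe { ptr::read_volatile(self.ptr) }
    }

    /// Writes through a shared handle, relying on volatile access.
    pub fn write(&self, value: u32) {
        unsafe { ptr::write_volatile(self.ptr, value) }
    }
}
```

Consuming the owned `Register` in `share` makes the point at which sharing begins visible in the code.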

I only read your paper up to the end of 3.1 (Resource Ownership), so I might be missing more info due to the closure stuff. I'll read the entire paper on the weekend.

strega-nil commented 8 years ago

@oli-obk basically, lvalue->rvalue conversions from arbitrary lvalues should not be allowed.

mbr commented 8 years ago

> @mbr Thanks, this looks like a nice crate we might be able to use instead of rolling our own VolatileCell (well, copied from Zinc anyway). I'm not sure I understand how it addresses the concern though. I'm probably missing something, sorry. Which part of the chapter?

If I understood the motivation for execution contexts correctly, the main motivation was shared access to hardware resources? The actual answer is probably in the following chapters --- I fully intended the document to be more of an introduction for people of all sorts who are not yet convinced that embedded development in Rust is a great idea. Some advanced parts are missing; in hindsight that may have been a mistake.

To solve the problem, I would suggest not to use any global/static variables at all, but structure it the same way you would a "regular" application. The thesis proposes autogenerating all instantiations of these memory-mapping structures from known-to-be-correct hardware specs and then passing those into the main() function (or similar). Deconstructing a big "hardware" struct extracts the parts you want to use.

That will result in a non-shared "regular" abstraction over hardware. As an experiment, I have (not published in the paper) then tried to write "lock-like" structures --- basically copying the design of std::sync::Mutex, which is what I assume you'd use in a non-embedded world, higher level constructs like queues aside. An example would be an IRQ lock, upon lock(), it disables all (or a subset) interrupts, returns a guard that re-enables them upon going out of scope.
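A minimal sketch of such an IRQ lock, modelled loosely on the `std::sync::Mutex` guard pattern. It is not the unpublished code mentioned here: the interrupt flag is simulated with an `AtomicBool` (`IRQ_ENABLED`, hypothetical), where real code would mask interrupts with a target-specific instruction, and the `held` flag is an added assumption so that a nested `lock()` panics instead of handing out two mutable guards.

```rust
use core::cell::UnsafeCell;
use core::ops::{Deref, DerefMut};
use core::sync::atomic::{AtomicBool, Ordering};

/// Simulated global interrupt-enable flag (stand-in for the CPU state).
pub static IRQ_ENABLED: AtomicBool = AtomicBool::new(true);

pub struct IrqLock<T> {
    held: AtomicBool,
    data: UnsafeCell<T>,
}

// Sound in this sketch because `held` allows at most one guard at a time.
unsafe impl<T: Send> Sync for IrqLock<T> {}

impl<T> IrqLock<T> {
    pub const fn new(data: T) -> Self {
        IrqLock { held: AtomicBool::new(false), data: UnsafeCell::new(data) }
    }

    /// Disables (simulated) interrupts and returns a guard for the data.
    pub fn lock(&self) -> IrqGuard<'_, T> {
        let was_enabled = IRQ_ENABLED.swap(false, Ordering::SeqCst);
        assert!(!self.held.swap(true, Ordering::Acquire), "IrqLock is not re-entrant");
        IrqGuard { lock: self, was_enabled }
    }
}

pub struct IrqGuard<'a, T> {
    lock: &'a IrqLock<T>,
    was_enabled: bool,
}

impl<T> Deref for IrqGuard<'_, T> {
    type Target = T;
    fn deref(&self) -> &T {
        unsafe { &*self.lock.data.get() }
    }
}

impl<T> DerefMut for IrqGuard<'_, T> {
    fn deref_mut(&mut self) -> &mut T {
        unsafe { &mut *self.lock.data.get() }
    }
}

impl<T> Drop for IrqGuard<'_, T> {
    /// Releases the lock and restores the previous interrupt state.
    fn drop(&mut self) {
        self.lock.held.store(false, Ordering::Release);
        IRQ_ENABLED.store(self.was_enabled, Ordering::SeqCst);
    }
}
```

Restoring the saved state rather than unconditionally re-enabling means a guard taken while interrupts were already off leaves them off when it is dropped.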

The upside of the idea was that it is using the idioms that are already known to regular application developers. It includes the possibility for "unsafe cheating" when you need it (i.e. just not disabling IRQs). Would the latter be a way of emulating what you are trying to do with execution contexts, i.e. creating an "unsafe lock" that doesn't do anything (but still acts like a lock, i.e. has a &self lock method) --- which would only be passed to blessed parts of the program, that is those executing in the same thread? From what I understand, execution contexts are about these not escaping their intended area?

I cannot comment on the closure and memory aspects of the paper right now (I'm in the midst of a 1300 km road trip), but I appreciate the feedback.

alevy commented 8 years ago

@oli-obk oh, don't worry about reading the paper fully. Our proposed solution is broken anyway (unless you disallow existential types, which no one wants).

alevy commented 8 years ago

@mbr I agree that this approach makes sense in general, but in some cases you just need to have global variables anyway to avoid creating overly-complex structures. I need to read the book you posted more carefully though -- it seems like a very interesting and useful read, so I'd like to anyway.

If you read our paper, please take what we say with a grain of salt. Our insights there are outdated at this point -- basically our solution is wrong, and can mostly be solved with cell types.

I think all of this is more or less orthogonal to the discussion of allowing statics (this includes const fns) to unsafely cast *const to &'static (they can do other unsafe things). I'm becoming convinced that StaticRef as I posted originally basically allows you to do this anyway (and maybe this is sufficient instead of a change to the language or an addition to core). While avoiding globals is good practice, it's just unavoidable in some cases.

mbr commented 8 years ago

> @mbr I agree that this approach makes sense in general, but in some cases you just need to have global variables anyway to avoid creating overly-complex structures.

Unfortunately I don't have enough experience with RTOSs to be familiar with these cases. I'll have to look into these when I run into them.

> I think all of this is more or less orthogonal to the discussion of allowing statics (this includes const fns) to unsafely cast *const to &'static (they can do other unsafe things). I'm becoming convinced that StaticRef as I posted originally basically allows you to do this anyway (and maybe this is sufficient instead of a change to the language or an addition to core).

I would be interested in how that turns out. If you look at chapter 5 again it tries a similar approach initially, though without any const fns. In my case, it turned out to be impractical due to the large overhead of storing an extra pointer for each field, without being sure that they can be optimized away. You could probably get away with it when only dealing with "top-level" structures, but this breaks down if things become deeply nested. Mixing it with the Volatile fields could work though.

bergus commented 6 years ago

Is there any progress on this (allowing &* in const or making transmute a const fn)?

alevy commented 6 years ago

One update is that we've started using StaticRef more ubiquitously in Tock for memory mapped I/O registers. It's a fairly recent change, so it's maybe still TBD how it goes, but so far it's allowed us to get rid of a ton of inline unsafe casts and replace them with one unsafe each at the point of defining the reference (which is where you'd audit that you're casting a memory location to the Right(tm) Rust struct).