rust-lang / unsafe-code-guidelines

Forum for discussion about what unsafe code can and can't do
https://rust-lang.github.io/unsafe-code-guidelines
Apache License 2.0
658 stars 57 forks source link

How are virtual function calls specced? #338

Open JakobDegen opened 2 years ago

JakobDegen commented 2 years ago

To make the question more precise, consider this code:

trait A {
    fn foo(&self, u32) -> i32;
}

fn nameless(r: &dyn A, x: u32) -> i32 {
    (&dyn A)::foo(r, x)
}

What does the spec say happens at the (&dyn A)::foo(r, x) call?

Whatever we choose must be strict enough to disallow inserting an extra println!("AAAAA"); that the user did not write, but also permissive enough to not stabilize the layout of vtables.

This is forked off from #328 .

chorman0773 commented 2 years ago

Unspecified behavior and nondet choice are two very different things, btw.

In C++, unspecified behaviour is defined as the nondeterministic aspects of the abstract machine, so unless rust is somehow different in this regard, I'd consider the two synonymous.

Nondet has some very specific interactions with the notion of correct compilation ("refinement") that unspecified behavior does not have.

Would you be able to share some examples, please?

Diggsey commented 2 years ago

That being said, if you write code for a particular implementation of rust, then you can rely on both explicit or implicit promises it makes refining the rules of rust

Then why can't the standard library rely on the fact that the implementation makes these promises (at least for now) in order to implement DynMetadata?

Calling a virtual function is not an unstable feature.

No, but reading from a vtable within the AM could be an unstable feature (which is then used by std).

I would like it if we could make "your code may become UB in the future" (ie. you are relying on an unstable feature) and "your code is UB today" two distinct results rather than conflating them.

RalfJung commented 2 years ago

In C++, unspecified behaviour is defined as the nondeterministic aspects of the abstract machine, so unless rust is somehow different in this regard, I'd consider the two synonymous.

I guess Rust is different then.

Nondeterminism is a runtime thing. It is an action taken by the program as it runs on the Abstract Machine. The compiler is allowed to "refine" non-determinism by reducing the set from which the choice is taken from during compilation, and even completely resolve the choice to a concrete value. The most canonical example for this in Rust is the address of a freshly created allocation.

If we said that struct layout was non-deterministic, that would mean that each time you start a Rust program, it would be allowed to pick fresh offsets then and there. (Or even each time a struct value is created? Non-determinism is an effect, you have to say when it happens. "Struct layout is non-deterministic" is not a well-defined statement.) That is not the specification we want.

Maybe you mean that compilation is non-deterministic, but then you have to be more precise and distinguish between non-determinism during compilation and non-determinism during execution of the Abstract Machine program.

Then why can't the standard library rely on the fact that the implementation makes these promises (at least for now) in order to implement DynMetadata?

We have not decided to make any such promises for rustc yet. Also note that I said "I could imagine" and "maybe one day". I didn't say I think any of these are good ideas or reflect how we think of Rust today.

Also we care about running this code in Miri which aims to be a faithful implementation of the Rust Abstract Machine.

I don't think the notion of an "unstable AM feature" makes much sense. The purpose of the AM is to specify which programs you can write. Either the AM says my program is fine, then it must stay fine, or it says the program is not fine, then I need to fix it. We could of course have multiple different AMs but that would be a mistake IMO, that's basically forking the language (akin to -fno-strict-aliasing in C).

The standard library runs on the same AM as everyone else. It it okay to exploit layout assumptions there since those are compile-time decisions, but it in my opinion is not okay to have UB in the standard library, no matter how much you know about the compiler. (I know authors of some other compilers disagree, but in rustc this is the policy we follow, AFAIK.)

chorman0773 commented 2 years ago

On Mon, Jul 18, 2022 at 16:26 Ralf Jung @.***> wrote:

Nondeterminism is a runtime thing. It is an action taken by the program as it runs on the Abstract Machine. The compiler is allowed to "refine" non-determinism by reducing the set of to choose from during compilation, and even completely resolve the choice to a concrete value. The most canonical example for this in Rust is the address of a freshly created allocation.

You can have unspecified runtime behaviour. In fact, I'd argue that unspecified behaviour is fundamentally a runtime concept, except as it relates to constant evaluation.

If we said that struct layout was non-deterministic, that would mean that each time you start a Rust program, it would pick the offsets then and there. That is not the specification you want to write.

I'm unsure this argument holds. I would argue, that the implementation is in fact free to alter the layout of structures between executions, bounded on the fact that layouts can be examined (and thus observed and fixed) during translation (for example, by a const-eval panic or a type mismatch, such as the ones used by the const_assert! et. al macros). Thus, the implementation could alter the layouts of any type which is not so observed between executions or otherwise by translating the program again between executions (which would occur, for example, implementation was an interpreter that makes no meaningful distinction between translation and execution - such is the case with miri). I would argue that the program observing distinict choices for unspecified layout would be fundamentally the same as the program observing distinct non-deterministic choices at runtime, barring the fact that such observations can alter the properties of the program during translation, rather than merely during execution (such as making the program ill- or well-formed).

Maybe you mean that compilation is non-deterministic, but then you have to be more precise and distinguish between non-determinism during compilation and non-determinism during execution of the Abstract Machine program.

Then why can't the standard library rely on the fact that the implementation makes these promises (at least for now) in order to implement DynMetadata?

We have not decided to make any such promises for rustc yet. Also we care about running this code in Miri which aims to be a faithful implementation of the Rust Abstract Machine.

I don't think the notion of an "unstable AM feature" makes much sense. The purpose of the AM is to specify which programs you can write. Either the AM says my program is fine, then it must stay fine, or it says the program is not fine, then I need to fix it. We could of course have multiple different AMs but that would be a mistake IMO, that's basically forking the language (akin to -fno-strict-aliasing in C).

— Reply to this email directly, view it on GitHub https://github.com/rust-lang/unsafe-code-guidelines/issues/338#issuecomment-1188270538, or unsubscribe https://github.com/notifications/unsubscribe-auth/ABGLD2ZN6G3JBXUBM3JSRU3VUW4WDANCNFSM5V6EAHEA . You are receiving this because you commented.Message ID: @.***>

Diggsey commented 2 years ago

The purpose of the AM is to specify which programs you can write. Either the AM says my program is fine, then it must stay fine, or it says the program is not fine, then I need to fix it.

Well that's kindof my issue. If the choice is so binary then there is no room for implementation-specific programs to exist at all, and that is a problem because I believe a lot, if not most, useful programs rely on some behaviour that is implementation-specific.

There is a class of programs which will never work on MIRI but which we should not consider to be "wrong". What do we do for inline assembly here? Surely we say that such programs are simply "out of scope" of the abstract machine, without saying they definitely have UB?

RalfJung commented 2 years ago

there is no room for implementation-specific programs to exist at all

That's not true. As I keep saying, struct layout is implementation-specific (unspecified, whatever you want to call it). I also disagree with your thesis that most useful programs rely on implementation-specific behavior, unless you mean rely on it transitively through the standard library. Outside the standard library me and many others generally consider relying on implementation-specific behavior a soundness bug.

But I don't think vtables and vtable access should be implementation-specific, at least not in the form of allowing programs to read from the vtable. Instead what I do in https://github.com/rust-lang/rust/pull/99420 is add some implementation-specific intrinsics that can be specified on the AM in a coherent way. This avoids the need for "implementation-specific UB" or anything like that. Now Miri can fully enforce opaque vtables and run stdlib code that needs to fetch the size from a vtable.

RalfJung commented 2 years ago

There is a class of programs which will never work on MIRI but which we should not consider to be "wrong". What do we do for inline assembly here? Surely we say that such programs are simply "out of scope" of the abstract machine, without saying they definitely have UB?

I don't think this is the right thread for fundamental discussions about the limits of Abstract Machine specifications, but there is a coherent way to integrate inline assembly and C FFI into this picture. (Basically, the author of the code has to "axiomatize" what transition on the AM state is implemented by this "gap in the code". That transition needs to be one that could also have been taken by regular Rust code, to ensure correctness of compiler analyses.) Indeed Miri will not be able to run such code.

So yes we could probably have implemented those parts of the stdlib via inline assembly (though the "must be possible in regular Rust code" part could become tricky, if a compiler analysis wants to actually rely on the fact that regular Rust code can never read the size field -- but now that we have an intrinsic for this, that is an operation the optimizer has to take into account anyway). I think an intrinsic is much cleaner though.

dimpolo commented 2 years ago

I have a weird question, hopefully it's alright to ask here. I have a use case where I want to perform something similar to upcasting a ThinBox. Since the ThinBox stores the vtable pointer inline, normal upcasting can not work. However I was hoping that for some combinations of traits where it is known, that upcasting does not change the vtable pointer, a transmute would be fine.

So my question would be: Is it possible to include the notion of a trivial upcast into the spec (and the implementation of MIRI) so that code like the following is both sound and deterministic?

#![feature(ptr_metadata)]
#![feature(thin_box)]
#![feature(trait_upcasting)]

use std::boxed::ThinBox;
use std::ops::Deref;

trait A {
    fn a(&self);
}

trait B: A {
    fn b(&self);
}

impl A for i32 {
    fn a(&self) {
        println!("a");
    }
}

impl B for i32 {
    fn b(&self) {
        println!("b");
    }
}

fn main() {
    let x: ThinBox<dyn B> = ThinBox::new_unsize(5);

    x.b();
    if let Some(x) = try_upcast(x) { // MIRI returns None :(
        x.a()
    }
}

fn try_upcast(x: ThinBox<dyn B>) -> Option<ThinBox<dyn A>> {
    // somehow check that dyn A vtable is a subset of the dyn B vtable
    // would be nice to have this as an intrinsic
    let fat_ptr: &dyn B = x.deref();
    let meta_1 = core::ptr::metadata(fat_ptr);
    let meta_2 = core::ptr::metadata(fat_ptr as &dyn A);
    let upcast_is_trivial = format!("{meta_1:?}") == format!("{meta_2:?}");

    if upcast_is_trivial {
        // manual "upcast"
        Some(unsafe { core::mem::transmute(x) })
    } else { None }
}
bjorn3 commented 2 years ago

We don't guarantee any encoding of vtables. The fact that the vtable of B contains a valid vtable of A is just an optimization and should not be visible to end users. As such checking that it is a subset should remain UB IMHO.

dimpolo commented 2 years ago

As such checking that it is a subset should remain UB IMHO.

I was hoping we could say that the result of checking is unspecified but consistent for one particular implementation.

Something along the lines of how TypeId works:

While TypeId implements Hash, PartialOrd, and Ord, it is worth noting that the hashes and ordering will vary between Rust releases. Beware of relying on them inside of your code!

RalfJung commented 2 years ago

No, currently vtables are deliberately unobservable in the program -- any attempt to even read from a vtable using a regular load is UB.

I have no idea what the debug printer of metadata shows -- but it is definitely not okay to rely on it for anything.

If there is a usecase for directly accessing the vtable from the program, that needs to be carefully specified and RFC'd first.

dimpolo commented 2 years ago

any attempt to even read from a vtable using a regular load is UB.

I understand, but this is a little different.

No access to the vtable is happing, just a comparison of the address of the vtable.

The debug printer just prints the vtable address. In real code I would transmute the metadatas to opaque pointers and compare them. (I just tried to limit the unsafe blocks for the example)

Here's another way of phrasing the question:

Is it ok to observe the address of a vtable and if yes is it safe to transmute a DynMetadata<dyn B> to a DynMetadata<dyn A> if they point to the same vtable?

RalfJung commented 2 years ago

No, that's not safe either. vtables can be deduplicated, so if two types have the same size and alignment and a NOP drop, they might share the vtable.

In real code I would transmute the metadatas to opaque pointers and compare them.

That's incorrect, too -- the transmute might fail in the future if the size of the metadata changes.

(In the future please open a new issue unless you are sure it is the same question as another issue. Now we have two completely independent discussions mixed in this overly long thread.)