Unions with unsized variants

mahkoh commented 3 years ago

I believe it would be technically feasible (easy, even) to have unions with at most one unsized variant.

Unsizing should be safe:

1) T -> dyn U: The vtable contains 2 values relevant for the unsized union: 1) The alignment: This is stored directly in the vtable of T, accessing it does not access the data pointer 2) The size: Same.

Note that drop_in_place is not relevant because unions do not drop their contents. 2) [T; N] -> [T]: The metadata consists of the array length which is statically known and does not require accessing the variant.

@eddyb: In the last paragraph here you wrote that there is a problem with unsized unions but I haven't been able to figure it out. Maybe you were talking about unions with more than one unsized variant?

RustyYato commented 3 years ago

The problem is that the meta-data in the fat pointer won't always be valid, i.e. what should this do?

union Foo<T: ?Sized> {
    value: T,
    uninit: (),
}

let foo = Foo::<i32> { uninit: () };
let foo_ref = &foo as &Foo<dyn Any>; // the fat pointer meta-data isn't valid for the data actually stored

Also, if you ban accessing the data, it makes it harder to implement something like Custom DSTs. Custom DSTs are important because they would allow CStr to be a thin pointer. (Is unsized unions that important? I don't know, maybe, maybe not)

mahkoh commented 3 years ago

The problem is that the meta-data in the fat pointer won't always be valid

Yes, and? Afaik, the only parts of the metadata that MUST be valid are size and alignment so that the layout can be computed at runtime. The other parts (drop_in_place and trait methods) cannot be accessed without first accessing the unsized field. Accessing the wrong field is already undefined behavior so that the metadata being invalid does not add any additional unsafety.

Lokathor commented 3 years ago

I'm not clear on why CStr can't already be a thin pointer.

burdges commented 3 years ago

Any map T -> dyn U for T: !Sized sounds quite extreme for Rust. It'd require Box, et al. magically copy the vtable somewhere, not sure that's zero cost, and really not sure what all that breaks.

I think previous discussions indicate that truly unsized types should panic whenever accessing their size or alignment, so align_of_val and size_of_val should never access their data pointer.

If I understood, Rust could add some TrueCStr (name?) which yields thin pointers, and maybe even makes align_of_val work, but for which size_of_val panics. It'd still have a method like strlen and a conversion into slices.

kennytm commented 3 years ago

but for which size_of_val panics

that would make Box<CStr> crash

mahkoh commented 3 years ago

Please let's stop this academic discussion around CStr. CStr will never be changed simply because it would

1) Completely change the performance characteristics of a core type that is relied upon by lots of production code. 2) Break all code that relies on transmuting CStr to [u8] of which I assume there is much.

People who actually want a thin pointer to null terminated strings have written such a thing themselves years ago and have no need for a thin CStr.

Feel free to talk about new forms of metadata in general.

RalfJung commented 3 years ago

I don't know how to even implement this. Imagine

union Example {
  a: str,
  b: String,
}

Now, given &Example (which I suppose is a wide pointer), how do you determine the size?

Note that not even enums can have unsized fields currently, and that seems like a strictly simpler problem since for enums we know which variant they are in. So I think we should specify, implement, and gather experience with unsized enums before venturing into the land of unsized unions.

mahkoh commented 3 years ago

I don't know how to even implement this. Imagine
union Example {
  a: str,
  b: String,
}
Now, given &Example (which I suppose is a wide pointer), how do you determine the size?

How do you construct &Example? Presumably with transmute? In this case you would either transmute it from &str or from [&String, 0]. So the size would be max(metadata, sizeof(String)). Or more generally for [T] fields:

max(metadata * sizeof(T), sizeof(union without the unsized field))

(Rounded up to the correct alignment.)

Automatic unsizing of [T; N] fields would of course set the metadata to the correct value just like unsizing &[T; N] would.

Note that not even enums can have unsized fields currently, and that seems like a strictly simpler problem since for enums we know which variant they are in.

I think enums are significantly more complicated in this case. The &Example above can be constructed via transmute. If it were an enum, this would not be possible.

RalfJung commented 3 years ago

How do you construct &Example?

That doesn't matter for my question. No matter how it is constructed, you have to be able compile code like

fn example1(e: &Example) -> usize { mem::size_of_val(e) }

fn example2(e: Box<Example>) { drop(e) }

in a uniform way, without knowing how the arguments got created.

RalfJung commented 3 years ago

max(metadata * sizeof(T), sizeof(union without the unsized field))

So you are suggesting that the metadata is 0 when the String variant was used to create this enum? Leaving aside that it is completely unclear how to do the unsizing coercion, what do you propose to do if the unsized field is of type dyn Trait? (Note that the vtable metadata field is currently marked nonnull.)

mahkoh commented 3 years ago

So you are suggesting that the metadata is 0 when the String variant was used to create this enum?

We're talking about unions and not enums. The variant does not matter for unsizing and the compiler does not have to know which variant was last written to.

Leaving aside that it is completely unclear how to do the unsizing coercion

Which part is unclear? Given

union X<T> {
    t: ManuallyDrop<T>,
    v: ManuallyDrop<String>,
}

X<[u8; 22]> unsizes to X<[u8]> with metadata 22.

what do you propose to do if the unsized field is of type dyn Trait? (Note that the vtable metadata field is currently marked nonnull.)

X<u8> unsizes to X<dyn Clone> with the metadata of u8 as Clone.

Accessing the wrong field is simply undefined behavior. E.g.

let x: X<u8> = X { v: ManuallyDrop::new(String::new()) };
let x: &X<dyn Clone> = &x;
unsafe {
    x.t.clone(); // UB
}

(The current rules for sized fields are a bit more relaxed but for unsized fields we can simply make up the new rule that accessing an "inactive" unsized field is unconditionally UB.)

RalfJung commented 3 years ago

Btw, since you mentioned that earlier -- note that unsizing-by-transmute is incorrect since the layout of wide pointers is not specified.

X<[u8; 22]> unsizes to X<[u8]> with metadata 22.

Oh, I see. That makes sense. Basically where struct with unsized field adds up all the static field sizes with the one of the unsized field, union takes a max instead. Unsizing will always be able to determine that size without looking at the data stored in the union because pre-unsizing, it is all present in the type. Initializing a X<[u8; 22]> requires specifying the 22 even if we initialize it in the String field -- this is the bit that I missed before.

This also works for dyn Trait.

When nesting union and struct, this can lead to more complicated code for size_of_val (a sequence of additions and max), but it is still possible to implement this in time linear to the depth of the type (so far it's constant, though). Maybe there is a more clever compliation strategy, but that's not something I am fundamentally worried about.

(The current rules for sized fields are a bit more relaxed but for unsized fields we can simply make up the new rule that accessing an "inactive" unsized field is unconditionally UB.)

There is no such thing as an "active" or "inactive" union field, and I think adding that notion would be a mistake. (It would also be quite complicated.) It also seems unnecessary -- your code is equivalent to reading x.t from the original sized x, i.e., it can still be explained by "union field accesses are transmutes".

The one issue that remains is the same as what I mentioned in https://github.com/rust-lang/rust/issues/80158: this proposal relies on determining the size of Box<T: !Sized> without accessing the pointee, which is in conflict with very general proposals for custom DST. That conflict could be resolvable in principle, but will lead to a lot of friction nevertheless.

mahkoh commented 3 years ago

rc::Weak already calls align_of_val_raw on invalid pointers.

RalfJung commented 3 years ago

rc::Weak already calls align_of_val_raw on invalid pointers.

The code is careful to spell out that it is valid only when the "unsized tail" of T is a slices or a trait object. AFAIK the public API of Rc makes it impossible to create other Rc<T> for unsized T, even if the respective unsizing coercions were added. Do you see a loophole in that approach?

mahkoh commented 3 years ago

The code I was thinking of is

https://play.rust-lang.org/?version=beta&mode=debug&edition=2018&gist=62b9754a432976a2646cf30812e60913

If I understand you correctly, you're saying that Weak<T> can never be safely converted to Weak where U is not sized but neither a slice or a trait.

The only other unsized types I'm aware of are extern types and I'm not aware of any way to unsize any type to an extern type. Therefore this statement would be trivially true in the current language and also hold for unions.

But even if there were special logic (maybe some marker trait?) that allowed Weak to coerce to slices or traits but not to other unsized types, could the same logic not also be applied to unions without extending the language?

RalfJung commented 3 years ago

That code is obviously wrong since you are using as_ptr on a dangling Weak.

But I didn't know/remember that Weak supports arbitrary unsizing coercions, I thought there'd just be a few select constructors. Cc @CAD97 who added unsized support for Weak (IIRC).

mahkoh commented 3 years ago

That code is obviously wrong since you are using as_ptr on a dangling Weak.

I should not have added that unsound unsafe block because now it's not clear if you meant that calling the safe function as_prt is wrong in itself or that dereferencing that pointer is wrong.

The following gist without that block is sufficient to demonstrate the issue: https://play.rust-lang.org/?version=beta&mode=debug&edition=2018&gist=be21e33e1d74345e13d90947147617c0

CAD97 commented 3 years ago

Cc @CAD97 who added unsized support for Weak (IIRC).

Yep, that me. Though to be clear, this was just as_ptr, into_raw, from_raw; Weak itself already did support ?Sized types.

rc::Weak already calls align_of_val_raw on invalid pointers.

This is... actually true, as you've demonstrated. I guess I need to reword the safety docs around that, and may be to blame for limitations on unsized values in the future 😅

The important part and what I argued at stabilization has to do with validity (as in validity invariant) of the pointer metadata. Even in the unsize coercion of a dangling Weak case, the fat pointer metadata is constructed by normal means.

This is the case even without my stabilization, though:

let rc: Rc<[u8; 32]> = Rc::new(Default::default());
let rc: Rc<[u8]> = rc;
let weak: Weak<[u8]> = Rc::downgrade(&rc);
drop(rc);
drop(weak);

Here we drop a Weak<[u8]> that, while not dangling, is pointing to dropped data. As such, even without being able to construct a truly dangling unsized Weak, it would be incorrect for it to read the contained data while drop/deallocating.

mahkoh commented 3 years ago

Here we drop a Weak<[u8]> that, while not dangling, is pointing to dropped data.

Great example. Of course, RcBox could in theory store the size and alignment of its data proactively which would make the metadata calls unnecessary. But then again so could Box.

RalfJung commented 3 years ago

The following gist without that block is sufficient to demonstrate the issue

Ah, because as_ptr internally calls align_of_val_raw? I see.

Here we drop a Weak<[u8]> that, while not dangling, is pointing to dropped data. As such, even without being able to construct a truly dangling unsized Weak, it would be incorrect for it to read the contained data while drop/deallocating.

There is a potential fix here by requiring that for custom DSTs, determining the size must be possible even after the drop glue ran (but before the underlying memory got deallocated). However, that still sounds quite awful and would seriously restrict which custom DSTs are possible.

Of course, RcBox could in theory store the size and alignment of its data proactively which would make the metadata calls unnecessary. But then again so could Box.

That is another interesting idea... custom DSTs could be required to provide a function that extracts all the data now such that, later, this data plus the ptr metadata is enough to reconstruct size and alignment. This extra data would be () for currently existing DST (since ptr metadata is enough) but thin trait objects you put size+alignment into that data (or just the vtable ptr), and RcBox/ArcBox could store that extra data.

However... Box isn't actually dropping its content before determining size+alignment, so there's no reason here why Box would have to do this. So something is still wrong here.

Wouldn't this mean ManuallyDrop::take becomes unsound, since it leaves behind a "moved-out-of" unsized thing of which the size still needs to be determinable?

RalfJung commented 3 years ago

To be a bit more concrete, the case I am thinking about is something like

union X<T> {
    t: ManuallyDrop<T>,
    v: ManuallyDrop<String>,
}
type Problem = X<ThinDst>;

ThinDst would do whatever it takes to become a custom DST, so for X<ThinDst> that support needs to be "lifted" to a union type -- which requires accessing x.t, since that's what ThinDst's custom DST support will require.

IOW, unions are not "structural" for custom DST support, whereas structs are.

The discussion above, as I understand it, is about figuring out to what extend ThinDst is already broken given what is currently stable.

mahkoh commented 3 years ago

The discussion above, as I understand it, is about figuring out to what extend ThinDst is already broken given what is currently stable.

Yes.

There is a potential fix here by requiring that for custom DSTs, determining the size must be possible even after the drop glue ran (but before the underlying memory got deallocated). However, that still sounds quite awful and would seriously restrict which custom DSTs are possible.

I do not think that is possible. https://play.rust-lang.org/?version=stable&mode=debug&edition=2018&gist=f6532229a7706836246360fb1af18db3

Code like ZeroOnDrop already exists in the wild. If Display were replaced by a ThinDst, any size information it might want to access would already have been zeroed out at the end of f.

That is another interesting idea... custom DSTs could be required to provide a function that extracts all the data now such that, later, this data plus the ptr metadata is enough to reconstruct size and alignment. This extra data would be () for currently existing DST (since ptr metadata is enough) but thin trait objects you put size+alignment into that data (or just the vtable ptr), and RcBox/ArcBox could store that extra data.

I think there are two ideas here.

dyn Dst pointers store the extra data side-by-side with the pointer. This means that size+alignment can be determined without dereferencing the data pointer. Basically how it works today. But this would probably make thin DSTs impossible and dyn Dst pointers might be very large for some DSTs: data+vtable+size+alignment.
It's up to the container to store size+alignment if it needs to so that it can later be deallocated. The problem with this is that RcBox<T> is usually constructed with a sized <T>. To accommodate T unsizing to a custom DST, it would have to reserve an additional 16 bytes in its allocation to be able to store the size+alignment information. If it's not possible to determine at compile time if a type T might unsize to a custom DST, this would add a 16 byte overhead to all RcBoxs.

However... Box isn't actually dropping its content before determining size+alignment, so there's no reason here why Box would have to do this. So something is still wrong here.

By that comment I meant that, if we are willing to pay a blanked 16 byte overhead for all RcBox allocations only to support custom DSTs, then why not do the same with Box to support Box<Union<dyn Dst>>. Box then stores this additional data in its constructor.

Wouldn't this mean ManuallyDrop::take becomes unsound, since it leaves behind a "moved-out-of" unsized thing of which the size still needs to be determinable?

If size+alignment are always calculated by container constructors which work only with sized types by design, then there is no problem because the vtable is not needed for size calculations.

If size+alignment calculations use the "access after drop/before deallocate" idea you suggested, then ManuallyDrop::take should be fine because it leaves the memory unchanged as I understand it. But the ZeroOnDrop example breaks this either way.

RustyYato commented 3 years ago

@mahkoh

Code like ZeroOnDrop already exists in the wild. If Display were replaced by a ThinDst, any size information it might want to access would already have been zeroed out at the end of f.

Note: that code as written is unsound because zeroing references/other nonnull pointers is undefined behavior. The fix is to use MaybeUninit, which conveniently doesn't allow unsized types. So that doesn't apply to this discussion.

By that comment I meant that, if we are willing to pay a blanked 16 byte overhead for all RcBox allocations only to support custom DSTs, then why not do the same with Box to support Box<Union>. Box then stores this additional data in its constructor.

Because the Box isn't dropping dyn Dst, it's dropping Union. So it would be on the author of Union to correctly drop dyn Dst.

mahkoh commented 3 years ago

Note: that code as written is unsound because zeroing references/other nonnull pointers is undefined behavior.

I disagree. Invalidating the state of ManuallyDrop is a supported operation as long as the ManuallyDrop is not accessed afterwards. See ManuallyDrop::drop. Overwriting it is no different since it's not accessed afterwards.

Because the Box isn't dropping dyn Dst, it's dropping Union. So it would be on the author of Union to correctly drop dyn Dst.

It's not about dropping the contents. It's about calculating the size and alignment of the contents.

RustyYato commented 3 years ago

I disagree. Invalidating the state of ManuallyDrop is a supported operation as long as the ManuallyDrop is not accessed afterwards. See ManuallyDrop::drop. Overwriting it is no different since it's not accessed afterwards.

From the ManuallyDrop docs

ManuallyDrop is subject to the same layout optimizations as T. As a consequence, it has no effect on the assumptions that the compiler makes about its contents.

It's unsound to zero &T, so it's unsound to zero ManuallyDrop<&T>

Also see: https://github.com/rust-lang/unsafe-code-guidelines/issues/245

It's not about dropping the contents. It's about calculating the size and alignment of the contents.

Right, I misunderstood

burdges commented 3 years ago

Is there some good reason for T -> dyn Trait conversions with T: !Sized?

I think there are two ideas here. ..

It's clear both 1 and 2 are non-zero cost for almost all use cases, for what amounts to an extraordinarily niche feature.

Instead metadata should be specified explicitly by pointer types, so either make traits specify their metadata perhaps via trait Foo : Pointee<Meta = MyMeta> {} using #2984 syntax, or else specify in the dyn type so dyn<MyMeta> Trait or dyn Trait+Pointee<Meta = MyMeta> again using #2984 syntax.

In this, one could not ever convert a &dyn<MyMeta> Trait to a &dyn Trait, but perhaps the reverse makes sense sometimes.

As an aside, if one wants more complex metadata then really types like struct Foo([X],[Y]); look much more useful than dyn whatever, meaning &Foo requires one data pointer and two lengths.

mahkoh commented 3 years ago

It's unsound to zero &T, so it's unsound to zero ManuallyDrop<&T>

Also see: rust-lang/unsafe-code-guidelines#245

Yes it looks like I was wrong. I'll see if I can come up with another example.

RalfJung commented 3 years ago

By that comment I meant that, if we are willing to pay a blanked 16 byte overhead for all RcBox allocations only to support custom DSTs, then why not do the same with Box to support Box<Union>. Box then stores this additional data in its constructor.

I understand. What I am trying to figure out is which protocol Box would be breaking if it didn't do that. Clearly, a Box that doesn't do this is wrong under Custom-DST + Unsized-Unions. But which safety condition is being violated where?

mahkoh commented 3 years ago

But which safety condition is being violated where?

That particular statement was not intended to show that there is already a problem with DSTs. (I think that's what you mean by your question which I didn't understand the first time.)

CAD97 commented 3 years ago

Noting this here before I forget: even without my additional as_ptr/into_raw/from_raw support for Weak<T: ?Sized>, it was already possible to create and drop the last dangling Weak because of unsize coercion:

pub fn main() {
    let weak: Weak<[u8]> = Weak::<[u8; 32]>::new();
    drop(weak); // <-- HERE
}

I think this means that my stabilization didn't add any new requirements to the impl, as they already existed in Weak's drop implementation.

EDIT: wait no dangling weak don't have to deallocate anything what was I thinking whoops

CAD97 commented 3 years ago

See https://github.com/rust-lang/rust/pull/80407 for follow-up on the dangling unsized Weak issue. With cleverer implementations for our refcounted pointers, they don't require align_of_val_raw at all, so shouldn't be used as rationalization for "it's already possible." See the linked issue for more context and/or further discussion.

RalfJung commented 3 years ago

That particular statement was not intended to show that there is already a problem with DSTs. (I think that's what you mean by your question which I didn't understand the first time.)

Right, but in a world with your unsized unions and custom thin DST, Box is unsound. So where's the bug, where's the unsoundness coming from, which condition is being violated where?

mahkoh commented 3 years ago

I'm sorry but it looks like I still don't understand your question. Are you asking where the unsoundness with Box+thin Dst+unsized unions is? You've already spelled it out in this issue. Are you asking where an existing unsoundness with Box+thin Dst is? Like I said, this particular sentence had nothing to with showing that there is existing unsoundness. That sentence was about showing a way in which Box+thin Dst+unsized unions could be made sound.

RalfJung commented 3 years ago

Are you asking where the unsoundness with Box+thin Dst+unsized unions is?

Yes.

You've already spelled it out in this issue.

I've spelled out what goes wrong, operationally. But I'd like to understand better what happens. Since something goes wrong, clearly a proof of soundness for this system will fail. What I haven't understood yet is where it will fail.

It's one thing to say "having Box store size and alignment up-front fixes this", but that still doesn't say which contract would be violated by not doing so.

So here's my rambling thoughts on this, typing them out pretty much as they come to my mind:

Box only calls size_of_val on live values. I can't see any way that Box is violating the preconditions of anything here. Thin DST themselves are also fine, I think. So the real culprit, then, are unsized unions.

Unsized unions essentially mean we have to remove size_of_val and align_of_val from the language -- they become unusable. Literally the only thing Box is doing wrong here is calling these functions. So really the unsoundness is in having size_of_val + thin DST + unsized unions.

So if we were to add unsized unions, then we'd have to accept that size_of_val will not be callable in situations where there might be a thin DST. Notwithstanding the align_of_val_raw issue in Rc that you uncovered, this seems like a big leap.

mahkoh commented 3 years ago

You make a compelling point.

We want

Have unsized unions at least for those DSTs that can determine their layout without accessing the data pointer.
Not make implementing thin DSTs harder.
Not have size_of_val(t) panic just because t contains a union with a thin DST field.
Not add any new type bounds that mean "this is a DST whose layout can be determined without accessing the data pointer."

What about stopping the problem at Unsize?

/// - `[T; N]` is `Unsize<[T]>`
/// - `T` is `Unsize<dyn Trait>` when `T: Trait`
/// - `Foo<..., T, ...>` is `Unsize<Foo<..., U, ...>>` if:
///   - `T: Unsize<U>`
///   - Foo is a struct
///   - Only the last field of `Foo` has a type involving `T`
///   - `T` is not part of the type of any other fields
///   - `Bar<T>: Unsize<Bar<U>>`, if the last field of `Foo` has type `Bar<T>`

All implementations of Unsize are provided by the compiler and, of course, there are currently no implementations specified for unions. We could have the following rules for unions:

Foo<..., T, ...> is Unsize<Foo<..., U, ...>> if:
- T: Unsize
- Foo is a union
- At most one variant of Foo has a type involving T
- T is not part of the type of any other variant
- Bar<T>: Unsize<Bar>, if the variant of Foo involving T has type Bar<T>
- Determining the size_of_val/align_of_val of &Bar does not require accessing the data pointer of &Bar

The last point being the difference to struct. This property holds for traits and slices and custom DSTs could opt into this property by implementing some unsafe trait.

Note that we would allow

union U<T: ?Sized> {
    t: ManuallyDrop<T>,
    u: (),
}

to compile even though the behavior would be undefined if you could call size_of_val on a &U<ThinDst>. However, there would be no safe way to construct such a U because U<T> does not implement Unsize<U<ThinDst>>.

RalfJung commented 3 years ago

So basically, "fat DST" would work in unions but thin DST would not, by controlling which unsizing coercions the compiler permits?

Yeah, something like this could work. There might be semver concerns here since now it matters whether an unsized type parameter is used below a union or not -- but maybe that is acceptable.

rust-lang / rfcs

Unions with unsized variants #3041