rust-lang / rfcs

RFCs for changes to Rust
https://rust-lang.github.io/rfcs/
Apache License 2.0

More Exotic Enum Layout Optimizations #1230

Closed Gankra closed 6 years ago

Gankra commented 9 years ago

There are several things we currently don't do that we could. It's not clear that these would have practical effects for any real program (or that they wouldn't negatively affect real programs by making LLVM super confused), so all I can say is that these would be super cool.

Use Undefined Values in Other Primitives

(&u8, &u8) can be the same size as Option<Option<(&u8, &u8)>>
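A quick way to see where a given toolchain stands is to print the sizes involved. This is only a sketch: the helper function names are made up for illustration, and Rust does not guarantee the layout of the nested case, so it is printed rather than assumed.

```rust
use std::mem::size_of;

// Hypothetical helpers, named only for this illustration.
fn pair_size() -> usize {
    size_of::<(&u8, &u8)>()
}

fn option_pair_size() -> usize {
    size_of::<Option<(&u8, &u8)>>()
}

fn nested_option_pair_size() -> usize {
    size_of::<Option<Option<(&u8, &u8)>>>()
}

fn main() {
    // One Option layer can reuse the null value of a reference as `None`.
    println!("(&u8, &u8):                 {} bytes", pair_size());
    println!("Option<(&u8, &u8)>:         {} bytes", option_pair_size());
    // Whether the second layer also fits without growing depends on the compiler.
    println!("Option<Option<(&u8, &u8)>>: {} bytes", nested_option_pair_size());
}
```

If the last line prints the same number as the first, the compiler is using the null value of the second reference as well.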

Support More than a Single Bit

(&u8, u64) supports 2^64 variants via the encoding (0, x)

Use the Fact that Void is Statically Unreachable

enum Void { } cannot be instantiated, so any variant that contains Void is statically unreachable, and can therefore be removed. So Option<Void> can only be None, and can therefore be zero-sized.
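As a minimal sketch of what this looks like in practice (assuming a reasonably recent compiler; the function names are hypothetical), the sizes can be inspected directly, and a match on Option<Void> needs no real code for the Some arm:

```rust
use std::mem::size_of;

// An enum with no variants: no value of this type can ever exist.
enum Void {}

fn option_void_size() -> usize {
    size_of::<Option<Void>>()
}

fn result_void_size() -> usize {
    size_of::<Result<u32, Void>>()
}

// The `Some` arm is statically unreachable, so an empty match on the
// payload is enough to satisfy the type checker.
fn describe(x: Option<Void>) -> &'static str {
    match x {
        None => "none",
        Some(v) => match v {},
    }
}

fn main() {
    println!("Option<Void>:      {} bytes", option_void_size());
    println!("Result<u32, Void>: {} bytes", result_void_size());
    println!("{}", describe(None));
}
```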

Support Enums that have More than One Non-C-Like Variant

enum Foo {
  A(&u8, &u8),
  B(&u8),
}

Can be represented without any tag as (&u8, *const u8) via the following packing:

Option<Option<u32>> can be "only" 8 bytes if the second Option stores its discriminant in the first Option's u32 tag.
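A small sketch for checking this: the inner Option's tag only ever holds 0 or 1, so the outer Option can pick an unused tag value for its own None. The function names are illustrative, and the exact sizes are a property of the compiler rather than a language guarantee.

```rust
use std::mem::size_of;

fn single_size() -> usize {
    size_of::<Option<u32>>()
}

fn nested_size() -> usize {
    size_of::<Option<Option<u32>>>()
}

fn nested_bool_size() -> usize {
    // bool only uses the byte values 0 and 1, leaving room for both `None`s.
    size_of::<Option<Option<bool>>>()
}

fn main() {
    println!("Option<u32>:          {} bytes", single_size());
    println!("Option<Option<u32>>:  {} bytes", nested_size());
    println!("Option<Option<bool>>: {} bytes", nested_bool_size());
}
```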

ahicks92 commented 7 years ago

Yeah, you're right. My bad. I still think my point stands, though.

eddyb commented 6 years ago

So in https://github.com/rust-lang/rust/pull/45225#issuecomment-336474332 I mentioned some hard cases, but @trentj on IRC brought it up again and I've realized there's a simpler way to look at things, that is not type-based, as

enum E {
    A(ALeft..., Niche, ARight...),
    B(B...),
    C(C...)
}

can be seen as a generalization of the case where sizeof B and sizeof C are 0.

The challenge is to fit B and C in ALeft and/or ARight, which doesn't have to be too clever if Niche has either the smallest (e.g. bool, Option<ZST>, etc.) or largest (e.g. &T, Option<usize>, etc.) alignment in E, because we can hint the field reordering to place it at one end of the variant, giving B and C only one run of bytes to fit, instead of two.
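To make the shape concrete, here is a hypothetical instance with Niche = bool and one-byte payloads in B and C. It is a sketch for experimenting, not a statement about what any particular compiler version does: with a separate tag the enum needs three bytes, and with B and C packed next to the bool's unused values it needs two.

```rust
use std::mem::size_of;

// Hypothetical instance of the shape above: `bool` is the niche field,
// and the `u8` payloads of B and C could share the byte next to it.
#[allow(dead_code)]
enum E {
    A(bool, u8),
    B(u8),
    C(u8),
}

fn e_size() -> usize {
    size_of::<E>()
}

// Every variant carries a `u8`, so one arm can read it regardless of layout.
fn payload(e: &E) -> u8 {
    match e {
        E::A(_, x) | E::B(x) | E::C(x) => *x,
    }
}

fn main() {
    // 2 means the niche was used for B and C; 3 means a separate tag byte.
    println!("size_of::<E>() = {}", e_size());
    println!("payload of B(7) = {}", payload(&E::B(7)));
}
```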

EDIT: now over at https://github.com/rust-lang/rust/issues/46213.

fstirlitz commented 6 years ago

One possible optimisation that hasn't been mentioned yet: NaN tagging. It's impossible to implement with current Rust types, however; floating-point types treat all possible NaN bit patterns as valid. Doing this would require adding NonZero-like wrapper types or some equivalent mechanism to restrict valid representations of floating-point values.

I've actually got something written up on this topic, I might submit it as an RFC soon...
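For readers unfamiliar with the technique, this is roughly what NaN tagging looks like when done by hand, without any compiler support. It is a minimal sketch: the names Boxed and Unpacked, the choice of tag bit, and the decision to canonicalise incoming NaNs are all assumptions made for the illustration.

```rust
// Quiet NaN: exponent bits all ones plus the top mantissa bit.
const QNAN: u64 = 0x7ff8_0000_0000_0000;
// Hypothetical tag bit inside the NaN payload, marking "this is an integer".
const TAG_INT: u64 = 0x0001_0000_0000_0000;

#[derive(Clone, Copy)]
struct Boxed(u64);

enum Unpacked {
    Float(f64),
    Int(u32),
}

impl Boxed {
    fn from_f64(x: f64) -> Self {
        // Collapse every NaN to one canonical pattern so that a real float
        // can never be mistaken for a tagged payload.
        if x.is_nan() {
            Boxed(QNAN)
        } else {
            Boxed(x.to_bits())
        }
    }

    fn from_int(i: u32) -> Self {
        Boxed(QNAN | TAG_INT | u64::from(i))
    }

    fn unpack(self) -> Unpacked {
        if self.0 & (QNAN | TAG_INT) == (QNAN | TAG_INT) {
            // The low 32 bits hold the integer; truncation is intended.
            Unpacked::Int(self.0 as u32)
        } else {
            Unpacked::Float(f64::from_bits(self.0))
        }
    }
}

fn main() {
    let values = [Boxed::from_f64(1.5), Boxed::from_int(42), Boxed::from_f64(f64::NAN)];
    for v in values.iter() {
        match v.unpack() {
            Unpacked::Float(x) => println!("float {}", x),
            Unpacked::Int(i) => println!("int {}", i),
        }
    }
}
```

The point of the proposal in this comment is that the compiler could do the equivalent packing for an ordinary enum if it knew which NaN bit patterns a float type never uses.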

eddyb commented 6 years ago

@fstirlitz With const generics we could have a generalization of NonZero and use it to implement a type like NonNaN, by removing a specific range of values for f32 and another for f64.

fstirlitz commented 6 years ago

@eddyb Will it be possible to use const generics to disallow NaNs with certain payloads, but not others? You'd have to have to_bits() be const fn, which is currently impossible, because it's implemented with transmute. (Use case for this: an interpreter for a dynamically-typed language which supports NaNs, but doesn't expose payloads.)

And even if sufficiently expressive const generics do arrive, it will be beneficial to have a canonical form of this feature in the standard library.

eddyb commented 6 years ago

I'm not talking about CTFE to compute the layout, but a wrapper type with two integer parameters expressing a range of values that the inner type can't use but optimizations can.

fstirlitz commented 6 years ago

You mean, like...

// possibly wrapped in an opaque struct;
// could probably be generic over float types
// by means of associated consts and types,
// but using f64 for readability
enum NaNaN64 {
    Positive(IntInRange<u64, 0x0000000000000000, 0x7ff0000000000000>),
    Negative(IntInRange<u64, 0x8000000000000000, 0xfff0000000000000>),
}

impl From<NaNaN64> for f64 { /* f64::from_bits */ }
/* other impls */

That's... not especially convenient, is it?

Plus, without compiler support it won't be able to additionally take advantage of LLVM's fast-math annotations.

eddyb commented 6 years ago

More like:

struct NaNaN64 {
    float: WithInvalidRange<WithInvalidRange<f64,
        0x7ff0000000000000, 0x7fffffffffffffff>,
        0xfff0000000000000, 0xffffffffffffffff>
}

But internally we can't represent the two ranges anyway for now, so in the near future you'd have to choose only one of them for NaN-boxing. As for fast-math... It could work if LLVM actually understood range metadata as "can't possibly be a NaN" and optimized based on that.

It's much more straightforward when everything is offsets and bitpatterns and ranges than "types".
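The "range of bit patterns" view can be approximated today with const generics and a runtime check. The sketch below is only that: F64OutsideRange and NoPositiveNan are hypothetical names, the bounds are treated as inclusive, the range is chosen to start just above +infinity so that infinity stays valid, and the wrapper gives the compiler no layout information at all.

```rust
// Hypothetical wrapper: an f64 whose bit pattern lies outside LO..=HI.
// Unlike the compiler-supported type discussed above, this is checked at
// run time and does not expose a niche to enum layout.
#[derive(Clone, Copy, Debug, PartialEq)]
struct F64OutsideRange<const LO: u64, const HI: u64>(f64);

impl<const LO: u64, const HI: u64> F64OutsideRange<LO, HI> {
    fn new(x: f64) -> Option<Self> {
        let bits = x.to_bits();
        if (LO..=HI).contains(&bits) {
            None
        } else {
            Some(Self(x))
        }
    }

    fn get(self) -> f64 {
        self.0
    }
}

// Only the positive NaN range is excluded, mirroring the "choose only one
// of them" limitation mentioned above.
type NoPositiveNan = F64OutsideRange<0x7ff0_0000_0000_0001, 0x7fff_ffff_ffff_ffff>;

fn main() {
    let quiet_nan = f64::from_bits(0x7ff8_0000_0000_0000);
    println!("{:?}", NoPositiveNan::new(1.0).map(|v| v.get()));
    println!("{:?}", NoPositiveNan::new(f64::INFINITY).map(|v| v.get()));
    println!("{:?}", NoPositiveNan::new(quiet_nan).map(|v| v.get()));
}
```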

Gankra commented 6 years ago

I'm pretty sure that a safe "forbidden NaN" float type would undermine its own benefits with NaN-masking-cmov's (or worse) everywhere.

Yoric commented 6 years ago

Note: If someone is willing to mentor me on this bug, I'm interested in tackling it.

eddyb commented 6 years ago

@Gankro How are we going to track that most of the optimizations mentioned in this issue have been implemented, and the various tricks required to make any further changes?

Gankra commented 6 years ago

@eddyb are any of the optimizations in the OP not implemented? It looks like at least most are, and if so we might want to turn this into a metabug that points at sub-issues for the remainder, or just close it.

fstirlitz commented 6 years ago

@Gankro: re NaNs, you mean checking after arithmetic operations whether the result was NaN? Maybe. On the other hand, such types could at least implement Ord and Eq; and a 'designated canonical NaN' type might even elide most of these checks (if the canonical NaN is chosen wisely), thanks to NaN propagation semantics of IEEE 754.
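Both sides of this exchange can be seen in a small library-level sketch. NotNan and checked_add are hypothetical names, and this is not a proposal for the standard library: the wrapper gets Eq and Ord for free, but every arithmetic result has to be re-checked, which is the cost raised earlier in the thread.

```rust
use std::cmp::Ordering;

// Hypothetical wrapper that never holds a NaN.
#[derive(Clone, Copy, Debug, PartialEq)]
struct NotNan(f64);

impl NotNan {
    fn new(x: f64) -> Option<Self> {
        if x.is_nan() {
            None
        } else {
            Some(NotNan(x))
        }
    }

    // Arithmetic can produce NaN (e.g. inf + -inf), so the result is re-checked.
    fn checked_add(self, other: NotNan) -> Option<NotNan> {
        NotNan::new(self.0 + other.0)
    }
}

// Without NaN, equality is reflexive, so `Eq` is sound.
impl Eq for NotNan {}

impl Ord for NotNan {
    fn cmp(&self, other: &Self) -> Ordering {
        // Cannot fail: neither side is NaN by construction.
        self.0.partial_cmp(&other.0).expect("NotNan never holds NaN")
    }
}

impl PartialOrd for NotNan {
    fn partial_cmp(&self, other: &Self) -> Option<Ordering> {
        Some(self.cmp(other))
    }
}

fn main() {
    let mut v: Vec<NotNan> = [3.0, -1.0, 2.5]
        .iter()
        .filter_map(|&x| NotNan::new(x))
        .collect();
    v.sort(); // requires Ord, which plain f64 does not implement
    println!("{:?}", v);
}
```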

SimonSapin commented 6 years ago

@eddyb Could Cow<str> be represented on three words? Cow::Owned(String) would be String (with a non-zero pointer, assuming it is first) and Cow::Borrowed(&str) would be (0, &str). I half expected https://github.com/rust-lang/rust/pull/45225 to do this but it doesn’t.
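Whether a given toolchain already does this is easy to check by printing sizes in machine words. A sketch, with illustrative function names; the result for Cow<str> is deliberately not assumed, since it depends on the compiler and standard library version.

```rust
use std::borrow::Cow;
use std::mem::size_of;

fn word() -> usize {
    size_of::<usize>()
}

fn string_size() -> usize {
    size_of::<String>()
}

fn cow_str_size() -> usize {
    size_of::<Cow<'static, str>>()
}

fn main() {
    println!("&str:     {} words", size_of::<&str>() / word());
    println!("String:   {} words", string_size() / word());
    // 3 words means the tag was folded into the String/&str fields;
    // 4 words means a separate tag word is still present.
    println!("Cow<str>: {} words", cow_str_size() / word());
}
```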

eddyb commented 6 years ago

Yeah, I only figured out how to later. Also, it requires rustc_trans to generate LLVM constants from miri allocations, for us to be able to combine the niche with additional data.

Kixunil commented 6 years ago

@SimonSapin Another alternative: since capacity >= len, &str can be represented exactly as a String with capacity == 0. The downside is that we can't distinguish between a zero-sized &str and a zero-sized String. The great thing is that for all practical purposes we don't have to. My suggestion also keeps the pointer non-zero, so even Option<MyCow> has the same layout.

I've actually started writing such a crate for fun.

The same holds for Vec.
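A hand-written version of this idea might look like the sketch below. It is not the crate mentioned in the comment: CompactCow and its methods are hypothetical, and the only invariant it relies on is that a capacity of 0 means there is nothing to free, which covers both borrowed strings and empty, unallocated Strings.

```rust
use std::marker::PhantomData;
use std::mem::ManuallyDrop;
use std::ptr::NonNull;

// Hypothetical three-word Cow<str> replacement: cap == 0 means "borrowed"
// (or an empty String that never allocated), so no separate tag is needed.
struct CompactCow<'a> {
    ptr: NonNull<u8>,
    len: usize,
    cap: usize,
    _borrow: PhantomData<&'a str>,
}

impl<'a> CompactCow<'a> {
    fn borrowed(s: &'a str) -> Self {
        // The pointer is only ever read through, never written.
        let ptr = NonNull::new(s.as_ptr() as *mut u8).expect("str pointers are never null");
        CompactCow { ptr, len: s.len(), cap: 0, _borrow: PhantomData }
    }

    fn owned(s: String) -> Self {
        // Take over the allocation instead of letting the String free it.
        let mut bytes = ManuallyDrop::new(s.into_bytes());
        let (len, cap) = (bytes.len(), bytes.capacity());
        let ptr = NonNull::new(bytes.as_mut_ptr()).expect("Vec pointers are never null");
        CompactCow { ptr, len, cap, _borrow: PhantomData }
    }

    fn is_owned(&self) -> bool {
        self.cap != 0
    }

    fn as_str(&self) -> &str {
        // SAFETY: ptr and len came from a valid str or String and are never changed.
        unsafe {
            std::str::from_utf8_unchecked(std::slice::from_raw_parts(self.ptr.as_ptr(), self.len))
        }
    }
}

impl Drop for CompactCow<'_> {
    fn drop(&mut self) {
        if self.cap != 0 {
            // SAFETY: cap != 0 only for values built by `owned` from a live allocation.
            unsafe { drop(Vec::from_raw_parts(self.ptr.as_ptr(), self.len, self.cap)) }
        }
    }
}

fn main() {
    let a = CompactCow::borrowed("hi");
    let b = CompactCow::owned(String::from("hello"));
    println!("{} (owned: {})", a.as_str(), a.is_owned());
    println!("{} (owned: {})", b.as_str(), b.is_owned());
    println!("{} bytes", std::mem::size_of::<Option<CompactCow<'static>>>());
}
```

Because the pointer is a NonNull, Option<CompactCow> can use the null value for None and stays at three words, which is the property the comment points out.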

SimonSapin commented 6 years ago

But the compiler doesn’t know about capacity >= len. It does already know that NonZero<*mut u8> inside String is non-null, though.

Kixunil commented 6 years ago

Yes. The compiler will probably never be able to do the optimization using capacity >= len because of the weird empty case, which we know doesn't matter but explaining that to the compiler would be a nightmare.

eddyb commented 6 years ago

Since a part of the original desired optimizations have been implemented in rust-lang/rust#45225, and most of the remaining ones have various trade-offs that need to be explored by RFC for each category separately (with the exception of https://github.com/rust-lang/rust/issues/46213), I'm going to close this central issue.