rust-lang / libs-team


Add `get_unchecked` to `Option` and `Result` #378

Closed alion02 closed 4 months ago

alion02 commented 5 months ago

Proposal

Problem statement

Prompted by this comment:

Similarly, you can already write:

unsafe { a.checked_mul(b).unwrap_unchecked() }

so it's strange to me that additional functions like unchecked_mul are being added to u32 for this specific operation.

I began searching for concrete examples of situations where using unwrap_unchecked leads to bad codegen. I didn't have to search long.

The problem with unwrap_unchecked is that it's the programming equivalent of littering - with every unwrap we're sprinkling in a condition for LLVM to keep around, even though frequently what we really mean is "I know this is Some/Ok/Err, please give me the contents without checking the variant." In a perfect world, these conditions would just get ignored when irrelevant and cleanly DCE'd, but...
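To make the "littering" concrete, here is a minimal sketch of the shape this proposal attributes to unwrap_unchecked: the None arm is declared unreachable, which is what hands the optimizer a condition to keep track of. The function names are hypothetical, and the standard library does not promise this exact implementation.

```rust
use core::hint::unreachable_unchecked;

/// Hypothetical hand-written stand-in for `Option::unwrap_unchecked`.
///
/// # Safety
/// The caller must guarantee that `opt` is `Some`.
pub unsafe fn unwrap_assuming_some<T>(opt: Option<T>) -> T {
    match opt {
        Some(v) => v,
        // Declaring this arm unreachable tells the optimizer that `opt`
        // is `Some`; that assumption is the "condition" discussed above.
        None => unsafe { unreachable_unchecked() },
    }
}

/// The pattern from the quoted comment: a checked multiply whose
/// overflow case the caller promises can never happen.
///
/// # Safety
/// `a * b` must not overflow `u32`.
pub unsafe fn mul_assuming_no_overflow(a: u32, b: u32) -> u32 {
    unsafe { unwrap_assuming_some(a.checked_mul(b)) }
}
```

Calling either function when the promise does not hold is undefined behavior, exactly as with unwrap_unchecked itself.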

Motivating examples or use cases

type NZ = core::num::NonZeroU32;
type U = u32;

#[no_mangle]
unsafe fn sum_unwrap(a: &[Option<NZ>; 16]) -> U {
    a.iter().map(|&v| v.unwrap_unchecked().get()).sum()
}

godbolt

This innocuous snippet leads to staggeringly bad assembly - 80+ lines of LLVM IR and 30+ ARM instructions... to add together 16 numbers. For reference, a non-vectorized implementation would be 15 adds, 8 pair-loads, and a return. Autovectorized, it's just 8 instructions.

Solution sketch

Maybe we should just stop littering.

We can restore good codegen if we express the unwrap in a different way, without invoking unreachable_unchecked; for example, like this:

trait OptionExt {
    type T;
    unsafe fn get_unchecked(self) -> Self::T;
}

impl<T> OptionExt for Option<T> {
    type T = T;

    #[allow(invalid_value)]
    unsafe fn get_unchecked(self) -> T {
        match self {
            Some(v) => v,
            None => core::mem::MaybeUninit::uninit().assume_init(),
        }
    }
}

godbolt

Add the above method (and analogous methods on Result) to core.
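For illustration, the analogous methods on Result might look like the sketch below. It simply mirrors the Option version above; the trait name ResultExt and the method name get_err_unchecked are assumptions made for this example, not part of the proposal text or of core.

```rust
use core::mem::MaybeUninit;

/// Hypothetical extension trait; all names here are illustrative only.
pub trait ResultExt {
    type T;
    type E;

    /// # Safety
    /// `self` must be `Ok`.
    unsafe fn get_unchecked(self) -> Self::T;

    /// # Safety
    /// `self` must be `Err`.
    unsafe fn get_err_unchecked(self) -> Self::E;
}

impl<T, E> ResultExt for Result<T, E> {
    type T = T;
    type E = E;

    #[allow(invalid_value)]
    unsafe fn get_unchecked(self) -> T {
        match self {
            Ok(v) => v,
            // Same trick as the Option version: yield an uninitialized
            // value instead of calling `unreachable_unchecked`.
            Err(_) => unsafe { MaybeUninit::uninit().assume_init() },
        }
    }

    #[allow(invalid_value)]
    unsafe fn get_err_unchecked(self) -> E {
        match self {
            Err(e) => e,
            Ok(_) => unsafe { MaybeUninit::uninit().assume_init() },
        }
    }
}
```

As with the Option version, taking the wrong arm is undefined behavior, so these methods are only sound when the caller already knows the variant.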

Alternatives

Change implementation of unwrap_unchecked

Idly browsing through uses of unwrap_unchecked, I notice that a significant portion (perhaps even a majority!) of them probably don't care to keep their conditions around. Worth investigating with benchmarks.

Not convinced it's relevant, but clang does not generate an assume for an std::optional dereference. godbolt

Additionally, the "unchecked" wording sort of implies a lack of checks, which is... well, ostensibly true...

Change current implementation and add new methods

Assuming get_unchecked is on average better than unwrap_unchecked, we might want to replace the functionality for current code and also keep providing the previous functionality for the cases where it is useful. Call it unwrap_assume or something.

Do nothing

This can be implemented in user code just fine, as an extension method. The problem is discoverability - if you're reaching for unwrap_unchecked, you probably care about performance, and with unwrap_unchecked being the only unchecked method on Option/Result you might not think to search further, or consider what the method does under the hood (and whether that's something you want to happen).

Improve LLVM

Presumably a long-term effort. I don't have the necessary knowledge to properly consider this alternative.

Noratrieb commented 5 months ago

this is just a duplicate of unwrap_unchecked, there's no value in having both

workingjubilee commented 5 months ago

Additionally, the "unchecked" wording sort of implies a lack of checks, which is... well, ostensibly true...

The documentation for Option::unwrap_unchecked says:

Returns the contained Some value, consuming the self value, without checking that the value is not None.

So we do not promise the implementation of Option::unwrap_unchecked is specifically None => unreachable_unchecked().

kennytm commented 5 months ago

You'll get the same optimized ASM using safe functions alone:

#[no_mangle]
fn sum_map_or(a: &[Option<NZ>; 16]) -> U {
    a.iter().map(|&v| v.map_or(0, NZ::get)).sum()
}

This works because Option<NonZeroU32> and u32 are layout-wise equivalent, so v.map_or(0, NZ::get) must be an identity function in the low-level representation. The special layout of NonZeroU32 is likely also why uninit().assume_init() works while unreachable_unchecked() does not, at least until LLVM learns to recognize this optimization opportunity.
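The layout claim can be checked directly. The sketch below compares the safe map_or spelling with a bit-for-bit reinterpretation; the function names are hypothetical, and the transmute relies on the layout guarantee described in this comment (None stored as zero, Some(n) stored as n).

```rust
use core::mem::{align_of, size_of, transmute};
use core::num::NonZeroU32;

/// Safe spelling: `None` becomes 0, `Some(n)` becomes `n.get()`.
pub fn to_u32_safe(v: Option<NonZeroU32>) -> u32 {
    v.map_or(0, NonZeroU32::get)
}

/// The same conversion written as a reinterpretation of the bits.
pub fn to_u32_transmute(v: Option<NonZeroU32>) -> u32 {
    // SAFETY: assumes `Option<NonZeroU32>` has the same layout as `u32`,
    // with `None` occupying the all-zero bit pattern.
    unsafe { transmute::<Option<NonZeroU32>, u32>(v) }
}

/// True when the two types agree on size and alignment.
pub fn layouts_match() -> bool {
    size_of::<Option<NonZeroU32>>() == size_of::<u32>()
        && align_of::<Option<NonZeroU32>>() == align_of::<u32>()
}
```

If both functions agree on every input, the safe version has no work left to do at the machine level, which is why it can compile down to a plain load.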

scottmcm commented 4 months ago

Have you filed an LLVM bug? Seems worth seeing what they say, even if it's just "yes, this is why assume operand bundles are better than assume calls with values".

This innocuous snippet leads to staggeringly bad assembly - 80+ lines of LLVM IR

Most of which is the assumes, which are not emitted into the machine code. Counting those is disingenuous.

As Nils said, the proposed method is just unwrap_unchecked, and thus isn't worth adding. (Notably, returning undef in LLVM from a function marked noundef on the return is just another way of spelling unreachable now that https://github.com/llvm/llvm-project/issues/60717 got fixed.) Yes, the icmp-assumes aren't great, but range metadata on parameters in LLVM is the way forward for that, to stop needing the assumes. (If you remove them entirely you'll quickly find things like .get() > 0 no longer optimizing, see https://github.com/rust-lang/rust/issues/49572 for some history there.)

That said, I do think there's a place for something here. NonZero::new is the safe code way to do the guaranteed-legal-transmute from u32 to Option<NonZero<u32>>. It would make sense to me to have a safe method for Option<NonZero<u32>> -> u32 as well.

Then you wouldn't need unsafe code for this at all, since

pub fn sum_unwrap(a: &[Option<NZ>; 16]) -> U {
    a.iter().map(|&v| v.unwrap_or_zero()).sum()
}

would give exactly (https://godbolt.org/z/e5axT9bK8) what you wanted.
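Until such a method exists in the standard library, the same call shape can be approximated in user code. This is a minimal sketch assuming a hypothetical extension trait named UnwrapOrZero, implemented with the safe map_or form from earlier in the thread; it is not an existing core API.

```rust
use core::num::NonZeroU32;

/// Hypothetical user-side extension trait supplying `unwrap_or_zero`.
pub trait UnwrapOrZero {
    fn unwrap_or_zero(self) -> u32;
}

impl UnwrapOrZero for Option<NonZeroU32> {
    #[inline]
    fn unwrap_or_zero(self) -> u32 {
        // `None` maps to 0, `Some(n)` maps to `n.get()`.
        self.map_or(0, NonZeroU32::get)
    }
}

/// The summation from the example above, with no unsafe code.
pub fn sum_unwrap(a: &[Option<NonZeroU32>; 16]) -> u32 {
    a.iter().map(|&v| v.unwrap_or_zero()).sum()
}
```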

pitaj commented 4 months ago

v.unwrap_or_zero()

That's just unwrap_or_default right?

scottmcm commented 4 months ago

That's just unwrap_or_default right?

No, as in @kennytm's snippet above it's v.map_or(0, NZ::get). Just unwrap would give the NonZero, which isn't Default.

So yes, there's a safe way, but it's a less-direct way. I like it when we have a clear "this is just the transmute you're about to write in unsafe code" method for things where the transmute would be safe given stable guarantees, because it's easy to link from next to the documentation of the layout guarantee and avoids all the "oh, I didn't think of that" or "but that's so much slower in debug mode" or whatever objections.
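The distinction can be seen in a small sketch (function names are hypothetical). Calling unwrap_or_default directly on Option<NonZeroU32> does not type-check because NonZeroU32 has no Default impl, so the value has to be mapped to u32 first.

```rust
use core::num::NonZeroU32;

/// The form from kennytm's snippet.
pub fn via_map_or(v: Option<NonZeroU32>) -> u32 {
    v.map_or(0, NonZeroU32::get)
}

/// An `unwrap_or_default` spelling that does compile.
pub fn via_unwrap_or_default(v: Option<NonZeroU32>) -> u32 {
    // `v.unwrap_or_default()` on its own is rejected by the compiler:
    // `NonZeroU32` does not implement `Default`. Mapping to `u32` first
    // yields an `Option<u32>`, whose default is 0.
    v.map(NonZeroU32::get).unwrap_or_default()
}
```

Both functions return the same value for every input; the second just takes one more step to get there.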

dtolnay commented 4 months ago

Thank you for the ACP, discussion, godbolt links, and draft PRs for benchmarking!

We discussed this ACP in this week's standard library API team meeting. Those present were not convinced that having 2 unsafe Option<T>->T conversions, one using intrinsics::unreachable() and the other MaybeUninit::uninit().assume_init() or something else, was worth having at this time.

But instead we are open to whatever improvements can be made to LLVM, rustc, or the standard library to make the existing unwrap_unchecked lead to better code on average.

We didn't get a chance to evaluate Option<NonZero<_>>::unwrap_or_zero (https://github.com/rust-lang/libs-team/issues/378#issuecomment-2098921092, https://github.com/rust-lang/libs-team/issues/378#issuecomment-2099002444) as a team, but that would be a separate ACP. My personal opinion is that option.map_or(0, NonZero::get) is good enough for this, but it wouldn't surprise me if others on the team feel differently.