Open nikomatsakis opened 9 years ago
FWIW, looking through my own code, I actually think the fact that this can't apply to match variants is going to be fairly hard on idiomatic Rust (for example, it means you can't use it when matching on an `Option`, which is going to be extremely common since it amounts to `if (likely(ptr != NULL))` in C).
@pythonesque once I get this done, I plan on writing a follow-up RFC for a `#[likely]` attribute you can put on match arms. Since it's already possible to put attributes on arms (to support conditional compilation of them), it should be fairly easy to implement.
It might be worth having an "unpredictable" option in addition to likely and unlikely. There is now something to lower it to in LLVM: http://reviews.llvm.org/D12341
@Aatch Are you still working on this?
Should `#[likely]` also be allowed on `Result`?

```rust
enum Result<T, E> {
    #[likely]
    Ok(T),
    Err(E),
}
```

I expect for most code `Ok` is likely and `Err` is unlikely. Of course, annotating an `if` or `match` should override the enum variant's annotation.
@vi I think hinting should only be used rarely, so having it on the type doesn't seem like a good idea. `Result` is a bit more general purpose than just error-handling, so that kind of hinting would impact code where it's inappropriate. Hinting should only really be done at the site itself.
@Aatch, Is it a good idea to have `likely` in every place where something is likely? Can it also serve as self-documentation: follow the "likelies" to see the happy path?
@vi it is not. As a general rule, people are pretty bad at knowing which cases are actually likely or not. Branch hinting is a fairly specific optimisation hint; used incorrectly, it can actually make code slower. It should also only really be used when the unlikely case is known to be very unlikely.
Basically, you should only use it if it makes your code faster; otherwise it shouldn't be there.
@vi It might be a good idea if the likely thing is cheap, and the unlikely thing is expensive. For example, `Vec::push` is expensive if it causes a reallocation, and very cheap otherwise.
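A minimal sketch of that cheap/expensive split on stable Rust (the `SmallBuf` type and its growth policy are invented for illustration, not taken from the thread): the common path is a single length check, while the rare reallocation path is outlined into a function marked `#[cold]`, which is the stable way to convey the same hint today.

```rust
// Hypothetical example: a growable byte buffer where pushes are cheap
// unless the buffer must grow.
struct SmallBuf {
    data: Vec<u8>,
    cap: usize,
}

impl SmallBuf {
    fn new() -> Self {
        SmallBuf { data: Vec::with_capacity(4), cap: 4 }
    }

    // Common case: there is room, so this is just a bounds check + store.
    fn push(&mut self, byte: u8) {
        if self.data.len() == self.cap {
            self.grow(); // rare, expensive case
        }
        self.data.push(byte);
    }

    // `#[cold]` (stable) tells the optimizer this is rarely executed,
    // so the fast path in `push` can stay compact.
    #[cold]
    fn grow(&mut self) {
        self.cap *= 2;
        self.data.reserve(self.cap - self.data.len());
    }
}
```

This mirrors how `Vec::push` amortizes its cost: most calls take the cheap branch, and only the occasional call pays for reallocation.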
In my opinion, frequent use of `likely` is either unnecessary or just manual profile-guided optimization.
Instead of arguing about it, why not just test it? Annotate `Result`, and then see if rustc bootstraps significantly faster. This won't give a definitive answer but will help determine if it's worth considering.
Now that MIR is a thing, it would be nice if this were implemented.
@comex I do agree =)
@Aatch Not sure what your time is like, but if you don't have time to implement this, it seems like a case where we might be able to mentor someone. The idea would be to leave a comment explaining roughly how you think it should work, and then tag the issue with E-mentor and E-help-wanted. I of course can do this too and will try to remember to do so. =)
Unassigning @Aatch since, empirically, he doesn't seem to be actively hacking on this right now. =) (Feel free to correct me.)
@nikomatsakis PR up, though I could use some help adding tests.
Copying @nagisa's comment from #36181:
llvm told me the expect intrinsic is more harmful than it is helpful and that branch weight metadata should be emitted instead. Might be worth going back and reconsidering the RFC, since the branch weight metadata is more powerful and expressive and would be better expressed as attributes in source too.
why is clang > 3.6 not emitting the expect intrinsic for __builtin_expect? nagisa: it turns out that the expect intrinsic overall harmed the optimizer by inhibiting certain optimizations, so we now directly lower __builtin_expect to branch probabilities in the frontend when generating IR
I think that keeping the intrinsics is a good idea. They should be fairly obvious for people coming from C/C++. Attributes aren't mutually exclusive with the intrinsics, so we're not exactly cutting ourselves off with intrinsics.
Supporting branch weights at the MIR level would probably be the easiest implementation for lowering to the metadata ourselves.
Just adding a note here as I'm processing everything in #38643: it will need docs before stabilization. @nikomatsakis would you mind adding a note to that effect in the root of the issue?
@chriskrycho done
Will these intrinsics be stabilized as they are? Under which path?
Every few months, because I've forgotten the results, I try these out in httparse, on branches I'm absolutely certain are always taken, and see no gain, with a wild range in variance when running the benchmarks.
I'm not adept at all with either assembly or llvm-ir when compiling with optimizations to tell how they are implemented.
There's older comments that the expect intrinsic in LLVM is harmful, that it should be done with branch weights instead. But when I go diving in LLVMs source now, it appears there is a pass that already lowers expects into branch weights anyways?
The reason they're harmful is because they're still very opaque calls all the way till they're lowered, inhibiting all the early optimisations which are not specifically handling these intrinsics. Just don't use them.
An earlier comment says (if I'm reading it right) that clang deals with the corresponding C intrinsic by generating branch probabilities in the IR instead of IR intrinsic.
Are there technical reasons against doing the same in rustc?
No reasons against generating branch probabilities in rustc, however it is not as trivial to implement it that way (i.e. go from a function call to probabilities in IR)
One could as well just go ahead and add the attributes that are more general and easier to implement.
I have to say, I am personally inclined to just remove this intrinsic. The results seem to be uneven at best and I feel no strong urgency to jump on this.
The results seem to be uneven at best
You mean profile-guided optimization is a hoax?
According to the Intel optimization manual, the order in which branches are laid out in assembly does matter (likely code should come first, unlikely comes after).
However, the question is if LLVM isn't already smart enough to reorder stuff by itself. These intrinsics were added a long time ago, when compilers were bad at predicting stuff.
From what I've seen, if you use an `unwrap` or check for an error code (e.g. right before `return Err`), the compiler is smart enough to optimize the code and detect the more likely branch.
Even the Linux kernel has an option to allow the compiler to un-inline certain functions because newer versions of GCC are much better at inlining. The same thing could happen with `likely`/`unlikely`: if the compiler is smart enough, there is no need for the intrinsic.
TL;DR: these intrinsics are useful for people who love premature optimization, believing they can outsmart the compiler.
As a workaround, you can simply write your code such that the generated assembly is laid out as you want.
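The usual form of that workaround on stable Rust (the `process`/`handle_error` pair below is hypothetical, invented for this sketch): outline the unlikely path into a function marked `#[cold]` and `#[inline(never)]`, both of which are stable, so the optimizer keeps it out of the hot instruction stream.

```rust
// Hypothetical error path, outlined so the hot path stays small.
// `#[cold]` marks it rarely-called; `#[inline(never)]` keeps the
// formatting code from being inlined back into the caller.
#[cold]
#[inline(never)]
fn handle_error(code: u32) -> String {
    format!("unexpected status: {}", code)
}

fn process(status: u32) -> Result<(), String> {
    if status == 0 {
        // Hot path: no call, no string formatting.
        Ok(())
    } else {
        Err(handle_error(status))
    }
}
```

The effect is similar to a `likely` hint on the `status == 0` branch, expressed through function-level attributes instead of an intrinsic.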
Or use PGO. A quote from the GCC manual:
In general, you should prefer to use actual profile feedback for this (-fprofile-arcs), as programmers are notoriously bad at predicting how their programs actually perform.
TL;DR: these intrinsics are useful for people who love premature optimization, believing they can outsmart the compiler.
Annotating the code with likely/unlikely can also help readability. What is marked likely for the compiler is also interpreted as likely by the reader, who can follow the "main path" in the code more easily.
@pythonesque once I get this done, I plan on writing a follow-up RFC for a #[likely] attribute you can put on match arms. Since it's already possible to put attributes on arms (to support conditional compilation of them), it should be fairly easy to implement.
We could perhaps just allow `#[cold]` and/or `#[inline(never)]` on blocks/statements instead of just functions:

```rust
let x = if condition {
    ....
} else #[cold] {
    ....
};

let y = match condition2 {
    Some(x) => x,
    None => #[cold] 1,
};
```
Regarding likely/unlikely specifically, I think people are expecting that they'd only be used inside `if` expressions, but I don't see that restriction in the RFC.
Anyway, in lots of the code I write, regarding parsing and cryptography, we really do always want the "no problems" path to be fast. Not only do we generally not care about the performance of the unlikely "oops, that's not how it's supposed to be" path, but we in fact would like to have the compiler actively de-optimize that path, moving it far away from the fast path, to improve instruction cache locality.
Also, PGO sounds nice but for many applications, like most of mine, it would be a huge waste. It's hard enough to get good code coverage; it's nearly impossible to develop PGO-focused coverage that emphasizes the code paths for which one most cares about performance.
TL;DR: these intrinsics are useful for people who love premature optimization, believing they can outsmart the compiler.
Possibly. And possibly these don't generate the best LLVM IR for the purpose given how LLVM consumes IR today.
However, I think it's not productive to suggest that the compiler always outsmarts the programmer and, therefore, there's no need to provide detailed control as an option. I think the appropriate use for intrinsics like these is being able to persist manual PGO (iterating `cargo bench` with different branch prediction hints) in the source code without requiring the user of the crate to set up an appropriate benchmark for compile-time profiling.
Last week, I had a `match` with consecutive (both in source order and in numerical order of the condition) identical arms, and combining the identical match arms destroyed about 100 MB/s of throughput (from 9xx MB/s to 8xx MB/s), because the compiler put the checks in a different order. By rewriting the code as manually-ordered `if`s, I was able to do better than the original `match`.
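As an illustration of that kind of rewrite (the byte-classification code below is hypothetical, not the commenter's actual code), a `match` over ranges can be replaced by a hand-ordered `if` chain that tests the common case first, so the compiler cannot reorder the checks:

```rust
// Hypothetical classifier over input bytes; both functions compute the
// same result, but differ in how much control they give over check order.
fn classify_match(b: u8) -> u32 {
    match b {
        0x00..=0x7F => 1, // assume ASCII dominates the workload
        0x80..=0xBF => 2,
        _ => 3,
    }
}

fn classify_if(b: u8) -> u32 {
    // Manually ordered: the most likely test comes first, so the fast
    // path is a single compare on typical input.
    if b < 0x80 {
        1
    } else if b < 0xC0 {
        2
    } else {
        3
    }
}
```

With the `match`, the optimizer is free to merge or reorder the arm checks; the explicit `if` chain pins the evaluation order the author measured to be fastest.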
Even in the standard library, there are quite a few places where these kinds of improvements to optimization hints would be useful. We shouldn't need to factor out code into a separate function just to mark it unlikely/cold like in https://github.com/rust-lang/rust/commit/257bff3192e2c7313a4f8cfcac8839a573b42f6b#diff-5da4ecf5ad221f68bde0ae13f2a7c22dR700, for example.
Also, in my code, I generally want the compiler to optimize, to an extreme extent, `likely()` paths for speed and cache locality. Similarly, I want it to optimize, to an extreme extent, `unlikely()`/`#[cold]` paths for size and out-of-the-way cache locality. Also, like Henri said, sometimes predictable performance is very important, and sometimes the profile-guided optimization has been done by hand to ensure such predictable performance. Hints like `likely()`/`unlikely()`/`#[cold]` are the only way we have to transfer the knowledge from the profiling phase to the codegen phase in those situations.
Has anyone successfully used these intrinsics to achieve speedups?
I prefer the `#[cold]` formulation, myself.
I didn't mean to suggest that there is no role for user-hinting, but merely that this particular formulation didn't seem like a big success. The LLVM developers (from what I understand) don't recommend using these intrinsics, and they seem to be usable in confusing ways (e.g., outside of an `if`).
Has anyone successfully used these intrinsics to achieve speedups?
@nikomatsakis, in the way they're implemented the intrinsics aren't very useful, but this can be improved by "generating branch probabilities in the IR", like clang does.
Using likely and unlikely (`__builtin_expect`) we can achieve speedups in C (cf. https://stackoverflow.com/questions/109710/likely-unlikely-macros-in-the-linux-kernel-how-do-they-work-whats-their), so why shouldn't we be able to do this in Rust, given that the intrinsics are already there? Why not simply fix them to be useful?
I prefer the `#[cold]` formulation, myself.
Do you mean `#[cold]` on block statements?
On the subject of whether they're useful: I've used them myself (though I've not benchmarked enough to tell whether they made a difference). One thing I did notice was that `rustc` doesn't appear to be able to do constant propagation through them: I had a call to `unlikely(<expr>)` where `<expr>` could easily be constant-propagated to `false` at compile-time, but the corresponding `if` branch was still included in the compiled output.
Has anyone successfully used these intrinsics to achieve speedups?
On most of my attempts, no, but that might have been due to wrapping them in inlines, which according to another issue doesn't work.
Today, however, I finally found a case where I saw a notable boost. If I add `unlikely` to this `if`, I get a 33% performance improvement on Russian plain text. (In case anyone thinks that the function as a whole is an unnecessary optimization, it's notably faster than iterating over a `str` by `char` and calling `is_char_bidi()` for each one.)
Also: Why are `likely()` and `unlikely()` currently `unsafe`? As far as their non-performance semantics go, they just return their argument, which is safe, and modifying performance is safe, too.
On most of my attempts, no, but that might have been due to wrapping them in inlines, which according to another issue doesn't work. ... Also: Why are likely() and unlikely() currently unsafe? As far as their non-performance semantics go, they just return their argument, which is safe, and modifying performance is safe, too.
Not good reasons. I think it's because of the (imo) rather artificial distinction between "intrinsics" and "lang items" in the compiler. I've been meaning forever to write an RFC collapsing the distinction -- basically just having "something well known to the compiler". But for whatever reason we have a distinction now, and intrinsic functions are (a) always unsafe and (b) never exposed directly (hence the inlining).
Why not simply fix them to be useful?
My feeling was that since they are not working that well now (and are not in any particular use), and because the interface is Not Great, maybe we should rip it out and try something better.
The reasons that I think the interface is not great are roughly that a function doesn't seem like the optimal choice here:

- What does `let x = likely(...); if x { ... }` mean? Is that the same as `if likely(...) { ... }`?
- Tracking the fact that a `likely` occurred is kind of painful around MIR too
- etc
So, in short, I don't have a strong opinion, just looking for opportunities to cull underutilized and not especially successful language features and replace them with something better.
My feeling was that since they are not working that well now (and are not in any particular use), and because the interface is Not Great, maybe we should rip it out and try something better.
Good point, @nikomatsakis !
It might take some time though for a different branch hinting approach to come into the language.
I think that in Rust we have a strong precedent of having practical, but Not Great, solutions implemented in the language in unstable form. It allows us to have our cake and eat it too.
As of now Rust might be at some disadvantage compared to C/C++ with regard to branch hinting. There are existing algorithms and programs that have been optimized in C/C++ to use branch hinting efficiently, and currently it might be problematic to port them to Rust while maintaining the same level of performance, because we lack a branch hinting solution of a quality comparable to the GCC and Clang implementations.
Implementing the likely/unlikely intrinsic as an unstable feature, till we have a better one, might be good for language adoption. It might also help those members of the community who want to port their code to a safer language but are hindered by performance constraints.
I think that a lot of us flock to Rust because it offers performance. It's one of the trifecta on the https://www.rust-lang.org/ page. Personally, I started to believe in Rust as a performance language when I saw the likely/unlikely RFC merged. It pains me to see that Rust still isn't there.
Can I do something to convince you that we need something working until we have something better?
@ArtemGr You already have it as an unstable feature: `std::intrinsics::unlikely`. Using it and evaluating the current implementation is already possible. As you can see, the discussion here is already in relation to the existing implementation and its flaws.
You already have it as unstable feature: std::intrinsics::unlikely. Using it and evaluating the current implementation is already possible.
@bluss, it's not currently on par with the Clang/GCC implementations.
As you can see the discussion here is already in relation to the existing implementation and its flaws.
That's right. We're discussing whether to remove or to improve this feature.
From P0479r2: Attributes for Likely and Unlikely Statements (Revision 2):

- `likely`/`unlikely` are particularly widely used in systems programming (the Linux kernel, and web browsers such as Firefox and Chrome).
- They let the compiler emit `cmp`/`jmp` sequences for predictable branches instead of `cmov` instructions on x86.
- Worth mentioning is that some architectures, like PowerPC, have instructions to override dynamic branch prediction, and these intrinsics can be used to control them.

Coming back to Rust, I do think that we need something like `likely`/`unlikely`, but I don't like the way the intrinsics are currently implemented:

- They only work on `if` statements. There is no support for `if let`, `match`, ...; that is, if I want to use them in a `match` statement I need to rewrite the `match` as a sequence of `if` statements. This is very unergonomic.
- They behave like `test::black_box`, preventing optimizations, due to the way they are currently implemented. This is not what the intrinsics should be doing.
- We already have the `#[cold]` attribute to indicate that a function is unlikely to be called; having a separate `unlikely` intrinsic/attribute looks to me like unnecessary duplication (if not, the documentation failed to convey the difference to me).

Basically what I want is a way to override the optimizer's default of using `cmov`, by stating "this branch is predictable and is going to be/not be taken often", that works on all types of branches (`match`, `if let`, ...). I guess that for this to work it would need to be an attribute, and also be passed to LLVM as branch prediction hints.
So, the C++ committee is talking about standardizing `likely`/`unlikely`, and Herb Sutter had this to say: https://herbsutter.com/2018/04/02/trip-report-winter-iso-c-standards-meeting-jacksonville/
For example, for some code it is essential to make the uncommon path faster so that it is faster when you do hit it — and in such cases you would actually use this attribute in the inverted sense, so please also write a comment nearby to document that intent otherwise some hapless reader (more than likely your future self) will be temporarily confused.
Some people on Reddit have ideas for different names that don't have this drawback.
@steveklabnik Thanks for pointing that out!
The current blocker here is that the current API is broken beyond all repair.
I personally would prefer just allowing `#[cold]` in more situations to mean unlikely, and adding a `#[hot]` attribute that's the opposite of `#[cold]` to mean likely. But right now, any solution, even one with a bad name, would be infinitely better than what we have. Adding `#[likely]` and `#[unlikely]` attributes to unstable Rust would probably be the right first step to gain implementation experience without having to mess with the meaning of `#[cold]`. Whether these can be unified with `hot`/`cold` or should be renamed to something else is something that can be worked out afterwards.
I get a consistent 10-15% speedup in `hashbrown` (a fast hash table implementation) using these intrinsics.
@Amanieu the problem is that one cannot use these intrinsics in `match`, `if let`, ... statements, so we need something better here.
I think enough proof / motivation has already been given to convince most that these are useful, and P0479r2 contains a thorough performance analysis of properly and improperly using these intrinsics.
As mentioned, I wish we could just use `#[hot]` and `#[cold]` for this (two attributes that already exist):

```rust
if foo() { #![hot] } else { #![cold] }

if let Some(v) = opt { #![cold] }

match opt {
    Some(v) => { #![hot] ... },
    None => { #![cold] ... },
}
```

but I don't think that it is easy to lower these to an `llvm.expect` intrinsic on the value that's responsible for the branch landing there.
The current blocker here is that the current API is broken beyond all repair.
That's a bit of an exaggeration. The current design does work for plain `if`, if changed technically from intrinsics to lang items so that `unsafe` isn't needed by policy.
Also, `#![hot]` and `#![cold]` have a different problem: they are at least confusing in the case of a chained `if`, which you'd have to rewrite as nested `if`s instead.
That's a bit of an exaggeration.
If you want to apply this to `match`, `if let`, `while let`, etc., you'd have to do "something like this":

```rust
match foo {
    A => ...,
    core::intrinsic::likely(B) => ...,
    core::intrinsic::likely(C if ...) => ...,
    C => ...,
    _ => ...,
}
```
where `B` and `C if ...` are the patterns you are matching and `core::intrinsic::likely` is something that has to somehow disappear at the syntax level. Even for simple if-else code one cannot easily annotate the `else` branch using intrinsics:

```rust
if foo {
    ...
} else if ... {
    ...
} else if ... {
    ...
} else if ... {
    ...
} core::intrinsics::unlikely(else) { // ??
    ...
}
```
The `core::intrinsics::{likely,unlikely}` approach is definitely good enough as an unstable stop gap until we get something better. The performance consequences of these annotations are very hard to predict correctly, and we want people to use them only after appropriately evaluating that they get a positive effect. A solution that requires people to transform all of their conditional code into a series of `if ... { ... } else { ... }` statements just to be able to apply these intrinsics and test their effects is not IMO something worth stabilizing. This is what I meant with "the intrinsics approach is broken beyond all repair".
We need something that lets people just write:

```rust
match foo {
    A => ...,
    #[likely] B => ...,
    #[likely] C if ... => ...,
    C => ...,
    _ => ...,
}

if foo {
    ...
} else if ... {
    ...
} else if ... {
    ...
} else if ... {
    ...
} #[unlikely] else {
    ...
}
```

or

```rust
match foo {
    A => ...,
    B => #![likely] ...,
    C if ... => #![likely] ...,
    C => ...,
    _ => ...,
}

if foo {
    ...
} else if ... {
    ...
} else if ... {
    ...
} else if ... {
    ...
} else {
    #![unlikely]
    ...
}
```
or similar, so that testing these only requires a 1 line change.
I would like to propose stabilizing the likely/unlikely functions as-is to enable projects on stable (like hashbrown) to make use of these features where they can. Having them doesn't in any way block creating a better version later.
How do these intrinsics relate to the `#[cold]` attribute on functions? Has `#[cold]` been tried in hashbrown?
How do these intrinsics relate to the #[cold] attribute on functions?
`cold` is a calling convention on functions, while `likely`/`unlikely` annotate conditional jumps.
If you want to apply this to `match`, `if let`, `while let`, etc
I'm not claiming that `likely()`/`unlikely()` would address those use cases. I'm saying that `likely()`/`unlikely()` would work for plain `if`, and right now the search for perfection means we don't get to have prediction hints for plain `if`, either.
Even for simple if-else code one cannot easily annotate the else branch using intrinsics:
`if foo { ... } else if ... { ... } else if ... { ... } core::intrinsics::unlikely(else) { /* ?? */ ... }`
Since chained `if` conditions are evaluated as if the conditions were evaluated in order, the above should be:

```rust
if foo {
    ...
} else if ... {
    ...
} else if ... {
    ...
} else if likely(...) {
    ...
} else {
    ...
}
```
OTOH, what does the following mean:

```rust
if foo {
    ...
} else if bar {
    #![hot]
} else {
    ...
}
```

Does the middle annotation affect the prediction of `foo`, `bar`, or both?
Isn't `match` supposed to be arranged in the order of likeliness? Same goes for branching.

```rust
match foo {
    A => ...,
    B => ...,
}
```

Also, is it just `likely` and `unlikely`, or might there be something in the middle?
Isn't `match` supposed to be arranged in the order of likeliness?
No, it is not supposed to be arranged that way (you can do that though if you feel like it).
Tracking issue for rust-lang/rfcs#1131: `std::hint`