Open nikomatsakis opened 9 years ago
FWIW, looking through my own code, I actually think the fact that this can't apply to match variants is going to be fairly hard on idiomatic Rust (for example, it means you can't use it when matching on an `Option`, which is going to be extremely common since it amounts to `if (likely(ptr != NULL))` in C).
@pythonesque once I get this done, I plan on writing a follow-up RFC for a `#[likely]` attribute you can put on match arms. Since it's already possible to put attributes on arms (to support conditional compilation of them), it should be fairly easy to implement.
It might be worth having an "unpredictable" option in addition to likely and unlikely. There is now something to lower it to in LLVM: http://reviews.llvm.org/D12341
@Aatch Are you still working on this?
Should `#[likely]` also be allowed on `Result`?

```rust
enum Result<T, E> {
    #[likely]
    Ok(T),
    Err(E),
}
```

I expect for most code `Ok` is likely and `Err` is unlikely. Of course, annotating an `if` or `match` should override the enum variant's annotation.
@vi I think hinting should only be used rarely, so having it on the type doesn't seem like a good idea. `Result` is a bit more general purpose than just error-handling, so that kind of hinting would impact code where it's inappropriate. Hinting should only really be done at the site itself.
@Aatch, Is it a good idea to have `likely` in every place where something is likely? Can it also serve as self-documentation: follow the "likelies" to see the happy path?
@vi it is not. As a general rule, people are pretty bad at knowing which cases are actually likely or not. Branch hinting is a fairly specific optimisation hint; used incorrectly, it can actually make code slower. It should also only really be used when the unlikely case is known to be very unlikely.
Basically, you should only use it if it makes your code faster; otherwise it shouldn't be there.
@vi It might be a good idea if the likely thing is cheap, and the unlikely thing is expensive. For example, `Vec::push` is expensive if it causes a reallocation, and very cheap otherwise.
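A minimal sketch of that cheap/expensive split on stable Rust (the `SmallBuf` type and its growth policy are invented for illustration, not taken from the thread): the common path is a single length check, while the rare reallocation path is outlined into a function marked `#[cold]`, which is the stable way to convey the same hint today.

```rust
// Hypothetical example: a growable byte buffer where pushes are cheap
// unless the buffer must grow.
struct SmallBuf {
    data: Vec<u8>,
    cap: usize,
}

impl SmallBuf {
    fn new() -> Self {
        SmallBuf { data: Vec::with_capacity(4), cap: 4 }
    }

    // Common case: there is room, so this is just a bounds check + store.
    fn push(&mut self, byte: u8) {
        if self.data.len() == self.cap {
            self.grow(); // rare, expensive case
        }
        self.data.push(byte);
    }

    // `#[cold]` (stable) tells the optimizer this is rarely executed,
    // so the fast path in `push` can stay compact.
    #[cold]
    fn grow(&mut self) {
        self.cap *= 2;
        self.data.reserve(self.cap - self.data.len());
    }
}
```

This mirrors how `Vec::push` amortizes its cost: most calls take the cheap branch, and only the occasional call pays for reallocation.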
In my opinion, frequent use of `likely` is either unnecessary or just manual profile-guided optimization.
Instead of arguing about it, why not just test it? Annotate `Result`, and then see if rustc bootstraps significantly faster. This won't give a definitive answer but will help determine if it's worth considering.
Now that MIR is a thing, it would be nice if this were implemented.
@comex I do agree =)
@Aatch Not sure what your time is like, but if you don't have time to implement this, it seems like a case where we might be able to mentor someone. The idea would be to leave a comment explaining roughly how you think it should work, and then tag the issue with E-mentor and E-help-wanted. I of course can do this too and will try to remember to do so. =)
Unassigning @Aatch since, empirically, he doesn't seem to be actively hacking on this right now. =) (Feel free to correct me.)
@nikomatsakis PR up, though I could use some help adding tests.
Copying @nagisa's comment from #36181:
llvm told me the expect intrinsic is more harmful than it is helpful and that branch weight metadata should be emitted instead. Might be worth going back and reconsidering the RFC, since the branch weight metadata is more powerful and expressive and would be better expressed as attributes in source too.
why is clang > 3.6 not emitting the expect intrinsic for __builtin_expect? nagisa: it turns out that the expect intrinsic overall harmed the optimizer by inhibiting certain optimizations, so we now directly lower __builtin_expect to branch probabilities in the frontend when generating IR
I think that keeping the intrinsics is a good idea. They should be fairly obvious for people coming from C/C++. Attributes aren't mutually exclusive with the intrinsics, so we're not exactly cutting ourselves off with intrinsics.
Supporting branch weights at the MIR level would probably be the easiest implementation for lowering to the metadata ourselves.
Just adding a note here as I'm processing everything in #38643: it will need docs before stabilization. @nikomatsakis would you mind adding a note to that effect in the root of the issue?
@chriskrycho done
Will these intrinsics be stabilized as they are? Under which path?
Every few months, because I've forgotten the results, I try these out in httparse, on branches I'm absolutely certain are always taken, and see no gain, with a wild range in variance when running the benchmarks.
I'm not adept at all with either assembly or llvm-ir when compiling with optimizations to tell how they are implemented.
There's older comments that the expect intrinsic in LLVM is harmful, that it should be done with branch weights instead. But when I go diving in LLVMs source now, it appears there is a pass that already lowers expects into branch weights anyways?
The reason they're harmful is because they're still very opaque calls all the way till they're lowered, inhibiting all the early optimisations which are not specifically handling these intrinsics. Just don't use them.
An earlier comment says (if I'm reading it right) that clang deals with the corresponding C intrinsic by generating branch probabilities in the IR instead of IR intrinsic.
Are there technical reasons against doing the same in rustc?
No reasons against generating branch probabilities in rustc, however it is not as trivial to implement it that way (i.e. go from a function call to probabilities in IR)
One could as well just go ahead and add the attributes that are more general and easier to implement.
I have to say, I am personally inclined to just remove this intrinsic. The results seem to be uneven at best and I feel no strong urgency to jump on this.
The results seem to be uneven at best
You mean profile-guided optimization is a hoax?
According to the Intel optimization manual, the order in which branches are laid out in assembly does matter (likely code should come first, unlikely comes after).
However, the question is if LLVM isn't already smart enough to reorder stuff by itself. These intrinsics were added a long time ago, when compilers were bad at predicting stuff.
From what I've seen, if you use an `unwrap` or check for an error code (e.g. right before `return Err`), the compiler is smart enough to optimize the code and detect the more likely branch.
Even the Linux kernel has an option to allow the compiler to un-inline certain functions because newer versions of GCC are much better at inlining. The same thing could happen with `likely`/`unlikely`: if the compiler is smart enough, there is no need for the intrinsic.
TL;DR: these intrinsics are useful for people who love premature optimization, believing they can outsmart the compiler.
As a workaround, you can simply write your code such that the generated assembly is laid out as you want.
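The usual form of that workaround on stable Rust (the `process`/`handle_error` pair below is hypothetical, invented for this sketch): outline the unlikely path into a function marked `#[cold]` and `#[inline(never)]`, both of which are stable, so the optimizer keeps it out of the hot instruction stream.

```rust
// Hypothetical error path, outlined so the hot path stays small.
// `#[cold]` marks it rarely-called; `#[inline(never)]` keeps the
// formatting code from being inlined back into the caller.
#[cold]
#[inline(never)]
fn handle_error(code: u32) -> String {
    format!("unexpected status: {}", code)
}

fn process(status: u32) -> Result<(), String> {
    if status == 0 {
        // Hot path: no call, no string formatting.
        Ok(())
    } else {
        Err(handle_error(status))
    }
}
```

The effect is similar to a `likely` hint on the `status == 0` branch, expressed through function-level attributes instead of an intrinsic.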
Or use PGO. A quote from the GCC manual:
In general, you should prefer to use actual profile feedback for this (-fprofile-arcs), as programmers are notoriously bad at predicting how their programs actually perform.
TL;DR: these intrinsics are useful for people who love premature optimization, believing they can outsmart the compiler.
Annotating the code with likely/unlikely can also help readability. What is marked likely for the compiler is also interpreted as likely by the reader, who can follow the "main path" in the code more easily.
@pythonesque once I get this done, I plan on writing a follow-up RFC for a #[likely] attribute you can put on match arms. Since it's already possible to put attributes on arms (to support conditional compilation of them), it should be fairly easy to implement.
We could perhaps just allow `#[cold]` and/or `#[inline(never)]` on blocks/statements instead of just functions:

```rust
let x = if condition {
    ....
} else #[cold] {
    ....
};

let y = match condition2 {
    Some(x) => x,
    None => #[cold] 1,
};
```
Regarding likely/unlikely specifically, I think people are expecting that they'd only be used inside `if` expressions, but I don't see that restriction in the RFC.
Anyway, in lots of the code I write, regarding parsing and cryptography, we really do always want the "no problems" path to be fast. Not only do we generally not care about the performance of the unlikely "oops, that's not how it's supposed to be" path, but we in fact would like to have the compiler actively de-optimize that path, moving it far away from the fast path, to improve instruction cache locality.
Also, PGO sounds nice but for many applications, like most of mine, it would be a huge waste. It's hard enough to get good code coverage; it's nearly impossible to develop PGO-focused coverage that emphasizes the code paths for which one most cares about performance.
TL;DR: these intrinsics are useful for people who love premature optimization, believing they can outsmart the compiler.
Possibly. And possibly these don't generate the best LLVM IR for the purpose given how LLVM consumes IR today.
However, I think it's not productive to suggest that the compiler always outsmarts the programmer and, therefore, there's no need to provide detailed control as an option. I think the appropriate use for intrinsics like these is being able to persist manual PGO (iterating `cargo bench` with different branch prediction hints) in the source code without requiring the user of the crate to set up an appropriate benchmark for compile-time profiling.
Last week, I had a `match` with consecutive (both in source order and in numerical order of the condition) identical arms, and combining the identical match arms destroyed about 100 MB/s of throughput (from 9xx MB/s to 8xx MB/s), because the compiler put the checks in a different order. By rewriting the code as manually-ordered `if`s, I was able to do better than the original `match`.
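As an illustration of that kind of rewrite (the byte-classification code below is hypothetical, not the commenter's actual code), a `match` over ranges can be replaced by a hand-ordered `if` chain that tests the common case first, so the compiler cannot reorder the checks:

```rust
// Hypothetical classifier over input bytes; both functions compute the
// same result, but differ in how much control they give over check order.
fn classify_match(b: u8) -> u32 {
    match b {
        0x00..=0x7F => 1, // assume ASCII dominates the workload
        0x80..=0xBF => 2,
        _ => 3,
    }
}

fn classify_if(b: u8) -> u32 {
    // Manually ordered: the most likely test comes first, so the fast
    // path is a single compare on typical input.
    if b < 0x80 {
        1
    } else if b < 0xC0 {
        2
    } else {
        3
    }
}
```

With the `match`, the optimizer is free to merge or reorder the arm checks; the explicit `if` chain pins the evaluation order the author measured to be fastest.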
Even in the standard library, there are quite a few places where these kinds of improvements to optimization hints would be useful. We shouldn't need to factor out code into a separate function just to mark it unlikely/cold like in https://github.com/rust-lang/rust/commit/257bff3192e2c7313a4f8cfcac8839a573b42f6b#diff-5da4ecf5ad221f68bde0ae13f2a7c22dR700, for example.
Also, in my code, I generally want the compiler to optimize, to an extreme extent, `likely()` paths for speed and cache locality. Similarly, I want it to optimize, to an extreme extent, `unlikely()`/`#[cold]` paths for size and out-of-the-way cache locality. Also, like Henri said, sometimes predictable performance is very important, and sometimes the profile-guided optimization has been done by hand to ensure such predictable performance. Hints like `likely()`/`unlikely()`/`#[cold]` are the only way we have to transfer the knowledge from the profiling phase to the codegen phase in those situations.
Has anyone successfully used these intrinsics to achieve speedups?
I prefer the `#[cold]` formulation, myself.
I didn't mean to suggest that there is no role for user-hinting, but merely that this particular formulation didn't seem like a big success. The LLVM developers (from what I understand) don't recommend using these intrinsics, and they seem to be usable in confusing ways (e.g., outside of an `if`).
Has anyone successfully used these intrinsics to achieve speedups?
@nikomatsakis, in the way they're implemented the intrinsics aren't very useful, but this can be improved by "generating branch probabilities in the IR", like clang does.
Using likely and unlikely (`__builtin_expect`) we can achieve speedups in C (cf. https://stackoverflow.com/questions/109710/likely-unlikely-macros-in-the-linux-kernel-how-do-they-work-whats-their), so why shouldn't we be able to do this in Rust, given that the intrinsics are already there? Why not simply fix them to be useful?
I prefer the `#[cold]` formulation, myself.
Do you mean `#[cold]` on block statements?
On the subject of whether they're useful: I've used them myself (though I've not benchmarked enough to tell whether they made a difference). One thing I did notice was that `rustc` doesn't appear to be able to do constant propagation through them: I had a call to `unlikely(<expr>)` where `<expr>` could easily be constant-propagated to `false` at compile-time, but the corresponding `if` branch was still included in the compiled output.
Has anyone successfully used these intrinsics to achieve speedups?
On most of my attempts, no, but that might have been due to wrapping them in inlines, which according to another issue doesn't work.
Today, however, I finally found a case where I saw a notable boost. If I add `unlikely` to this `if`, I get a 33% performance improvement on Russian plain text. (In case anyone thinks that the function as a whole is an unnecessary optimization, it's notably faster than iterating over a `str` by `char` and calling `is_char_bidi()` for each one.)
Also: Why are `likely()` and `unlikely()` currently `unsafe`? As far as their non-performance semantics go, they just return their argument, which is safe, and modifying performance is safe, too.
On most of my attempts, no, but that might have been due to wrapping them in inlines, which according to another issue doesn't work. ... Also: Why are likely() and unlikely() currently unsafe? As far as their non-performance semantics go, they just return their argument, which is safe, and modifying performance is safe, too.
Not good reasons. I think it's because of the (imo) rather artificial distinction between "intrinsics" and "lang items" in the compiler. I've been meaning forever to write an RFC collapsing the distinction -- basically just having "something well known to the compiler". But for whatever reason we have a distinction now, and intrinsic functions are (a) always unsafe and (b) never exposed directly (hence the inlining).
Why not simply fix them to be useful?
My feeling was that since they are not working that well now (and are not in any particular use), and because the interface is Not Great, maybe we should rip it out and try something better.
The reasons that I think the interface is not great are roughly that a function doesn't seem like the optimal choice here:

- What does `let x = likely(...); if x { ... }` mean? Is that the same as `if likely(...) { ... }`?
- Tracking the fact that a `likely` occurred is kind of painful around MIR too
- etc
So, in short, I don't have a strong opinion, just looking for opportunities to cull underutilized and not especially successful language features and replace them with something better.
My feeling was that since they are not working that well now (and are not in any particular use), and because the interface is Not Great, maybe we should rip it out and try something better.
Good point, @nikomatsakis !
It might take some time though for a different branch hinting approach to come into the language.
I think that in Rust we have a strong precedent of having practical, but Not Great, solutions implemented in the language in unstable form. It allows us to have our cake and eat it too.
As of now Rust might be at some disadvantage compared to C/C++ with regard to branch hinting. There are existing algorithms and programs that have been optimized in C/C++ to use branch hinting efficiently, and currently it might be problematic to port them to Rust while maintaining the same level of performance, because we lack a branch hinting solution of a quality comparable to the GCC and Clang implementations.
Implementing the likely/unlikely intrinsic as an unstable feature, till we have a better one, might be good for language adoption. It might also help those members of the community who want to port their code to a safer language but are hindered by performance constraints.
I think that a lot of us flock to Rust because it offers performance. It's one of the trifecta on the https://www.rust-lang.org/ page. Personally, I started to believe in Rust as a performance language when I saw the likely/unlikely RFC merged. It pains me to see that Rust still isn't there.
Can I do something to convince you that we need something working until we have something better?
@ArtemGr You already have it as an unstable feature: `std::intrinsics::unlikely`. Using it and evaluating the current implementation is already possible. As you can see, the discussion here is already in relation to the existing implementation and its flaws.
You already have it as unstable feature: std::intrinsics::unlikely. Using it and evaluating the current implementation is already possible.
@bluss, it's not currently on par with the Clang/GCC implementations.
As you can see the discussion here is already in relation to the existing implementation and its flaws.
That's right. We're discussing whether to remove or to improve this feature.
From P0479r2: Attributes for Likely and Unlikely Statements (Revision 2):

- `likely`/`unlikely` are particularly widely used in systems programming (the Linux kernel, and web browsers such as Firefox and Chrome).
- They let the compiler emit `cmp`/`jmp` sequences for predictable branches instead of `cmov` instructions on x86.
- Worth mentioning is that some architectures, like PowerPC, have instructions to override dynamic branch prediction, and these intrinsics can be used to control them.

Coming back to Rust, I do think that we need something like `likely`/`unlikely`, but I don't like the way the intrinsics are currently implemented:

- They only work on `if` statements. There is no support for `if let`, `match`, ...; that is, if I want to use them in a `match` statement I need to rewrite the `match` as a sequence of `if` statements. This is very unergonomic.
- They behave like `test::black_box`, preventing optimizations, due to the way they are currently implemented. This is not what the intrinsics should be doing.
- We already have the `#[cold]` attribute to indicate that a function is unlikely to be called; having a separate `unlikely` intrinsic/attribute looks to me like unnecessary duplication (if not, the documentation failed to convey the difference to me).

Basically what I want is a way to override the optimizer's default of using `cmov`, by stating "this branch is predictable and is going to be/not be taken often", that works on all types of branches (`match`, `if let`, ...). I guess that for this to work it would need to be an attribute, and also be passed to LLVM as branch prediction hints.
So, the C++ committee is talking about standardizing `likely`/`unlikely`, and Herb Sutter had this to say: https://herbsutter.com/2018/04/02/trip-report-winter-iso-c-standards-meeting-jacksonville/
For example, for some code it is essential to make the uncommon path faster so that it is faster when you do hit it — and in such cases you would actually use this attribute in the inverted sense, so please also write a comment nearby to document that intent otherwise some hapless reader (more than likely your future self) will be temporarily confused.
Some people on Reddit have ideas for different names that don't have this drawback.
@steveklabnik Thanks for pointing that out!
The current blocker here is that the current API is broken beyond all repair.
I personally would prefer just allowing `#[cold]` in more situations to mean unlikely, and adding a `#[hot]` attribute that's the opposite of `#[cold]` to mean likely. But right now, any solution, even one with a bad name, would be infinitely better than what we have. Adding `#[likely]` and `#[unlikely]` attributes to unstable Rust would probably be the right first step to gain implementation experience without having to mess with the meaning of `#[cold]`. Whether these can be unified with `hot`/`cold` or should be renamed to something else is something that can be worked out afterwards.
I get a consistent 10-15% speedup in `hashbrown` (a fast hash table implementation) using these intrinsics.
@Amanieu the problem is that one cannot use these intrinsics in `match`, `if let`, ... statements, so we need something better here.
I think enough proof / motivation has already been given to convince most that these are useful, and P0479r2 contains a thorough performance analysis of properly and improperly using these intrinsics.
As mentioned, I wish we could just use `#[hot]` and `#[cold]` for this (two attributes that already exist):

```rust
if foo() { #![hot] } else { #![cold] }

if let Some(v) = opt { #![cold] }

match opt {
    Some(v) => { #![hot] ... },
    None => { #![cold] ... },
}
```

but I don't think that it is easy to lower these to an `llvm.expect` intrinsic on the value that's responsible for the branch landing there.
The current blocker here is that the current API is broken beyond all repair.
That's a bit of an exaggeration. The current design does work for plain `if`, if changed technically from intrinsics to lang items so that `unsafe` isn't needed by policy.
Also, `#![hot]` and `#![cold]` have a different problem: they are at least confusing in the case of a chained `if`, which you'd have to rewrite as nested `if`s instead.
That's a bit of an exaggeration.
If you want to apply this to `match`, `if let`, `while let`, etc., you'd have to do "something like this":

```rust
match foo {
    A => ...,
    core::intrinsic::likely(B) => ...,
    core::intrinsic::likely(C if ...) => ...,
    C => ...,
    _ => ...,
}
```
where `B` and `C if ...` are the patterns you are matching and `core::intrinsic::likely` is something that has to somehow disappear at the syntax level. Even for simple if-else code one cannot easily annotate the `else` branch using intrinsics:

```rust
if foo {
    ...
} else if ... {
    ...
} else if ... {
    ...
} else if ... {
    ...
} core::intrinsics::unlikely(else) { // ??
    ...
}
```
The `core::intrinsics::{likely,unlikely}` approach is definitely good enough as an unstable stop gap until we get something better. The performance consequences of these annotations are very hard to predict correctly, and we want people to use them only after appropriately evaluating that they get a positive effect. A solution that requires people to transform all of their conditional code into a series of `if ... { ... } else { ... }` statements just to be able to apply these intrinsics and test their effects is not IMO something worth stabilizing. This is what I meant with "the intrinsics approach is broken beyond all repair".
We need something that lets people just write:

```rust
match foo {
    A => ...,
    #[likely] B => ...,
    #[likely] C if ... => ...,
    C => ...,
    _ => ...,
}

if foo {
    ...
} else if ... {
    ...
} else if ... {
    ...
} else if ... {
    ...
} #[unlikely] else {
    ...
}
```

or

```rust
match foo {
    A => ...,
    B => #![likely] ...,
    C if ... => #![likely] ...,
    C => ...,
    _ => ...,
}

if foo {
    ...
} else if ... {
    ...
} else if ... {
    ...
} else if ... {
    ...
} else {
    #![unlikely]
    ...
}
```
or similar, so that testing these only requires a 1 line change.
I would like to propose stabilizing the likely/unlikely functions as-is to enable projects on stable (like hashbrown) to make use of these features where they can. Having them doesn't in any way block creating a better version later.
How do these intrinsics relate to the `#[cold]` attribute on functions? Has `#[cold]` been tried in hashbrown?
How do these intrinsics relate to the #[cold] attribute on functions?
`cold` is a calling convention on functions, while `likely`/`unlikely` annotate conditional jumps.
If you want to apply this to `match`, `if let`, `while let`, etc
I'm not claiming that `likely()`/`unlikely()` would address those use cases. I'm saying that `likely()`/`unlikely()` would work for plain `if`, and right now the search for perfection means we don't get to have prediction hints for plain `if`, either.
Even for simple if-else code one cannot easily annotate the else branch using intrinsics:
`if foo { ... } else if ... { ... } else if ... { ... } core::intrinsics::unlikely(else) { /* ?? */ ... }`
Since chained `if` conditions are evaluated as if the conditions were evaluated in order, the above should be:

```rust
if foo {
    ...
} else if ... {
    ...
} else if ... {
    ...
} else if likely(...) {
    ...
} else {
    ...
}
```
OTOH, what does the following mean:

```rust
if foo {
    ...
} else if bar {
    #![hot]
} else {
    ...
}
```

Does the middle annotation affect the prediction of `foo`, `bar`, or both?
Isn't `match` supposed to be arranged in the order of likeliness? Same goes for branching.

```rust
match foo {
    A => ...,
    B => ...,
}
```

Also, is it just `likely` and `unlikely`, or might there be something in the middle?
Isn't `match` supposed to be arranged in the order of likeliness?
No, it is not supposed to be arranged that way (you can do that though if you feel like it).
Tracking issue for rust-lang/rfcs#1131: `std::hint`