Open LADSoft opened 5 years ago
I've dug up the old Intel Pentium II optimization guideline pdf on the internet here which uses the same exact example code for branch prediction as the new manual, for the new manual, the pages that directly say what we want start Here, old manual on branch prediction is Here, and the old manual's method to predict is Here,
So, the rules that intel gives on Branch Prediction basically boil down to:
On Intel's new manuals, they have a system of showing Compiler/Assembler authors what each optimization's impact is and how general the optimization is, which should be helpful in deciding the priority to implement anything in the optimizer at all.
Hm, 2-6 would only be done if optimizations are on, correct? Do you know if the rule 3-6 are true for AMD, too?
https://www.amd.com/system/files/TechDocs/25112.PDF#page=142
It appears rules 3-6 are all true as well, with AMD being more specific about inline size (25 instructions), 3 jmps in a 16 byte block, unroll loops, arrange to be consistent with BPA, with a few more caveats that I currently do not have the time to search for and display out in a nice list.
https://www.agner.org/optimize/microarchitecture.pdf Agner seems to have compiled a good list of everything from P1 up.
I also thought it through and that unless we have indicators in code or PGO (profile guided optimization) such a feature like this is difficult to make, better branch folding however, is an optimization that is possible to do on regular code.
Please allow the user to override prediction by likely
/unlikely
, see https://stackoverflow.com/questions/109710.
@GitMensch yeah I think that was coming in the next milestone. C++2020 introduces those as attributes and I presume GNU C++ has attribute for it as well.
referring to @chuggafan's comment in 2018 there are possibly one or two cases where the compiler could assume the likeliness or unlikeliness of branches based on whether something like an 'if' statement exits a loop (via break or goto), for example... but it would largely be up to the user to specify in the absence of something like PGO...
Yes, IMHO the best way to perform branch prediction in absence of PGO is [[likely]] and [[unlikely]], depending on the arch/processor we're compiling for we can actually give it hints about how the branches will go, such as on x86, but those hints are ignored in the modern day so... https://www.agner.org/optimize/optimizing_assembly.pdf
Also, for doing likely vs unlikely branches, if we have data giving us that something is more likely to be taken we make that the one that we don't branch into and the code naturally flows into, otherwise we just branch to that. e.g.
; do logic here
CMP eax, 1 ; is likely to be 0, not 1, on a boolean comparison...
JEQ exit
; do more logic here
exit:
RET
Essentially, what we really should be doing is block reordering so that the likelyhood of a block being executed determines the order from most likely to least likely.
yeah i figured that was going to be how it goes... this should be relatively simple to implement I think. I figured the one case we could probably guess at was when you break out of a loop at the end of an if statement... the body of that if statement is a good guess at an 'unlikely' case lol. I haven't been able to come up with other examples yet though...
minimize branches and generally ease the way for branch prediction in the later pentiums to result in faster code...