Open Quuxplusone opened 6 years ago
Bugzilla Link | PR38832 |
Status | CONFIRMED |
Importance | P enhancement |
Reported by | Matt Davis (matthew.davis@sony.com) |
Reported on | 2018-09-04 13:58:29 -0700 |
Last modified on | 2021-08-25 04:25:15 -0700 |
Version | trunk |
Hardware | PC All |
CC | ahmed@bougacha.org, andrea.dibiagio@gmail.com, andrea_dibiagio@sn.scee.net, atrick@apple.com, clement.courbet@gmail.com, craig.topper@gmail.com, gchatelet@google.com, hfinkel@anl.gov, llvm-bugs@lists.llvm.org, llvm-dev@redking.me.uk, matthew.davis@sony.com, spatel+llvm@rotateright.com |
Fixed by commit(s) | |
Attachments | |
Blocks | |
Blocked by | |
See also |
Just to clarify.
The end goal is to enable the definition of "code regions" directly from C/C++.
To achieve that goal we need to:
a) teach llvm-mca how to identify code regions in a binary, and
b) teach llvm how to generate information related to code regions.
Matt's post is about a).
The idea is to describe code regions as ranges
< base_address, length >.
Ideally, base_address is an offset (.text relative).
Code regions are encoded in a new section of the binary.
The advantages are:
- We don't modify the executable code (for example, IACA adds special prefixes to x86 instructions to mark the start/end of a region).
- llvm-mca doesn't need to parse the full binary to identify code regions.
- If people want to get rid of that information, they can just strip that new section from the binary.
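To make the < base_address, length > idea concrete, here is a minimal sketch of what a region descriptor and a reader for the new section could look like. The record layout (two host-endian 64-bit fields) and the names RegionDescriptor, encodeRegions, decodeRegions and contains are all assumptions made for illustration; nothing above fixes an actual encoding.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstring>
#include <vector>

// Hypothetical record: the proposal only says <base_address, length>.
struct RegionDescriptor {
  std::uint64_t TextOffset; // start of the region, relative to .text
  std::uint64_t Length;     // size of the region in bytes
};

// Assumed on-disk size of one record (two 64-bit fields, host endianness).
static constexpr std::size_t RecordSize = 16;

// Serialize descriptors into the raw payload of the (hypothetical) section.
std::vector<std::uint8_t>
encodeRegions(const std::vector<RegionDescriptor> &Regions) {
  std::vector<std::uint8_t> Bytes(Regions.size() * RecordSize);
  for (std::size_t I = 0; I < Regions.size(); ++I) {
    std::memcpy(Bytes.data() + I * RecordSize, &Regions[I].TextOffset, 8);
    std::memcpy(Bytes.data() + I * RecordSize + 8, &Regions[I].Length, 8);
  }
  return Bytes;
}

// Parse the section payload. Returns false if the payload is not a whole
// number of records; the tool never has to look at the code itself.
bool decodeRegions(const std::vector<std::uint8_t> &Section,
                   std::vector<RegionDescriptor> &Out) {
  if (Section.size() % RecordSize != 0)
    return false;
  Out.clear();
  for (std::size_t I = 0; I < Section.size(); I += RecordSize) {
    RegionDescriptor R;
    std::memcpy(&R.TextOffset, Section.data() + I, 8);
    std::memcpy(&R.Length, Section.data() + I + 8, 8);
    Out.push_back(R);
  }
  assert(Out.size() == Section.size() / RecordSize);
  return true;
}

// True if a .text-relative offset falls inside the half-open range
// [TextOffset, TextOffset + Length).
bool contains(const RegionDescriptor &R, std::uint64_t Offset) {
  return Offset >= R.TextOffset && Offset - R.TextOffset < R.Length;
}
```

Since the descriptors live in their own section, stripping that section removes the region information without touching the executable code.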
--
The full feature would be structured in two parts (and it would require a
proper RFC upstream).
1) teach llvm-mca how to process that new section and construct a simulation
pipeline for every code region.
1b) error out if the binary doesn't specify any code region.
2) teach llvm codegen how to generate this new section.
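For steps 1) and 1b), the llvm-mca side could be sketched roughly as follows. This is not llvm-mca code: Region, Inst and splitByRegion are hypothetical stand-ins, and a real implementation would work on decoded MCInst sequences and build one simulation pipeline per bucket.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <stdexcept>
#include <vector>

// Hypothetical descriptor read from the new section.
struct Region {
  std::uint64_t TextOffset; // .text-relative start
  std::uint64_t Length;     // size in bytes
};

// Stand-in for a decoded instruction; only its position matters here.
struct Inst {
  std::uint64_t TextOffset;
  unsigned Opcode;
};

// Group instructions by region. Each returned bucket would be the input of
// its own simulation pipeline (step 1). Instructions outside every region
// are ignored.
std::vector<std::vector<Inst>>
splitByRegion(const std::vector<Region> &Regions,
              const std::vector<Inst> &Insts) {
  // Step 1b: a binary without any code region is an error.
  if (Regions.empty())
    throw std::runtime_error("no code regions found in the binary");

  std::vector<std::vector<Inst>> Buckets(Regions.size());
  for (const Inst &I : Insts)
    for (std::size_t R = 0; R < Regions.size(); ++R)
      if (I.TextOffset >= Regions[R].TextOffset &&
          I.TextOffset - Regions[R].TextOffset < Regions[R].Length)
        Buckets[R].push_back(I);

  assert(Buckets.size() == Regions.size());
  return Buckets;
}
```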
About point 2.
We could add two new llvm intrinsics. Example:
@llvm.mca.region.start(const char *OptionalNameOfTheRegion);
@llvm.mca.region.end();
These special intrinsics are codegen only, and simply act as "compiler
barriers" that prevent optimization passes from moving code across the boundaries
of a region.
AsmPrinter would eventually lower those new intrinsics into temporary labels.
At the same time, region descriptors (pairs of labels) are added to our new
"mca section".
In clang, we could use two new builtins which are Clang CodeGen'd into llvm.mca
intrinsic calls.
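From the user's side, marking a region in C/C++ might look something like the sketch below. The builtins do not exist yet, so MCA_REGION_START and MCA_REGION_END are hypothetical placeholder macros. They expand to an empty inline asm with a "memory" clobber (GCC/Clang syntax), which only approximates the barrier behaviour for memory accesses and does not emit any region descriptor.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical stand-ins for the proposed builtins. The empty asm statement
// with a "memory" clobber stops the compiler from moving memory accesses
// across it; unlike the proposed intrinsics, it records no region
// information and ignores the region name.
#define MCA_REGION_START(Name) __asm__ __volatile__("" ::: "memory")
#define MCA_REGION_END() __asm__ __volatile__("" ::: "memory")

int dotProduct(const int *A, const int *B, std::size_t N) {
  assert(A && B);
  int Sum = 0;

  // Only the loop is meant to be analyzed, not the surrounding code.
  MCA_REGION_START("dot-product-loop");
  for (std::size_t I = 0; I < N; ++I)
    Sum += A[I] * B[I];
  MCA_REGION_END();

  return Sum;
}
```

With the real feature, the two markers would be lowered to temporary labels and the pair would end up as one descriptor in the "mca section".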
--
Hopefully this explains the full process.
I have spoken with Matt about this idea, and asked him to raise this bug, so
that he can start working on it and come up with a prototype (possibly before
the US LLVM conference, so that we can demo it there if possible, and collect
feedback from people before sending an RFC).
Reset assignee to default.
I don't think that Matt is working on this.