Quuxplusone / LLVMBugzillaTest

[llvm-mca] Add binary support to llvm-mca. #37807

Open Quuxplusone opened 6 years ago

Quuxplusone commented 6 years ago
Bugzilla Link PR38832
Status CONFIRMED
Importance P enhancement
Reported by Matt Davis (matthew.davis@sony.com)
Reported on 2018-09-04 13:58:29 -0700
Last modified on 2021-08-25 04:25:15 -0700
Version trunk
Hardware PC All
CC ahmed@bougacha.org, andrea.dibiagio@gmail.com, andrea_dibiagio@sn.scee.net, atrick@apple.com, clement.courbet@gmail.com, craig.topper@gmail.com, gchatelet@google.com, hfinkel@anl.gov, llvm-bugs@lists.llvm.org, llvm-dev@redking.me.uk, matthew.davis@sony.com, spatel+llvm@rotateright.com
Fixed by commit(s)
Attachments
Blocks
Blocked by
See also
Currently, llvm-mca operates on asm source. The goal of this bug/feature is to
give llvm-mca the ability to operate on object files and not just asm source.
This change will require llvm-mca to enable/initialize the target disassemblers
and locate user-annotated code regions. Each code region should be executed in
a separate simulated pipeline within llvm-mca, similar to how llvm-mca
currently executes multiple code regions that have been annotated in the user's
asm source.

To locate user-defined code regions in an object file, llvm will need to store
the start/end .text offsets that represent the user's annotations in an
MCSection of the user's object file.  For an ELF file, this can be a section
called ".mca_code_segments", whose content is a set of pairs:
<.text offset begin, .text offset end>.  llvm-mca will locate these code
regions and analyze each one within its own simulated pipeline.  This
change will probably require a pair of intrinsics to represent the user's
annotations, which will eventually be lowered into the contents of the
llvm-mca specific MCSection.

An example looks something like the following:

   .text
   .Lmca_segment1_start:
   ... # Code to be analyzed by mca
   .Lmca_segment1_end:
   ... # More code, not analyzed by mca
   .mca_code_segments:
    .Lmca_segment1_start,  .Lmca_segment1_end
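
To make the proposed layout concrete, here is a minimal Python sketch of how a tool might decode the contents of such a section. The encoding is an assumption, since the proposal does not fix one: each descriptor is taken to be two little-endian unsigned 64-bit .text offsets, begin then end. The name `decode_code_segments` and the sample offsets are hypothetical and not part of llvm-mca.

```python
import struct
from typing import List, Tuple

# Assumed encoding (not specified in the proposal): each descriptor is a pair
# of little-endian unsigned 64-bit .text offsets, <begin, end>.
PAIR = struct.Struct("<QQ")


def decode_code_segments(payload: bytes) -> List[Tuple[int, int]]:
    """Decode the raw contents of a hypothetical .mca_code_segments section."""
    if len(payload) % PAIR.size != 0:
        raise ValueError("section size is not a multiple of one descriptor")
    regions = []
    for begin, end in PAIR.iter_unpack(payload):
        if end < begin:
            raise ValueError(f"region ends before it starts: {begin:#x}..{end:#x}")
        regions.append((begin, end))
    return regions


# Two made-up regions, at .text offsets 0x10..0x24 and 0x40..0x58.
payload = PAIR.pack(0x10, 0x24) + PAIR.pack(0x40, 0x58)
print(decode_code_segments(payload))
```

A fixed-width pair keeps the section trivially parseable without touching the rest of the binary, which seems to be the point of putting the descriptors in their own section.
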
Quuxplusone commented 6 years ago
Just to clarify.

The end goal is to enable the definition of "code regions" directly from C/C++.

To achieve that goal we need to:
a) teach llvm-mca how to identify code regions in a binary, and
b) teach llvm how to generate information related to code regions.

Matt's post is about a).

The idea is to describe code regions as ranges
    < base_address, length >.

Ideally, base_address is an offset (.text relative).

Code regions are encoded in a new section of the binary.
The advantages are:
 - We don't modify the executable code (for example, IACA adds special prefixes to x86 instructions to mark the start/end of a region).
 - llvm-mca doesn't need to parse the full binary to identify code regions.
 - If people want to get rid of that information, they can just strip that new section from the binary.

--

The full feature would be structured in two parts (and it would require a
proper RFC upstream).

1) teach llvm-mca how to process that new section and construct a simulation
pipeline for every code region.
  1b) error out if the binary doesn't specify any code region.

2) teach llvm codegen how to generate this new section.
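
Step 1, including the error case in 1b, can be sketched roughly as follows. This is only an illustration in Python, assuming regions are already available as < base_address, length > pairs with .text-relative base addresses; the function name `region_bytes` is hypothetical, and the actual disassembly and pipeline simulation are left out.

```python
from typing import List, Tuple


def region_bytes(text: bytes, regions: List[Tuple[int, int]]) -> List[bytes]:
    """Return the .text bytes covered by each <base_address, length> region."""
    if not regions:
        # Step 1b: a binary that specifies no code region is an error.
        raise ValueError("no code regions found in binary")
    chunks = []
    for base, length in regions:
        if base + length > len(text):
            raise ValueError(f"region <{base:#x}, {length}> runs past the end of .text")
        chunks.append(text[base:base + length])
    return chunks


text = bytes(range(32))  # stand-in for the contents of .text
for chunk in region_bytes(text, [(4, 8), (20, 4)]):
    # A real tool would disassemble `chunk` here and run one simulated
    # pipeline per region.
    print(chunk.hex())
```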

About point 2.
  We could add two new llvm intrinsics. Example:
    @llvm.mca.region.start(const char *OptionalNameOfTheRegion);
    @llvm.mca.region.end();

These special intrinsics are codegen only, and simply act as "compiler
barriers" that prevent optimization passes from moving code across the
boundaries of a region.

AsmPrinter would eventually lower those new intrinsics into temporary labels.
At the same time, region descriptors (pairs of labels) are added to our new
"mca section".

In clang, we could use two new builtins which are Clang CodeGen'd into llvm.mca
intrinsic calls.

--

Hopefully this explains the full process.

I have spoken with Matt about this idea, and asked him to raise this bug so
that he can start working on it and come up with a prototype (possibly before
the US LLVM conference, so that we can demo it if the opportunity arises and
collect feedback from people before sending an RFC).
Quuxplusone commented 3 years ago

Reset assignee to default.

I don't think that Matt is working on this.