Quuxplusone / LLVMBugzillaTest

0 stars 0 forks source link

"-enable-machine-outliner" creates symbols with non-unique names #38499

Open Quuxplusone opened 5 years ago

Quuxplusone commented 5 years ago
Bugzilla Link PR39526
Status NEW
Importance P normal
Reported by Andrew Grieve (agrieve@google.com)
Reported on 2018-11-01 17:47:22 -0700
Last modified on 2019-03-07 10:25:48 -0800
Version trunk
Hardware PC All
CC francisvm@yahoo.com, huangs@google.com, jpaquette@apple.com, llvm-bugs@lists.llvm.org, matze@braunis.de, pasko@chromium.org, richard-llvm@metafoo.co.uk, rnk@google.com, yvan.roux@linaro.org
Fixed by commit(s)
Attachments
Blocks
Blocked by
See also

Notice when compiling Chrome for Arm64 that there are a lot of symbols called "OUTLINED_FUNCTION_1", "OUTLINED_FUNCTION_2", ...

Chrome uses an orderfile to optimize the layout of functions within the binary, but these outlined symbols are ambiguous when listed in an orderfile.

It would be great if the outlined functions could be given unique names (and stable) names. E.g. using a hash of some sort (maybe of the source path).

Quuxplusone commented 5 years ago
Maybe we can try to reduce the likelihood of non-uniqueness and improve the
debug experience by naming the outlined function after the first function it
was outlined from. For example, if 'foo' and 'bar' have common code, emit
assembly like:

foo:
  ...
bar:
  ...
foo.outlined.1:
  ... common code

It seems a little more user friendly than a hash, and the '.' separator will
allow it to pass through demanglers like c++filt.
Quuxplusone commented 5 years ago
I think that adding some sort of context is a reasonable solution. However,
we'd have to make sure that the order that we process the functions in doesn't
change.

It would definitely be nice to actually have unique names for the outlined
functions though. If we did that, then we could make the functions linkonceodr
and leverage linkers that support deduplication based off function contents.

Another option, which ought to be able to give us a canonical name, might be to
name the functions based off of their contents somehow? E.g

foo:
  some arithmetic stuff
bar:
  the same arithmetic stuff
outlined.(something uniquely describing the stuff):
  the arithmetic stuff

I'm not sure how that could be done without giant names though.
Quuxplusone commented 5 years ago
To avoid the processing order issue and keep debug experience in mind, maybe we
can include the sorted list of callers in the name, like:

Outlined.bar.foo.1:
   ...

Having a description of the content (like the list of mnemonics) sounds nice to
have, but the names can indeed become gigantic
Quuxplusone commented 5 years ago

I think that we can at least improve the stability of the outlined functions by

(1) Generating the names when we actually create the functions rather than early on

and

(2) Sorting the outlined functions by contents

This gets us away from relying on the order of processing outlining candidates for names, at least. Even if we used the numbering-based scheme here, I think we'd be in a better place than before. (Although I'd personally like something more human-readable and debuggable at the end of the day.)

Quuxplusone commented 5 years ago
I noticed that in

https://github.com/llvm-mirror/llvm/blob/master/include/llvm/IR/ModuleSummaryIndex.h

there's this code:

********
  /// Convenience method for creating a promoted global name
  /// for the given value name of a local, and its original module's ID.
  static std::string getGlobalNameForLocal(StringRef Name, ModuleHash ModHash) {
    SmallString<256> NewName(Name);
    NewName += ".llvm.";
    NewName += utostr((uint64_t(ModHash[0]) << 32) |
                      ModHash[1]); // Take the first 64 bits
    return NewName.str();
  }
========

Would this be something that might be helpful (re. adding hash)?
Quuxplusone commented 5 years ago
Some further findings on this:
* Outlined functions do not get instrumented when using -finstrument-function-
entry-bare, which Chrome uses to generate an orderfile.

Rather than including outlined functions in instrumentation, it would probably
be even better if they were named $ORIGINAL_FUNCTION.outlined,
$ORIGINAL_FUNCTION.outlined_1, etc, and then handled by -section-ordering-file
to go beside any parent function listed in the file.