llvm / llvm-project

The LLVM Project is a collection of modular and reusable compiler and toolchain technologies.
http://llvm.org
Other
26.78k stars 10.97k forks source link

[BOLT] Evaluate and speedup discoverFileObjects #90661

Open aaupov opened 2 months ago

aaupov commented 2 months ago

Operations with symbols such as name lookup are expensive and some binaries have millions of symbols, so it makes sense to try to reduce the symbol-related overheads.

Measure how much the following cost and refactor/eliminate:

  1. Checks for symbol name being equal to __asan_init and __llvm_coverage_mapping for every single symbol: https://github.com/llvm/llvm-project/blob/a1423ba4278775472523fed074de6dbdfd01898a/bolt/lib/Rewrite/RewriteInstance.cpp#L824-L837
  2. FileName set for every local function symbol (with O(1M) for large binaries) while we can instead use file symbol lookup with O(log N_Syms) cost and FileSyms memory cost https://github.com/llvm/llvm-project/pull/89088
  3. Merge iteration over two loops over symbols into one: https://github.com/llvm/llvm-project/blob/a1423ba4278775472523fed074de6dbdfd01898a/bolt/lib/Rewrite/RewriteInstance.cpp#L824 and https://github.com/llvm/llvm-project/blob/a1423ba4278775472523fed074de6dbdfd01898a/bolt/lib/Rewrite/RewriteInstance.cpp#L876
llvmbot commented 2 months ago

@llvm/issue-subscribers-bolt

Author: Amir Ayupov (aaupov)

Operations with symbols such as name lookup are expensive and some binaries have millions of symbols, so it makes sense to try to reduce the symbol-related overheads. Measure how much the following cost and refactor/eliminate: 1. Checks for symbol name being equal to __asan_init and __llvm_coverage_mapping for every single symbol: https://github.com/llvm/llvm-project/blob/a1423ba4278775472523fed074de6dbdfd01898a/bolt/lib/Rewrite/RewriteInstance.cpp#L824-L837 2. FileName set for every local function symbol (with O(1M) for large binaries) while we can instead use file symbol lookup with O(log N_Syms) cost and FileSyms memory cost https://github.com/llvm/llvm-project/pull/89088 3. Merge iteration over two loops over symbols into one: https://github.com/llvm/llvm-project/blob/a1423ba4278775472523fed074de6dbdfd01898a/bolt/lib/Rewrite/RewriteInstance.cpp#L824 and https://github.com/llvm/llvm-project/blob/a1423ba4278775472523fed074de6dbdfd01898a/bolt/lib/Rewrite/RewriteInstance.cpp#L876
llvmbot commented 2 months ago

Hi!

This issue may be a good introductory issue for people new to working on LLVM. If you would like to work on this issue, your first steps are:

  1. Check that no other contributor has already been assigned to this issue. If you believe that no one is actually working on it despite an assignment, ping the person. After one week without a response, the assignee may be changed.
  2. In the comments of this issue, request for it to be assigned to you, or just create a pull request after following the steps below. Mention this issue in the description of the pull request.
  3. Fix the issue locally.
  4. Run the test suite locally. Remember that the subdirectories under test/ create fine-grained testing targets, so you can e.g. use make check-clang-ast to only run Clang's AST tests.
  5. Create a Git commit.
  6. Run git clang-format HEAD~1 to format your changes.
  7. Open a pull request to the upstream repository on GitHub. Detailed instructions can be found in GitHub's documentation. Mention this issue in the description of the pull request.

If you have any further questions about this issue, don't hesitate to ask via a comment in the thread below.

llvmbot commented 2 months ago

@llvm/issue-subscribers-good-first-issue

Author: Amir Ayupov (aaupov)

Operations with symbols such as name lookup are expensive and some binaries have millions of symbols, so it makes sense to try to reduce the symbol-related overheads. Measure how much the following cost and refactor/eliminate: 1. Checks for symbol name being equal to __asan_init and __llvm_coverage_mapping for every single symbol: https://github.com/llvm/llvm-project/blob/a1423ba4278775472523fed074de6dbdfd01898a/bolt/lib/Rewrite/RewriteInstance.cpp#L824-L837 2. FileName set for every local function symbol (with O(1M) for large binaries) while we can instead use file symbol lookup with O(log N_Syms) cost and FileSyms memory cost https://github.com/llvm/llvm-project/pull/89088 3. Merge iteration over two loops over symbols into one: https://github.com/llvm/llvm-project/blob/a1423ba4278775472523fed074de6dbdfd01898a/bolt/lib/Rewrite/RewriteInstance.cpp#L824 and https://github.com/llvm/llvm-project/blob/a1423ba4278775472523fed074de6dbdfd01898a/bolt/lib/Rewrite/RewriteInstance.cpp#L876
corona10 commented 2 months ago

Well, Can I take a look at it?

aaupov commented 2 months ago

Well, Can I take a look at it?

Yes, go for it!

corona10 commented 2 months ago

@aaupov Do you have any recommended projects as the target binary to measure the performance overhead?

aaupov commented 2 months ago

@aaupov Do you have any recommended projects as the target binary to measure the performance overhead?

The largest that you can get your hands on, the more symbols the better. Or CPython if it's the one that you care about the most. Clang is a reasonably large, important, and straightforward project to work with.

corona10 commented 2 months ago

perf

There are several issues (zero stack count.. :( ) when converting to the flame graph, but I success in getting the information about discoverFileObjects. It is not huge stuff for the CPython build (But it's worth checking it from Chromium or Clang) (For the CPython, around 10513 symbols were evaluated)

Anyway, I will start to optimize with following what you proposed :)

corona10 commented 2 months ago

@aaupov By the way, I noticed that there is a code that sorts 1M symbols through the rewriting process. https://github.com/llvm/llvm-project/blob/a1423ba4278775472523fed074de6dbdfd01898a/bolt/lib/Rewrite/RewriteInstance.cpp#L899

Even if the vector is an efficient data structure for most cases, it will require huge time complexity. Do you think that it's worth optimizing to use a sorted data structure? (O(n) + O(nlogn) -> O(nlogn), will be kind of constant time optimization.)

aaupov commented 2 months ago

I don't think there's much opportunity here. Let's keep this task open to address points 1 and 2.

corona10 commented 2 months ago

So, I have questions about point 1.

Checks for symbol name being equal to __asan_init and __llvm_coverage_mapping for every single symbol:

So, do you want to make NameOrError == anas_init_string or compare the string with all __asan_init variations? These are pretty naive questions, but due to my lack of background knowledge, I don't know why the original author used starts with operation for this, but to solve this issue, I should know.

When I searched about __asan_init there are kind of variation exists such as __asan_init_v2 or __asan_init_v3

aaupov commented 2 months ago

So, I have questions about point 1.

Checks for symbol name being equal to __asan_init and __llvm_coverage_mapping for every single symbol:

So, do you want to make NameOrError == anas_init_string or compare the string with all __asan_init variations? These are pretty naive questions, but due to my lack of background knowledge, I don't know why the original author used starts with operation for this, but to solve this issue, I should know.

When I searched about __asan_init there are kind of variation exists such as __asan_init_v2 or __asan_init_v3

I did experiments with removing these checks entirely and found that they make very little difference. So I guess it's just pt 2 that is worth merging, and it has a PR already.

If you're interested, I can try to find other good first issues in BOLT.

corona10 commented 2 months ago

If you're interested, I can try to find other good first issues in BOLT.

Yeah, it will be happy :)