Open aaupov opened 2 months ago
@llvm/issue-subscribers-bolt
Author: Amir Ayupov (aaupov)
Hi!
This issue may be a good introductory issue for people new to working on LLVM. If you would like to work on this issue, your first steps are:
The subdirectories under test/ create fine-grained testing targets, so you can e.g. use make check-clang-ast to only run Clang's AST tests.
Run git clang-format HEAD~1 to format your changes.
If you have any further questions about this issue, don't hesitate to ask via a comment in the thread below.
Well, can I take a look at it?
Yes, go for it!
@aaupov Do you have any recommended projects as the target binary to measure the performance overhead?
The largest one that you can get your hands on; the more symbols the better. Or CPython if it's the one that you care about the most. Clang is a reasonably large, important, and straightforward project to work with.
There are several issues (zero stack counts.. :( ) when converting to the flame graph, but I succeeded in getting the information about discoverFileObjects. It is not a huge cost for the CPython build, where around 10513 symbols were evaluated, but it's worth checking with Chromium or Clang.
Anyway, I will start optimizing by following what you proposed :)
@aaupov By the way, I noticed that there is code that sorts 1M symbols during the rewriting process. https://github.com/llvm/llvm-project/blob/a1423ba4278775472523fed074de6dbdfd01898a/bolt/lib/Rewrite/RewriteInstance.cpp#L899
Even though a vector is an efficient data structure in most cases, sorting one of this size is expensive. Do you think it's worth switching to a sorted data structure? (O(n) + O(n log n) -> O(n log n), so it would only be a constant-factor optimization.)
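For reference, here is a minimal sketch of the two approaches being compared. It is a generic illustration, not the code at the linked line: the function names are hypothetical and plain uint64_t addresses stand in for symbols.

```cpp
#include <algorithm>
#include <cassert>
#include <cstdint>
#include <set>
#include <vector>

// Option A: copy into a vector (O(n)), then sort once (O(n log n)).
std::vector<uint64_t> sortedViaVector(const std::vector<uint64_t> &Addrs) {
  std::vector<uint64_t> Out(Addrs);
  std::sort(Out.begin(), Out.end());
  return Out;
}

// Option B: insert into an ordered container. Each insertion is O(log n),
// so unsorted input still costs O(n log n) in total, plus one node
// allocation per element.
std::vector<uint64_t> sortedViaMultiset(const std::vector<uint64_t> &Addrs) {
  std::multiset<uint64_t> Ordered(Addrs.begin(), Addrs.end());
  return std::vector<uint64_t>(Ordered.begin(), Ordered.end());
}
```

Both functions return the same sorted sequence and have the same asymptotic cost, so any difference between them is a constant factor.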
I don't think there's much opportunity here. Let's keep this task open to address points 1 and 2.
So, I have questions about point 1.
Checks for symbol name being equal to __asan_init and __llvm_coverage_mapping for every single symbol:
So, do you want to make NameOrError == asan_init_string, or compare the string with all __asan_init variations?
These are pretty naive questions, but I lack the background knowledge: I don't know why the original author used a starts-with operation for this, and I need to know that to solve this issue.
When I searched for __asan_init, I found that variations exist, such as __asan_init_v2 or __asan_init_v3.
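To make the difference between the two options concrete, here is a minimal sketch of an exact comparison next to a prefix check. The helper names are hypothetical and this is not BOLT's actual code; it uses std::string_view in place of the types BOLT uses.

```cpp
#include <cassert>
#include <string_view>

// Hypothetical constant for illustration only.
constexpr std::string_view AsanInitName = "__asan_init";

// Exact comparison: matches only the unversioned name.
bool isAsanInitExact(std::string_view Name) { return Name == AsanInitName; }

// Prefix comparison: also matches versioned names such as __asan_init_v2.
// substr clamps the length, so names shorter than the prefix are safe.
bool isAsanInitPrefix(std::string_view Name) {
  return Name.substr(0, AsanInitName.size()) == AsanInitName;
}
```

With these helpers, __asan_init_v3 is rejected by the exact comparison and accepted by the prefix check, which is one plausible reason for a starts-with test in the original code.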
I did experiments with removing these checks entirely and found that they make very little difference. So I guess it's just pt 2 that is worth merging, and it has a PR already.
If you're interested, I can try to find other good first issues in BOLT.
Yeah, I would be happy with that :)
Operations with symbols such as name lookup are expensive and some binaries have millions of symbols, so it makes sense to try to reduce the symbol-related overheads.
Measure how much the following cost and refactor/eliminate: