facebook / mariana-trench

A security focused static analysis tool for Android and Java applications.
https://mariana-tren.ch/
MIT License
1.1k stars 139 forks

Evaluate Profile-Guided Optimization (PGO) and LLVM BOLT #137

Open zamazan4ik opened 1 year ago

zamazan4ik commented 1 year ago

Feature Request

Is your feature request related to a problem? Please describe.

Not exactly a problem. Just an idea for how to improve performance.

Describe the solution you'd like

According to my tests, PGO shows measurable improvements in compiler-like workloads (Clang-Tidy, Clangd, Clang, GCC, Rustc, etc.). I think it could be helpful to check PGO for the Mariana-Trench project too. E.g. here you can check the PGO results for Clang-Tidy.

We need to run PGO benchmarks on Mariana-Trench. If they show improvements, we can add a note to the documentation about the possible performance gains from building Mariana-Trench with PGO. Providing an easier way (e.g. a build option) to build Mariana-Trench with PGO could be useful for the maintainers and end-users too.
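To make the benchmark steps concrete, here is a minimal sketch of the usual Clang PGO cycle (instrumented build, training run, profile merge, optimized rebuild), written as a Python helper that only assembles the command lines. It assumes a CMake build driven by Clang; the function name, the directory layout, and the training command are hypothetical and not taken from the Mariana-Trench build scripts.

```python
from pathlib import Path
from typing import List


def pgo_build_plan(source_dir: str, work_dir: str, workload: List[str]) -> List[List[str]]:
    """Return the commands for an instrument -> train -> merge -> rebuild cycle.

    Nothing is executed here; the caller decides how to run each command.
    `workload` is the training command (hypothetical: whatever invocation of
    the instrumented binary represents a typical analysis run).
    """
    work = Path(work_dir)
    raw_profiles = work / "profiles"         # .profraw files are written here
    merged = work / "merged.profdata"        # merged profile used by the rebuild
    instrumented = work / "build-instrumented"
    optimized = work / "build-pgo"

    def configure(build_dir: Path, flag: str) -> List[str]:
        # Assumption: the project configures with CMake and accepts extra
        # compiler flags through CMAKE_CXX_FLAGS.
        return [
            "cmake", "-S", source_dir, "-B", str(build_dir),
            "-DCMAKE_BUILD_TYPE=Release",
            "-DCMAKE_CXX_COMPILER=clang++",
            f"-DCMAKE_CXX_FLAGS={flag}",
        ]

    return [
        # 1. Build with profiling instrumentation.
        configure(instrumented, f"-fprofile-generate={raw_profiles}"),
        ["cmake", "--build", str(instrumented)],
        # 2. Run a representative workload to collect raw profiles.
        workload,
        # 3. Merge the raw profiles into a single .profdata file.
        ["llvm-profdata", "merge", "-o", str(merged), str(raw_profiles)],
        # 4. Rebuild using the merged profile.
        configure(optimized, f"-fprofile-use={merged}"),
        ["cmake", "--build", str(optimized)],
    ]
```

Comparing the run time of the binaries from `build-instrumented`'s plain Release equivalent and `build-pgo` on the same input would give the benchmark numbers discussed here. A build option could wrap these steps, but how that fits the existing build scripts is left open.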

Describe alternatives you've considered

Do nothing :)

Additional context

As an additional step after applying PGO, I can recommend trying LLVM BOLT.

arthaud commented 1 year ago

Sure, PGO and LLVM BOLT are great. We could try LTO as well.

Some feedback for your issue and your repository:

Regarding implementing PGO for this project: the setup seems non-trivial (not quite mature) and integrating it in our build system will be complicated (we have 2 of those, actually). Knowing that the performance win is not guaranteed, it seems like a risky project to take on.

NB: I can't help but notice that you opened a similar issue on more than 90 repositories. That feels a bit spammy. Not sure if there was a better way to do it.

zamazan4ik commented 1 year ago

> It could be helpful if you could give directly in your issue some concrete numbers for the potential win.

Sure. I cannot give you potential win numbers for Mariana-Trench itself right now (since I haven't integrated PGO into it yet). But as an estimate, I can point you to my results for Clang-Tidy. It is also a static analysis tool, so Clang-Tidy's results should be relevant here. If I get results for Mariana-Trench one day, I will definitely post them here.

> I personally found it a bit hard to navigate your repository with hundreds of links

Thanks for the feedback! Right now I am trying to figure out the best way to present all this information. For now, I can suggest using Ctrl+F with the name of your project (like "Clang"). Also, I've tried to group the links for similar projects (similar == the same domain) into one place, like "Compilers", "Operating systems", etc. Hope it helps.

> some are broken for me, for instance the one for Firefox results

Could you please give me this link? I just rechecked all the Firefox links and they work fine. I guess it could be a link to Google Groups, which does not allow unauthenticated access or something like that. I will wait for a bit more information before debugging this issue.

> For instance, it took me a while to find instructions to build our project with PGO and Clang.

Thanks! I will try to fix this.

> Knowing that the performance win is not guaranteed, it seems like a risky project to take on.

It definitely is not guaranteed. But at least according to my tests, PGO shows improvements in all the compiler-like workloads I have tested (static analysis, compilers, LSP implementations, even formatters). So you can estimate the likely performance wins based on the benchmarks for other projects.

> NB: I can't help but notice that you opened a similar issue on more than 90 repositories. That feels a bit spammy. Not sure if there was a better way to do it.

Me too. Honestly, I do not think it's "bad" spamming or anything like that. I see many cases (like one, two, three) where people actually thank me for raising the optimization opportunity.

So I decided to just speak with project maintainers via issues. Some are interested in PGO and we discuss it further; some are not interested in performance and close the PGO-related issues (and that's completely ok, since every project at every lifecycle stage has its own priorities). Also, via issues I collect feedback about my results (like you did by reporting the difficult navigation in the repo).

arthaud commented 1 year ago

> > some are broken for me, for instance the one for Firefox results

> Could you please give me this link? I just rechecked all the Firefox links and they work fine. I guess it could be a link to Google Groups, which does not allow unauthenticated access or something like that. I will wait for a bit more information before debugging this issue.

Oh right, it's just because we have some kind of corporate Google account which doesn't allow access to Google Groups. The link I was talking about: https://groups.google.com/g/mozilla.dev.platform/c/wwO48xXFx0A/m/ztg4i0DYAAAJ

zamazan4ik commented 1 year ago

Got it. I will extract the results directly to the repo for this case. Thanks!