envoyproxy / envoy

Cloud-native high-performance edge/middle/service proxy
https://www.envoyproxy.io
Apache License 2.0
25.11k stars 4.82k forks source link

Revisit default compiler flags. #4159

Open PiotrSikora opened 6 years ago

PiotrSikora commented 6 years ago

Stack guards, LTO/ThinLTO, debug symbols, etc.

cc @mattklein123 @htuch @alyssawilk @lizan

derekargueta commented 5 years ago

I've been eyeing -Wsign-conversion but there are a lot of cases of comparing signed and unsigned values. Would this flag be of interest to add to the set of warnings/errors?

htuch commented 5 years ago

@derekargueta curious what the rationale in prioritizing this flag is. No objections at my end, just wondering if there are other warnings that might be higher signal.

moderation commented 2 years ago

Saw this article about a 7% performance improvement for Chrome by enabling ThinLTO - https://blog.chromium.org/2022/03/a-new-speed-milestone-for-chrome.html

mattklein123 commented 2 years ago

I was thinking about this recently and I think we should do LTO by default. I honestly don't recall why we haven't done it yet.

alyssawilk commented 2 years ago

cc @wbpcode

mattklein123 commented 2 years ago

cc @keith who I've been talking about this offline. I think he is looking into enabling thin LTO now.

alyssawilk commented 2 years ago

Oh excellent! Can we call it out when it lands in Envoy Mobile? I want to call it out to Stan and see what it does to their A/B experiments. cc @RyanTheOptimist for visibility.

keith commented 2 years ago

There's currently an issue with antlr that causes this to not work, we're hoping that's fixed in an upcoming release, but otherwise this command works on my linux machine (if I hack around that issue):

CC=clang CXX=clang++ bazel build envoy --features=thin_lto --ltobackendopt=-Wno-unused-command-line-argument --linkopt=-fuse-ld=lld
keith commented 2 years ago

It turns out if you disable wasm you should avoid all the issues with LTO, and produce an envoy binary with LTO successfully.

As far as envoy-mobile you can also use LTO for iOS / Android by passing --copt=-flto=thin --linkopt=-flto=thin (currently the thin_lto bazel feature isn't supported on these platforms). Android has an issue combining LTO with -Osize and in my tests -Osize is better for now, once we can upgrade past NDK 22 (which isn't currently supported by bazel) we can test this again and hopefully see some results. On iOS LTO didn't have an impact on a final dylib build, but I imagine it could if you linked into the consumer with LTO, which we're currently not doing at Lyft so I cannot easily test.

Next steps:

moderation commented 2 years ago

With @keith 's help I'm compiling on RBE with WASM disabled with the following added to ~/.bazelrc. I couldn't get this to work on M1 MacOS or aarch64 Raspberry Pi

# https://github.com/envoyproxy/envoy/issues/4159#issuecomment-1062001080
build --features=-per_object_debug_info
build --features=thin_lto
build --ltobackendopt=-Wno-unused-command-line-argument
keith commented 2 years ago

I've been testing on my M1 machine as with the manual flags instead of the feature, what errors are you seeing?

moderation commented 2 years ago

Sorry, my M1 does have the updated ~/.bazelrc flags and is working

PiotrSikora commented 2 years ago

It turns out if you disable wasm you should avoid all the issues with LTO, and produce an envoy binary with LTO successfully.

What's the issue with Wasm? "full" LTO (--copt=-flto --linkopt=-flto) builds fine for me, and Wasm tests are passing.

keith commented 2 years ago

I believe this library is only enabled with wasm?

ld.lld: error: cannot open bazel-out/k8-fastbuild/bin/source/exe/envoy-static.lto/external/com_google_cel_cpp/parser/_pic_objs/cel_cc_parser/cel_grammar/cel_grammar/CelLexer.pic.o: No such file or directory
ld.lld: error: cannot open bazel-out/k8-fastbuild/bin/source/exe/envoy-static.lto/external/com_google_cel_cpp/parser/_pic_objs/cel_cc_parser/cel_grammar/cel_grammar/CelParser.pic.o: No such file or directory

Some more convo about this: https://envoyproxy.slack.com/archives/C78HA81DH/p1646697338997329?thread_ts=1646697267.395829&cid=C78HA81DH

PiotrSikora commented 2 years ago

envoy-static.lto suggests that you might be doing something else than simply adding --copt=-flto --linkopt=-flto?

LTO build works fine for me on origin/main with both libstdc++ and libc++ using Clang 13.0.1:

$ bazel build --config=clang --copt=-flto --linkopt=-flto //source/exe:envoy-static
$ bazel build --config=libc++ --copt=-flto --linkopt=-flto //source/exe:envoy-static

As for com_google_cel_cpp - it's used in Wasm, but not only there, so blaming this on Wasm might not be 100% accurate:

$ rg com_google_cel_cpp -l source
source/extensions/common/wasm/BUILD
source/extensions/access_loggers/filters/cel/BUILD
source/extensions/rate_limit_descriptors/expr/BUILD
source/extensions/filters/common/expr/BUILD
moderation commented 2 years ago

What is the difference between @keith 's

--features=-per_object_debug_info
--features=thin_lto
--ltobackendopt=-Wno-unused-command-line-argument

and @PiotrSikora 's

--copt=-flto
--linkopt=-flto
PiotrSikora commented 2 years ago

ThinLTO vs "full" LTO. The latter should be better, but requires more memory at link-time. See: http://blog.llvm.org/2016/06/thinlto-scalable-and-incremental-lto.html.

keith commented 2 years ago

I didn't even try full LTO in general, so maybe that could be the difference in the outputs? But the bazel thin_lto feature does what I assume is the most optimized version of LTO for distributed builds https://github.com/bazelbuild/bazel/blob/ccf41d94f4a4179641ff4296e0845c70a6bc4c5f/tools/cpp/unix_cc_toolchain_config.bzl#L1123-L1184

mattklein123 commented 2 years ago

FWIW the reason I had suggested thin LTO is I assumed that we would have build failures trying to do full LTO given our size, but that was just speculation.

keith commented 2 years ago

Yea I think they're supposed to be close enough that I think we'd use thin, and I imagine the issue was something else

moderation commented 2 years ago

FWIW I tried the full RTO in the RBE clusters and it failed with what looked like a Clang 12.0.1 version error. If this is the case maybe we could bump RBE Clang to 13 (I think I've seen this discussed somewhere).