abseil / abseil-cpp

Abseil Common Libraries (C++)
https://abseil.io
Apache License 2.0

SymbolizeStackConsumption fails when run alone #1012

Open AlexK-BD opened 3 years ago

AlexK-BD commented 3 years ago

Describe the bug

In some builds (see below), running symbolize_test with --gtest_filter=Symbolize.SymbolizeStackConsumption causes that test to fail, e.g.

[ RUN      ] Symbolize.SymbolizeStackConsumption
/home/BOSDYN/akhripin/bdi/3rdparty_src/abseil-cpp/absl/debugging/symbolize_test.cc:253: Failure
Expected: (stack_consumed) < (GetStackConsumptionUpperLimit()), actual: 2272 vs 2048
/home/BOSDYN/akhripin/bdi/3rdparty_src/abseil-cpp/absl/debugging/symbolize_test.cc:261: Failure
Expected: (stack_consumed) < (GetStackConsumptionUpperLimit()), actual: 2272 vs 2048
[  FAILED  ] Symbolize.SymbolizeStackConsumption (0 ms)
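
For context, a number like the 2272 bytes above can be obtained by painting a stack with a known byte pattern, running the code on that stack, and counting how much of the pattern was overwritten. The following is a minimal sketch of that idea, assuming Linux/POSIX and a stack that grows toward lower addresses. The names MeasureStack, ProbeHandler, UsesBuffer, and Empty are hypothetical, and this is not claimed to be how Abseil's own test helper is written.

```cpp
#include <signal.h>

#include <cstddef>
#include <cstdlib>
#include <vector>

// All names here are hypothetical; this is not Abseil's test helper.
constexpr unsigned char kFill = 0x55;           // pattern painted over the stack
constexpr std::size_t kAltStackSize = 1 << 16;  // 64 KiB, assumed to be ample

void (*g_probe)() = nullptr;  // function the handler should run

void ProbeHandler(int) {
  if (g_probe != nullptr) g_probe();
}

// Workload with a 1 KiB local buffer; volatile keeps the writes in place.
void UsesBuffer() {
  volatile char buf[1024];
  for (std::size_t i = 0; i < sizeof(buf); ++i) buf[i] = static_cast<char>(i);
}

void Empty() {}

// Runs fn in a SIGUSR1 handler on a pattern-filled alternate signal stack and
// returns the number of bytes between the deepest overwritten byte and the
// top of that stack. Assumes the stack grows toward lower addresses.
std::size_t MeasureStack(void (*fn)()) {
  std::vector<unsigned char> stack(kAltStackSize, kFill);

  stack_t ss = {};
  ss.ss_sp = stack.data();
  ss.ss_size = stack.size();
  stack_t old_ss = {};
  if (sigaltstack(&ss, &old_ss) != 0) std::abort();

  struct sigaction sa = {};
  sa.sa_handler = ProbeHandler;
  sa.sa_flags = SA_ONSTACK;  // deliver the signal on the alternate stack
  sigemptyset(&sa.sa_mask);
  struct sigaction old_sa = {};
  if (sigaction(SIGUSR1, &sa, &old_sa) != 0) std::abort();

  g_probe = fn;
  raise(SIGUSR1);  // returns only after the handler has finished
  g_probe = nullptr;

  // Restore the previous handler and alternate-stack settings.
  if (sigaction(SIGUSR1, &old_sa, nullptr) != 0) std::abort();
  if (sigaltstack(&old_ss, nullptr) != 0) std::abort();

  // Bytes at the low end that still hold the pattern were never used.
  std::size_t untouched = 0;
  while (untouched < stack.size() && stack[untouched] == kFill) ++untouched;
  return stack.size() - untouched;
}
```

MeasureStack(Empty) gives a baseline that includes the kernel's signal frame, so the cost of a workload is better read as MeasureStack(UsesBuffer) minus that baseline. The absolute values depend on the kernel, CPU, compiler, and optimization flags, which fits the observation that the reported numbers shift between builds.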

Steps to reproduce the bug

I have reproduced this in several Linux x86_64 builds on my Ubuntu 18.04 machine. Notably, the bazel build without -O2 does work.

cmake

cmake -DCMAKE_BUILD_TYPE=Release -DBUILD_TESTING=ON -DABSL_USE_GOOGLETEST_HEAD=ON ..
make -j32
./bin/absl_symbolize_test # PASSES
./bin/absl_symbolize_test --gtest_filter=Symbolize.SymbolizeStackConsumption # FAILS

bazel with -O2

I'm very new to bazel, but:

bazel test  --copt=-O2   //absl/debugging:symbolize_test
bazel-out/k8-fastbuild/bin/absl/debugging/symbolize_test # PASSES
bazel-out/k8-fastbuild/bin/absl/debugging/symbolize_test --gtest_filter=Symbolize.SymbolizeStackConsumption # FAILS

This uses gcc 7 by default. I've tried gcc 8, clang 8, and clang 10. The amount by which the stack limit is exceeded varies, but all of them exhibit the problem. Variations like -O3 and -Os, architecture switches, etc., change the exact max stack size, but all exhibit the same problem.

This also fails: env CC=g++-8 bazel test --copt=-O1 --copt=-fcaller-saves //absl/debugging:symbolize_test

What version of Abseil are you using?

4bb9e39c88854dbf466688177257d11810719853 (master as of a few hours ago). The problem also manifested with code from Jul 2 (58e042da9210710dc4ac3b320e48b54e2449521e).

What operating system and version are you using?

Ubuntu 18.04, x86_64

What compiler and version are you using?

Default gcc:

Using built-in specs.
COLLECT_GCC=gcc
COLLECT_LTO_WRAPPER=/usr/lib/gcc/x86_64-linux-gnu/7/lto-wrapper
OFFLOAD_TARGET_NAMES=nvptx-none
OFFLOAD_TARGET_DEFAULT=1
Target: x86_64-linux-gnu
Configured with: ../src/configure -v --with-pkgversion='Ubuntu 7.5.0-3ubuntu1~18.04' --with-bugurl=file:///usr/share/doc/gcc-7/README.Bugs --enable-languages=c,ada,c++,go,brig,d,fortran,objc,obj-c++ --prefix=/usr --with-gcc-major-version-only --program-suffix=-7 --program-prefix=x86_64-linux-gnu- --enable-shared --enable-linker-build-id --libexecdir=/usr/lib --without-included-gettext --enable-threads=posix --libdir=/usr/lib --enable-nls --enable-bootstrap --enable-clocale=gnu --enable-libstdcxx-debug --enable-libstdcxx-time=yes --with-default-libstdcxx-abi=new --enable-gnu-unique-object --disable-vtable-verify --enable-libmpx --enable-plugin --enable-default-pie --with-system-zlib --with-target-system-zlib --enable-objc-gc=auto --enable-multiarch --disable-werror --with-arch-32=i686 --with-abi=m64 --with-multilib-list=m32,m64,mx32 --enable-multilib --with-tune=generic --enable-offload-targets=nvptx-none --without-cuda-driver --enable-checking=release --build=x86_64-linux-gnu --host=x86_64-linux-gnu --target=x86_64-linux-gnu
Thread model: posix
gcc version 7.5.0 (Ubuntu 7.5.0-3ubuntu1~18.04)

Other compilers tested: gcc 8, clang 8, clang 10.

What build system are you using?

bazel 4.0.0 and cmake version 3.10.2

Additional context

I came across this because I integrated abseil (tests and all) into our gn-based build system. Our parallel test framework runs test cases from each gtest one at a time (each in a separate process).

Running both stack consumption tests in one process also fails on both (--gtest_filter=Symbolize*Stack*).

Other interesting multi-test cases:

[----------] Global test environment tear-down
[==========] 3 tests from 1 test suite ran. (1 ms total)
[  PASSED  ] 2 tests.
[  FAILED  ] 1 test, listed below:
[  FAILED  ] Symbolize.SymbolizeWithDemanglingStackConsumption

One of the stack consumption tests passes, but the other uses way more stack than any other observed failure (3696 bytes; others have been 2100-2200).

AlexK-BD commented 3 years ago

I used some ham-handed instrumentation to see where the stack was growing.

The extreme case (running bazel-out/k8-fastbuild/bin/absl/debugging/symbolize_test --gtest_filter=Symbolize*Stack*:*Cache*) is due to the snprintf call in ReadAddrMap, which uses a lot of stack.

In the normal case (all gtests), by the time the stack consumption tests run, FindSymbolInCache succeeds and almost none of Symbolizer::GetSymbol runs as a result. When the tests are run in isolation, FindSymbolInCache returns null and the rest of GetSymbol runs.
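
The order dependence can be illustrated with a toy model. The sketch below is not Abseil's code: ToySymbolizer, SlowLookup, and slow_path_runs are hypothetical names, and unlike the real symbolizer this toy allocates on the heap. It only shows that a lookup which hits a warm cache never enters the expensive path, so a measurement taken after earlier lookups sees far less work than one taken in a fresh process.

```cpp
#include <cstdint>
#include <cstdio>
#include <string>
#include <unordered_map>

// Toy stand-in for a symbolizer with a lookup cache. Not Abseil's code.
class ToySymbolizer {
 public:
  const std::string& GetSymbol(std::uintptr_t pc) {
    auto it = cache_.find(pc);
    if (it != cache_.end()) return it->second;  // warm cache: cheap path only

    // Cold cache: the expensive path runs, then its result is remembered.
    ++slow_path_runs_;
    return cache_.emplace(pc, SlowLookup(pc)).first->second;
  }

  // How many times the expensive path has executed so far.
  int slow_path_runs() const { return slow_path_runs_; }

 private:
  // Stands in for the work done on a cache miss, including a formatting
  // call with a local buffer.
  static std::string SlowLookup(std::uintptr_t pc) {
    char line[256];
    std::snprintf(line, sizeof(line), "sym_%llx",
                  static_cast<unsigned long long>(pc));
    return line;
  }

  std::unordered_map<std::uintptr_t, std::string> cache_;
  int slow_path_runs_ = 0;
};
```

In this model, the first GetSymbol call for an address increments slow_path_runs, and a second call for the same address does not. A stack measurement wrapped around the second call would therefore miss whatever the slow path consumes, which matches the behavior described above when earlier tests have already populated the cache.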