RobotLocomotion / drake

Model-based design and verification for robotics.
https://drake.mit.edu
Other
3.34k stars 1.27k forks source link

Moderinize ASan configurations for newer Bazel #17268

Open jwnimmer-tri opened 2 years ago

jwnimmer-tri commented 2 years ago

At the time we added our ASan, TSan, UBSan builds we needed to roll our own configuration, because it wasn't built-in to Bazel. As of today, Bazel does support it natively.

For example, to enable ASan in the compiler toolchain, use --features=asan. Our patch would be along these lines:

--- a/tools/dynamic_analysis/bazel.rc
+++ b/tools/dynamic_analysis/bazel.rc
@@ -32,16 +32,14 @@ build:kcov_everything --nocache_test_results
 build:kcov_everything --test_timeout=600,3000,9000,36000

 ### ASan build. Clang only. ###
+build:asan --features=asan
 build:asan --build_tests_only
-build:asan --copt=-g
 # https://github.com/google/sanitizers/wiki/AddressSanitizer#faq
 build:asan --copt=-fno-common
-build:asan --copt=-fsanitize=address
 build:asan --copt=-fsanitize-address-use-after-scope
 build:asan --copt=-fstandalone-debug
 build:asan --copt=-O0
 build:asan --copt=-fno-omit-frame-pointer
-build:asan --linkopt=-fsanitize=address
 build:asan --linkopt=-fsanitize-address-use-after-scope
 # ld.gold: warning: Cannot export local symbol __asan_extra_spill_area
 build:asan --linkopt=-fuse-ld=ld

The trick here is figuring out which copts Bazel already includes as part of their feature, and which ones we still need to add in manually.

A good way to do that will be to run a build with -s -j 1 to dump the current command line, and then compare with the --features=asan command lines. Hopefully Ubuntu and macOS are on par, but we should check both.


Once this issue is closed, we can revert #17263 to add more warning checks.

svenevs commented 2 years ago

CC @xuchenhan-tri #17405 may not have had the desired outcome. I noticed that the sanitizer builds (locally and in CI) produce a large number of warnings about clang: warning: argument unused during compilation: '-fsanitize-address-use-after-scope' [-Wunused-command-line-argument]. My impression is the address sanitizer isn't actually getting used now. I snagged the known failing example and injected it into a random test

diff --git a/math/test/normalize_vector_test.cc b/math/test/normalize_vector_test.cc
index bd6f798eab..b5e6556a04 100644
--- a/math/test/normalize_vector_test.cc
+++ b/math/test/normalize_vector_test.cc
@@ -34,9 +34,17 @@ void NormalizeVectorTestFun(const Eigen::Matrix<double, nx, 1>& x) {
       1E-10, MatrixCompareType::absolute));
 }

+volatile int *p = 0;
+
 GTEST_TEST(NormalizeVectorTest, NormalizeVector) {
   NormalizeVectorTestFun<2>(Eigen::Vector2d(1.0, 0.0));
   NormalizeVectorTestFun<3>(Eigen::Vector3d(1.0, 2.0, 3.0));
+
+  {
+    int x = 0;
+    p = &x;
+  }
+  *p = 5;
 }
 }  // namespace
 }  // namespace math

bazel test --config=clang --compilation_mode=dbg --config=asan -s -j 1 //math:normalize_vector_test will report success. If I restore the old flags:

diff --git a/tools/dynamic_analysis/bazel.rc b/tools/dynamic_analysis/bazel.rc
index 7fff42dd82..671e1254c4 100644
--- a/tools/dynamic_analysis/bazel.rc
+++ b/tools/dynamic_analysis/bazel.rc
@@ -32,7 +32,10 @@ build:kcov_everything --nocache_test_results
 build:kcov_everything --test_timeout=600,3000,9000,36000

 ### ASan build. Clang only. ###
-build:asan --features=asan
+# build:asan --features=asan
+build:asan --copt=-g
+build:asan --copt=-fsanitize=address
+build:asan --linkopt=-fsanitize=address
 build:asan --build_tests_only
 # https://github.com/google/sanitizers/wiki/AddressSanitizer#faq
 build:asan --copt=-fno-common

then the test will correctly fail. Given impending release I'm opening a revert PR #17422 just to be able to launch some jobs to run the sanitizers on and see if anything comes up that we aren't catching as a result.

xuchenhan-tri commented 2 years ago

Although I'm not sure what's the right way to make progress on this issue, #17394 is (potentially) blocked by this. So I'd welcome a bump in priority on this issue.

svenevs commented 2 years ago

Yes, #17394 got reverted again in #17422 :confused: However, it wasn't ubsan it was tsan. I did not take a very close look but the related job failures are:

all of which failed to link, the gurobi / mosek ones provide a little more information:

[8:41:43 AM]  /usr/lib/gcc/x86_64-linux-gnu/9/../../../../include/c++/9/bits/locale_facets.h:874: error: undefined reference to '__tsan_read1'
[8:41:43 AM]  /usr/lib/gcc/x86_64-linux-gnu/9/../../../../include/c++/9/bits/locale_facets.h:875: error: undefined reference to '__tsan_read1'
[8:41:43 AM]  /usr/lib/gcc/x86_64-linux-gnu/9/../../../../include/c++/9/bits/char_traits.h:300: error: undefined reference to '__tsan_read1'
[8:41:43 AM]  /usr/lib/gcc/x86_64-linux-gnu/9/../../../../include/c++/9/typeinfo:123: error: undefined reference to '__tsan_read1'
[8:41:43 AM]  clang: error: linker command failed with exit code 1 (use -v to see invocation)

seems likely that it's not that far off from being fixed

jwnimmer-tri commented 2 years ago

My best guess is that the fix for the linking problem will be the same whether or not we've "modernized" the rcfiles or not. I don't think this issue is on the critical path for that fix.