RobotLocomotion / drake

Model-based design and verification for robotics.
https://drake.mit.edu

Leak Sanitizer Segfaulting in CI #22215

Open wernerpe opened 2 days ago

wernerpe commented 2 days ago

What happened?

We ran into confusing behavior while looking into one of the CI tests that runs under the leak sanitizer. In the planning directory, the visibility graph test and the iris zo test (#22168) fail sporadically in CI when running 'linux-jammy-clang-bazel-experimental-leak-sanitizer'. To reproduce the error locally, we ran something like

bazel test --runs_per_test=10 --config=clang --compilation_mode=dbg --config=lsan //planning:visibility_graph_test

on Ubuntu 22.04 and found that typically 2 or 3 out of 10 runs would produce the segfault. From what we could tell, the segfault is triggered before the test body is entered, and only when more than a single thread is requested.
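For anyone who wants to measure the failure rate outside of Bazel's `--runs_per_test`, here is a minimal sketch that runs a command repeatedly and tallies its exit codes. The helper name `tally_exit_codes` is made up for illustration, and the command to run (for example, the path to the built test binary) is left to the caller, since that path is an assumption and not something confirmed in this issue.

```python
from collections import Counter
import subprocess


def tally_exit_codes(cmd, runs):
    """Run `cmd` (a list of arguments) `runs` times and count each exit code.

    On POSIX, a negative return code means the process was killed by that
    signal number. A sanitizer runtime may instead catch the fault and exit
    with an ordinary non-zero code, so every code is recorded as-is.
    """
    codes = Counter()
    for _ in range(runs):
        proc = subprocess.run(
            cmd,
            stdout=subprocess.DEVNULL,
            stderr=subprocess.DEVNULL,
            check=False,  # a failing run is data here, not an error
        )
        codes[proc.returncode] += 1
    return codes


# Hypothetical usage (the binary path is an assumption, adjust as needed):
#   codes = tally_exit_codes(["path/to/visibility_graph_test"], 10)
#   for code, count in sorted(codes.items()):
#       print(f"exit code {code}: {count} of 10 runs")
```

Recording raw exit codes, instead of only pass or fail, keeps crashes and ordinary test failures distinguishable in the tally.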

The commit SHA I have added below points to a commit on my fork of drake (from which I opened PR #22168).

Version

34437bc4957b96eb7ebfdf5421646622c3aa7d56

What operating system are you using?

Ubuntu 22.04

What installation option are you using?

No response

Relevant log output

No response

calderpg-tri commented 2 days ago

Some additional information:

calderpg-tri commented 2 days ago

I was able to reproduce the segfault on Ubuntu 24.04 using clang-15. After switching to clang-18 on 24.04, I was unable to reproduce the segfault in 1000 runs of the test. I am inclined to say this is an LSan bug.