RobotLocomotion / drake

Model-based design and verification for robotics.
https://drake.mit.edu

pydrake bazel: How to get a stacktrace in debug mode? #14381

Open EricCousineau-TRI opened 3 years ago

EricCousineau-TRI commented 3 years ago

@sloretz is trying to debug some stuff here: https://github.com/sloretz/drake_ros2_demos/pull/3

I thought #14356 would've fixed it, but possibly not.

He's trying to debug, but can't easily find the point in Drake C++ / pybind11 code where things go awry.

For now, he's trying out the @traced macro from here: https://drake.mit.edu/python_bindings.html#debugging-with-the-python-bindings

I am assuming our cmake -DCMAKE_BUILD_TYPE=Debug command isn't doing anything more special than -c dbg for Bazel; but we shall see...

EricCousineau-TRI commented 3 years ago

See example trying to get a stack trace in #14380

Commit: b99ff10d6b

Command:

cd drake
bazel run -c dbg //:install -- ${PWD}/build
python_path=$(ls -d ~+/build/lib/python*/site-packages)
env -i PYTHONPATH=${python_path} /usr/bin/python3 -c 'import pydrake.common._module_py._testing as m; m.trigger_segfault()'

Triggers a segfault. coredumpctl gdb doesn't seem to turn up anything useful (says there's no stacktrace).

Running it under GDB also doesn't seem to yield useful symbols:

$ env -i PYTHONPATH=${python_path} gdb -ex run -ex backtrace --args /usr/bin/python3 -c 'import pydrake.common._module_py._testing as m; m.trigger_segfault()'
...
0x00007ffff21c09d7 in ?? () from ${PWD}/build/lib/python3.6/site-packages/pydrake/common/_module_py.so
#0  0x00007ffff21c09d7 in ?? () from ${PWD}/build/lib/python3.6/site-packages/pydrake/common/_module_py.so
#1  0x00007ffff21c662c in ?? () from ${PWD}/build/lib/python3.6/site-packages/pydrake/common/_module_py.so
#2  0x00007ffff21c5705 in ?? () from ${PWD}/build/lib/python3.6/site-packages/pydrake/common/_module_py.so
#3  0x00007ffff21c4683 in ?? () from ${PWD}/build/lib/python3.6/site-packages/pydrake/common/_module_py.so
#4  0x00007ffff21c46e0 in ?? () from ${PWD}/build/lib/python3.6/site-packages/pydrake/common/_module_py.so
#5  0x00007ffff21d4b0e in ?? () from ${PWD}/build/lib/python3.6/site-packages/pydrake/common/_module_py.so
#6  0x000000000050a4a5 in _PyCFunction_FastCallDict (kwargs=<optimized out>, nargs=<optimized out>, args=<optimized out>,
    func_obj=<built-in method trigger_segfault of PyCapsule object at remote 0x7ffff242d4e0>) at ../Objects/methodobject.c:231
#7  _PyCFunction_FastCallKeywords (kwnames=<optimized out>, nargs=<optimized out>, stack=<optimized out>, func=<optimized out>) at ../Objects/methodobject.c:294
#8  call_function.lto_priv () at ../Python/ceval.c:4851
...
EricCousineau-TRI commented 3 years ago

Using python3-dbg doesn't seem to turn up anything more useful for the Drake frames. (The Python frames, however, all show up with symbols.)

EricCousineau-TRI commented 3 years ago

Naively trying out CMake per our docs:

cd drake
mkdir build_cmake && cd build_cmake
cmake .. -DCMAKE_BUILD_TYPE=Debug -DCMAKE_INSTALL_PREFIX=${PWD}/install

EDIT: Success! That seems to do it:

$ python_path=$(ls -d ~+/build_cmake/install/lib/python*/site-packages)
$ env -i PYTHONPATH=${python_path} gdb -ex run -ex backtrace --args /usr/bin/python3 -c 'import pydrake.common._module_py._testing as m; m.trigger_segfault()'
...
0x00007ffff21c09d7 in drake::pydrake::(anonymous namespace)::testing::<lambda()>::operator()(void) const (__closure=0x10775b8) at bindings/pydrake/common/module_py.cc:91
91          *value = 0xbadf00d;
#0  0x00007ffff21c09d7 in drake::pydrake::(anonymous namespace)::testing::<lambda()>::operator()(void) const (__closure=0x10775b8) at bindings/pydrake/common/module_py.cc:91
#1  0x00007ffff21c662c in pybind11::detail::argument_loader<>::call_impl<void, drake::pydrake::(anonymous namespace)::testing::def_testing(pybind11::module)::<lambda()>&, pybind11::detail::void_type>(drake::pydrake::(anonymous namespace)::testing::<lambda()> &, std::index_sequence, pybind11::detail::void_type &&) (this=0x7fffffffe416, f=...)
    at external/pybind11/include/pybind11/cast.h:2302
#2  0x00007ffff21c5705 in pybind11::detail::argument_loader<>::call<void, pybind11::detail::void_type, drake::pydrake::(anonymous namespace)::testing::def_testing(pybind11::module)::<lambda()>&>(drake::pydrake::(anonymous namespace)::testing::<lambda()> &) (this=0x7fffffffe416, f=...) at external/pybind11/include/pybind11/cast.h:2279
#3  0x00007ffff21c4683 in pybind11::cpp_function::<lambda(pybind11::detail::function_call&)>::operator()(pybind11::detail::function_call &) const (__closure=0x0, call=...)
    at external/pybind11/include/pybind11/pybind11.h:180
#4  0x00007ffff21c46e0 in pybind11::cpp_function::<lambda(pybind11::detail::function_call&)>::_FUN(pybind11::detail::function_call &) ()
    at external/pybind11/include/pybind11/pybind11.h:158
#5  0x00007ffff21d4b0e in pybind11::cpp_function::dispatcher (self=<PyCapsule at remote 0x7ffff242d4e0>, args_in=(), kwargs_in=0x0) at external/pybind11/include/pybind11/pybind11.h:654
#6  0x000000000050a4a5 in _PyCFunction_FastCallDict (kwargs=<optimized out>, nargs=<optimized out>, args=<optimized out>,
    func_obj=<built-in method trigger_segfault of PyCapsule object at remote 0x7ffff242d4e0>) at ../Objects/methodobject.c:231
#7  _PyCFunction_FastCallKeywords (kwnames=<optimized out>, nargs=<optimized out>, stack=<optimized out>, func=<optimized out>) at ../Objects/methodobject.c:294
#8  call_function.lto_priv () at ../Python/ceval.c:4851
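
To compare the Bazel and CMake installs without eyeballing the output, something like the following could run the same reproduction under gdb and list the frames that came back as `??`. This is only a rough sketch: `unresolved_frames` and `gdb_backtrace` are made-up helper names (not part of Drake), and it assumes gdb is installed on the default system path and that the frame lines look like the ones above.

```python
import re
import subprocess
from typing import Dict, List

# Matches gdb frame lines such as
#   "#0  0x00007ffff21c09d7 in ?? () from .../_module_py.so"
#   "#7  _PyCFunction_FastCallKeywords (...) at ../Objects/methodobject.c:294"
# Group 1 is the frame number, group 2 is the function name (or "??").
FRAME_RE = re.compile(r"^#(\d+)\s+(?:0x[0-9a-f]+ in )?(\S+)", re.MULTILINE)


def unresolved_frames(backtrace: str) -> List[int]:
    """Return the numbers of the frames that gdb could only print as `??`."""
    return [int(num) for num, name in FRAME_RE.findall(backtrace) if name == "??"]


def gdb_backtrace(python_path: str, snippet: str) -> str:
    """Run `snippet` with /usr/bin/python3 under gdb and return gdb's stdout.

    Assumption: the environment is reduced to PYTHONPATH only, mirroring the
    `env -i PYTHONPATH=...` invocation used in this thread.
    """
    cmd = [
        "gdb", "-batch",           # exit once the -ex commands have run
        "-ex", "run",
        "-ex", "backtrace",
        "--args", "/usr/bin/python3", "-c", snippet,
    ]
    env: Dict[str, str] = {"PYTHONPATH": python_path}
    result = subprocess.run(cmd, env=env, capture_output=True, text=True)
    return result.stdout
```

Fed the two backtraces above, `unresolved_frames` would return frames 0 through 5 for the Bazel install and an empty list for the CMake one.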
EricCousineau-TRI commented 3 years ago

From my comment in Slack:

The killer combo with debugging is:

EricCousineau-TRI commented 3 years ago

I (or someone else ;) should PR that to Drake docs.

EricCousineau-TRI commented 3 years ago

Efff... I was the one who sprinkled that sauce in :facepalm: https://github.com/RobotLocomotion/drake/commit/ea1b198a7cf52e737dde20b43ae95bf7d9ed4844 (#10103)

TL;DR: For dev installs, use bazel run //:install -- --no_strip; otherwise the installer strips the debug symbols.
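
One way to confirm whether an installed library got stripped is to look for DWARF sections in it. The sketch below is illustrative only: it assumes `readelf` (from binutils) is on the PATH and that the debug info is embedded in the `.so` itself, not split into a separate file. `has_debug_sections` and `check_library` are hypothetical helper names, not anything Drake provides.

```python
import subprocess

# Section names that GCC and Clang normally emit when building with debug info.
DEBUG_SECTIONS = (".debug_info", ".debug_line")


def has_debug_sections(readelf_output: str) -> bool:
    """Return True if the `readelf -S` output lists every expected DWARF section."""
    return all(name in readelf_output for name in DEBUG_SECTIONS)


def check_library(path: str) -> bool:
    """Run `readelf -S` on the shared library at `path` and look for DWARF sections."""
    result = subprocess.run(
        ["readelf", "-S", "--wide", path],
        check=True,            # raise if readelf fails, e.g. not an ELF file
        capture_output=True,
        text=True,
    )
    return has_debug_sections(result.stdout)
```

If the diagnosis here is right, calling `check_library` on the installed `pydrake/common/_module_py.so` should return False for an install done without --no_strip, and True once --no_strip is passed on a -c dbg build.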

pathammer commented 1 year ago

@EricCousineau-TRI Is there any way to intercept the Bazel strip option instead? That seems a lot more intuitive.

EricCousineau-TRI commented 1 year ago

Yup, good idea! Perhaps there's a way to extract that from cc toolchain information. Unfortunately, though, it's unlikely I will have time to focus on it anytime soon.

Just to check, is this something that impacted your workflow?

pathammer commented 1 year ago

I lost a day trying to build libdrake with debug symbols before seeing this --no_strip option on the installer 😥. I can file an improvement ticket?

EricCousineau-TRI commented 1 year ago

Ah, sorry to hear that, and yup, that'd be best!

richardrl commented 10 months ago

From my comment in Slack:

The killer combo with debugging is:

from pydrake.common import set_log_level
set_log_level("trace")

Then it should be obvious how things play out.

This doesn't work anymore. I get "No stack" in gdb.

I did the following:

bazel run -j 10 --config=mosek --config=gurobi //:install -- ~/drake-build --no_strip
gdb -ex run -ex backtrace --args python3 gcs_test6_fast4.py <my script args>

Is it because set_log_level is no longer present in current drake? What is the equivalent of set_log_level("trace") now?