An R package with functions for debuggers like 'gdb' to inspect R and Rcpp specific types in C and C++ based code called from R
This package provides functions that can be called from a debugger (e.g. gdb) to easily inspect (print) the content of R- and Rcpp-specific data types like variables or environments on the console when debugging C or C++ code called from R. This can be done without modifying the source code of the debugged code.
This package is experimental and meant as a proof of concept.
Initiated in October 2019 after some questions and discussions on Stack Overflow regarding debugging of Rcpp code with gdb.
Currently work in (slow or no ;-) progress... Nothing works. A lot of ideas are collected in the RESEARCH_NOTES.txt...
Debugging C and C++ code called from R using a debugger like gdb is often a pain because it is difficult to inspect the R- and Rcpp-specific data types (like variables or environments) with a debugger that does not know the internals of these data types.
This package is an attempt to improve this situation.
TODO
Functional requirements:
Inspect variable values and structures of R and Rcpp variables during debugging
Debugging support for C++ code
Optional: Debugging support for plain C code
Optional: Support for modifying variable values (to try to find bug fixes during debugging)
Non-functional requirements:
No need to recompile R on Windows for debugging (unless you want to debug R itself and are resistant to build headaches ;-)
Minimize preparation efforts for debugging (e.g. a main.cpp with Rinside)
Easy-to-use debugging helper functions
Let gdb decide which function to call instead of forcing the user to find the right function name by knowing the data type
Support for at least gdb and optionally the LLDB debugger
You can call the debug functions dbg_* in gdb via the call command, e.g.
(gdb) call dbg_ls()
(gdb) call dbg_str("myVar_in_global_env")
(gdb) call dbg_attributes(x)
(gdb) call dbg_print(x)
(gdb) call dbg_print(dbg_table(y))
...
The supported combinations of functions and the data type of the first argument are marked with an x in the cell.
For non-obvious cases the meaning of the first argument for the function is described in the cell.
The last column (...) contains names of functions specialized for the data type.
TODO: Add row to show signatures to make this table more cheat-sheet alike. SEXP should always return an SEXP. Show example calls...
Input Data Type | dbg_ls() | dbg_str() | dbg_print() | dbg_attributes() | dbg_table() | dbg_get() | dbg_subset() | Others... |
---|---|---|---|---|---|---|---|---|
Function description | List objects | Print object structure | Print object value | Print attributes | Create contingency table | Find object in env | Filter objects | |
Corresponding R function | ls() | str() | print() | attributes() | table() | get() | myVar[begin:end] | |
Return type | | | | | IntegerVector of class "table" | SEXP | same type as input object | |
Side effects | ||||||||
char * | object name (in global env) | object name (in global env) | object name (in global env) | object name (in global env) | dbg_as_std_string(x) | |||
std::string | TODO (semantics?) | |||||||
Rcpp::String | prints the string content | |||||||
SEXP | x | x | x | x | x | |||
Environment | print names of all objects in the env | str() of an object in the env | prints names of all objects in the env | prints attributes of an object in the env | env to search a named object | get object from the env | ||
ComplexVector | x | x | x | (not supported in R) | x | |||
DataFrame | x | x | x | (not supported) | x (filters rows like R: df[begin:end, ]) | |||
IntegerVector | x | x | x | x | x | |||
LogicalVector | x | x | x | x | x | |||
NumericVector | x | x | x | x | x | |||
DoubleVector | x | x | x | x | x | |||
RawVector | x | x | x | (not supported in R) | x | |||
CharacterVector | x | x | x | x | x | |||
StringVector | x | x | x | x | x | |||
ExpressionVector | x | x | x | (not supported in R) | x | |||
GenericVector | x | x | x | (not supported in Rcpp) | x | |||
Rcpp::List | x | x | x | (not supported by Rcpp) | x | |||
DateVector | x | x | x | x | x | |||
DatetimeVector | x | x | x | x | x | |||
... Matrices... |
Note: This table was initially created using http://www.tablesgenerator.com/markdown_tables
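The dbg_subset() column in the table filters by an element range in the style of R's myVar[begin:end]. Below is a minimal sketch of such a range filter with range checks, written against std::vector so that it compiles without R or Rcpp. The name subset_range and the 1-based, inclusive indexing are assumptions made for illustration; this is not the package's actual implementation.

```cpp
#include <cstddef>
#include <stdexcept>
#include <vector>

// Hypothetical helper (not part of CppDebugHelper): returns the elements
// begin..end of x, using 1-based inclusive indices like R's x[begin:end].
template <typename T>
std::vector<T> subset_range(const std::vector<T>& x, std::size_t begin, std::size_t end) {
    // Range check: reject anything outside 1 <= begin <= end <= size
    if (begin < 1 || end < begin || end > x.size()) {
        throw std::out_of_range("subset_range: need 1 <= begin <= end <= size");
    }
    const auto first = x.begin() + static_cast<std::ptrdiff_t>(begin - 1);
    const auto last = x.begin() + static_cast<std::ptrdiff_t>(end);
    return std::vector<T>(first, last);
}
```

Throwing on an invalid range is one possible design; a debugging helper might instead clamp the range so that a call from the debugger never raises an exception.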
To debug your own package or R code that uses C/C++ (e.g. via Rcpp) you have to follow these steps:
# install.packages("devtools")
devtools::install_github("aryoda/CppDebugHelper")
Build your package or C/C++ library called from R with debugging information
For R packages modify the Makevars file via usethis::edit_r_makevars() and add (or edit) the line CXXFLAGS = -g3 -O0 -Wall (for Linux only).
For Windows you have to add CXXFLAGS = -g3 -std=c++11. Save the file.
Don't forget to remove or comment the line later or you may slow down your R or newly installed packages!
Clean and build your package (or C/C++ library)
Open a command shell ("terminal") and cd into the location where your C/C++ source code was compiled
If you have multiple locations you can add more locations in gdb later via the directory command (see help directory in gdb).
Start the debugger:
# open a command shell ("terminal")
# on Linux use:
R -d gdb
# on Windows use
gdb /path/to/R-3.x.x/bin/x64/Rgui.exe
Note:
All examples here are based on gdb.
lldb does not work so far (for the current status see issue #5), but perhaps you are the lucky one ;-)
You can use this GDB to LLDB command map to "translate" the example gdb commands to lldb.
Start R in gdb
(gdb) run # or short: just an "r"
The R command prompt appears.
Load the CppDebugHelper package to "inject" the debug functions
library(CppDebugHelper)
Load all your packages and libraries to be able to set breakpoints in gdb
The R package to be debugged must be loaded via library.
C/C++ libraries you are calling directly via .C or .Call must be loaded via dyn.load.
Interrupt R to get back to the gdb debugger to set breakpoints
In RGui select the Misc > Break to debugger menu item
In gdb set breakpoints, e.g. in the CppDebugHelper test function:
TODO
TODO
gdb (incl. test functions as a separate R package to learn debugging)
Offer public C/C++-level functions to
(dbg_print): inspect Rcpp's Vector, Matrix and List data types
(dbg_print): inspect SEXP types (native R)
(dbg_print(Environment, varname)): inspect a variable by name (along the search path)
(dbg_print): inspect data.frames (e.g. via something like head to print the first and last n rows)
(dbg_attributes): support simple attribute queries like (names attribute?)
(dbg_ls): list all variables in an environment (default: global)
(dbg_ls): inspect R variable in an environment (default: global)
(dbg_get): get function that returns a variable from an environment to be used for filtering
(dbg_subset): simple vector filtering for important R and Rcpp data types (element range with range checks) (idea: as piped functions)
(dbg_subset): subset data.frame rows
data.frame columns
(dbg_str): inspect the str() of an R variable
(dbg_str): inspect the str() of Rcpp data types
(dbg_table): get tabulation (table in R) with limited output (may be quite chatty)
(dbg_syscalls) print sys.calls (also call Rf_PrintValue(R_GetTraceback(0))?)
(summary in R; Rcpp knows "only" table) (idea: as piped functions)
head and tail (as piped functions or just for direct printing? Printing should be the main goal IMHO...)
options("max.print") to reduce print results to a sensible maximum (default is 99999!)
(dbg_assign())
Rcpp (e.g. RcppArmadillo)
next could be presented as buttons and with keyboard shortcuts. Breakpoints could be set in the code as usual and "translated" into gdb breakpoints automatically...
Note: Each print function should respect getOption("max.print") and cut the output with [ reached getOption("max.print") -- omitted 9000 entries ]
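The max.print note can be sketched as follows. This is only an illustration of the truncation idea under stated assumptions: format_limited is a hypothetical name, the limit is passed in as a plain parameter instead of being read from R's getOption("max.print"), and the code uses std::vector so that it compiles without R or Rcpp.

```cpp
#include <cstddef>
#include <sstream>
#include <string>
#include <vector>

// Hypothetical helper (not part of CppDebugHelper): formats at most
// max_print elements and appends an R-style notice about the omitted rest.
std::string format_limited(const std::vector<int>& values, std::size_t max_print) {
    std::ostringstream out;
    const std::size_t shown = values.size() < max_print ? values.size() : max_print;
    for (std::size_t i = 0; i < shown; ++i) {
        if (i > 0) out << ' ';
        out << values[i];
    }
    if (shown < values.size()) {
        // Assumption: mimic the wording of the message quoted in the note
        out << "\n [ reached getOption(\"max.print\") -- omitted "
            << (values.size() - shown) << " entries ]";
    }
    return out.str();
}
```

Returning a string instead of printing directly keeps the truncation logic easy to check; a real debug helper would most likely print the result to the console.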
For
A function for automatic installation or at least instructions would be helpful.
TODO
Does this package also support debugging of plain C code if Rcpp is not used at all?
R is implemented in C and this package uses function overloading, which is not a supported feature in the C language (AFAIK).
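To illustrate what function overloading means here, the following sketch defines one function name for several argument types, so the caller (or a debugger that performs C++ overload resolution) can pick the right variant from the argument alone. The function describe and its overloads are hypothetical stand-ins, not the package's dbg_* functions. In plain C each variant would need its own distinct name.

```cpp
#include <string>
#include <vector>

// Hypothetical stand-ins for overloaded debug helpers: same name,
// different parameter types, selected by the type of the argument.
std::string describe(int x) {
    return "int: " + std::to_string(x);
}

std::string describe(const char* name) {
    return std::string("name: ") + name;
}

std::string describe(const std::vector<double>& v) {
    return "vector of " + std::to_string(v.size()) + " doubles";
}
```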
Verify that lldb can also be used (see FAQ entry for that)
Explain why this package delivers CPP functions instead of gdb pretty printers (short answer: this package shall be debugger-independent but pretty printers are debugger-specific)
Explain when compilers generate the code from template definitions.
Since the compiler generates the code from the template definition, it means that the full definitions need to be visible to the calling code, not only the declaration, as was the case for functions and classes.
=> Rcpp always contains the template definitions in the header files (I guess -> check it)
So gdb
cannot
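As a sketch of the template point above: a function template only produces machine code for the types it is instantiated with, so only those instantiations exist as callable symbols in the binary. An explicit instantiation definition asks the compiler to emit the code for a given type even if the program never uses it. The template describe below is hypothetical and only serves as an illustration.

```cpp
#include <sstream>
#include <string>
#include <vector>

// Hypothetical template: the full definition must be visible wherever
// it is instantiated, which is why such code usually lives in headers.
template <typename T>
std::string describe(const std::vector<T>& v) {
    std::ostringstream out;
    out << v.size() << " element(s):";
    for (const T& x : v) out << ' ' << x;
    return out.str();
}

// Explicit instantiation definitions: force code generation for these
// two types, whether or not the rest of the program calls them.
template std::string describe<int>(const std::vector<int>&);
template std::string describe<double>(const std::vector<double>&);
```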
std::string is not a primitive data type, so it cannot be changed using (gdb) set var myvar="asdf" but only by calling its assign member function.
See the gdb documentation of C++ Expressions for what is possible and which limitations apply:
gdb regarding the R and Rcpp API
You can call most of the dbg_* functions only from within your C/C++ code (when a breakpoint is hit) but not by interrupting the R main loop (pressing Ctrl+C at the R command prompt), since this leads to strange gdb error messages when doing (gdb) call dbg_ls() or (gdb) call dbg_str("myVar"), for example.
Local R/Rcpp variables exist and can be used to call dbg_* functions before they are initialized.
This may cause segfaults (of course):
(gdb) n
19 Environment e = Rcpp::Environment::global_env();
(gdb) p e
$7 = {
<Rcpp::PreserveStorage<Rcpp::Environment_Impl<Rcpp::PreserveStorage> >> = {
data = 0x7fffffffb940 },
<Rcpp::SlotProxyPolicy<Rcpp::Environment_Impl<Rcpp::PreserveStorage> >> = {<No data fields>},
<Rcpp::AttributeProxyPolicy<Rcpp::Environment_Impl<Rcpp::PreserveStorage> >> = {<No data fields>},
<Rcpp::RObjectMethods<Rcpp::Environment_Impl<Rcpp::PreserveStorage> >> = {<No data fields>},
<Rcpp::BindingPolicy<Rcpp::Environment_Impl<Rcpp::PreserveStorage> >> = {<No data fields>}, <No data fields>}
(gdb) call dbg_print(e)
8
Program received signal SIGSEGV, Segmentation fault.
0x00007ffff7836a80 in getAttrib0 (vec=0x7fffffffb940, name=0x55555576c9a8) at attrib.c:142
...
gdb does not apply pagination for Rcpp::print() output
gdb's display command does not work with the call command (display call f() is not allowed), so printing each time the program stops in gdb does not work
gdb does not know the C++ bool constants true and false that could be used as an argument in function calls. Luckily Rcpp includes the R API header file Boolean.h which defines an enum for that (typedef enum { FALSE = 0, TRUE /*, MAYBE */ } Rboolean;) so you can use the constants TRUE and FALSE instead:
// CPP declaration
// void dbg_test(bool b);
(gdb) call dbg_test(true)
No symbol "true" in current context.
(gdb) call dbg_test(TRUE)
1
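The reason TRUE and FALSE are accepted for a bool parameter can be sketched in plain C++: the enumerators of an unscoped enum convert implicitly to bool. The enum below repeats the definition quoted above so that the sketch compiles without the R headers, and this dbg_test is a hypothetical variant that returns a string instead of printing.

```cpp
#include <string>

// Same shape as the enum quoted from R's Boolean.h; redefined here only
// so that this sketch is self-contained (assumption: no R headers included).
typedef enum { FALSE = 0, TRUE } Rboolean;

// Hypothetical function taking a C++ bool, like the declaration above
std::string dbg_test(bool b) {
    return b ? "1" : "0";
}
```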
gdb does not support function default values, e.g.:
// CPP declaration
// void dbg_test(int a = 1, bool b = true, const char *name = "myname");
(gdb) call dbg_test()
Too few arguments in function call.
(gdb) call dbg_test(10)
Too few arguments in function call.
(gdb) call dbg_test(10, 1, "hello")
10-1-hello
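One possible workaround, shown here only as a sketch, is to replace default values with explicit overloads that forward to the full signature, so that every argument count exists as a real function a debugger can call. The dbg_test variants below are hypothetical and return a string instead of printing; the default values are taken from the declaration in the example.

```cpp
#include <string>

// Full signature without default values
std::string dbg_test(int a, bool b, const char* name) {
    return std::to_string(a) + "-" + (b ? "1" : "0") + "-" + name;
}

// Forwarding overloads that supply the former default values
std::string dbg_test(int a, bool b) { return dbg_test(a, b, "myname"); }
std::string dbg_test(int a) { return dbg_test(a, true); }
std::string dbg_test() { return dbg_test(1); }
```

The trade-off is more boilerplate per function, in exchange for not having to spell out every argument at the debugger prompt.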
Where is Makevars stored?
# Where is my home folder?
path.expand("~")
# Where is R's default Makefile configuration?
file.path(R.home("etc"), "Makeconf")
[1] "/usr/lib/R/etc/Makeconf"
# In this file you will find the build flags (variables) like: PKG_CXXFLAGS, PKG_LIBS...
# Where is the user Makevars file (which overrides the default make file generated by R)?
usethis::edit_r_makevars()
If you get errors or warnings like
the used CPP compiler uses an older C++ standard as default.
To build successfully you have to enable a newer C++ standard (at least C++11) in the Makevars file:
CXXFLAGS = -g3 -O0 -Wall -std=c++11
# CXXFLAGS = -g3 -O0 -Wall -std=c++14 # or this for C++14
For details see:
R has no -d switch)?
Precondition: The code under inspection has been compiled for debugging (TODO: add link to howto instructions)
On Linux you can debug an R script or R package with gdb via R -d gdb.
This starts R and attaches gdb as the debugger.
The gdb version delivered on Windows via the Rtools does not support the -d switch (perhaps because you cannot send a signal to running processes to stop the execution for debugging like Linux does, e.g. with the shortcut Ctrl+C).
But: To set a breakpoint in Windows DLL it must be loaded first and this requires R to be started.
Once you have started R you cannot pause R (with out-of-the-box Windows ways) to call the debugger.
The solution is to "debug" the RGui.exe which has a built-in menu item Misc > Break to debugger.
This allows you to pause R and work in gdb:
gdb /path/to/R/bin/x64/Rgui.exe
To make the source code visible in gdb you also have to use the directory command in gdb to add the path to your source code.
For details see the R for Windows FAQ
gdb on Windows I see a lot of warnings containing attribute.cpp(92)\dwmapi.dll
You can see a lot of warnings similar to this one:
windows\dwm\dwmapi\attribute.cpp(92)\dwmapi.dll!00007FFFAD51594E: (caller: 00007FFFABCA071A) ReturnHr(30) tid(10a0) 80070006 The handle is invalid.
This is a Windows 10 bug. It seems Microsoft has forgotten to disable tracing code before releasing this DLL.
You can ignore these warnings (even though it is annoying to be flooded with them).
For details see: https://social.msdn.microsoft.com/Forums/en-US/3a5a145a-c13d-4898-bb61-a5baadc9332f/why-am-i-getting-hundreds-of-weird-messages-in-debug-output-window
gdb on Windows fails with error "cannot execute this command while the selected thread is running"
Sometimes during debugging it is no longer possible to step (via the next command) or continue the execution.
The error message is:
(gdb) c
Continuing.
Cannot execute this command while the selected thread is running.
This problem was observed with gdb --version GNU gdb (GDB) 7.9.1 as installed with Rtools v3.5 and is not easily reproducible.
This seems to be a bug but the exact reason is unclear:
gdb when debugging with R
R -d gdb --debugger-args=--quiet
gdb?
You can use the R argument --debugger-args=ARGS where ARGS are arguments to the debugger.
R -d gdb --debugger-args=--quiet
gdb directly without R?
The easiest way to load the debugging helper functions is via R's library(CppDebugHelper) command, but if you want to debug an application that does use R directly you can load the underlying shared library using the LD_PRELOAD environment variable:
gdb myApp
(gdb) set environment LD_PRELOAD ./CppDebugHelper.so
(gdb) run
(gdb) # Press Ctrl+C to break R into gdb
(gdb) # verify that the library has been loaded by gdb
(gdb) info sharedlibrary
From To Syms Read Shared Object Library
0x00007ffff7bbc040 0x00007ffff7bca2e0 Yes ./CppDebugHelper.so
...
(gdb) # The debug functions are now available, eg.
(gdb) ptype dbg_as_std_string("hello world")
See:
dbg_*) in gdb fails with: Attempt to take address of value not located in memory.
When you call a debug function in gdb you will get an error message like
Attempt to take address of value not located in memory.
if a function argument is a variable that is not stored in memory but in a CPU register:
(gdb) call dbg_print(x)
Attempt to take address of value not located in memory.
This most probably occurs because you have enabled code optimization during compilation.
Check whether the -O flag in the Makevars file is set to -O0 ("optimization = zero").
Then clean up the binaries of your code and recompile (Build > Clean & Rebuild in RStudio).
lldb debugger instead of gdb?
lldb is the default debugger in Xcode on macOS/OS X for C++ and is typically used with the Clang compiler.
All examples here are based on gdb but it should (= not yet tested!) be possible to use lldb instead of gdb because this package does not depend on any special debugger.
You can use this GDB to LLDB command map to "translate" the example gdb commands to lldb commands.
In practice it did not work so far (for the current status see issue #5).
GPL-3 (see file LICENSE)
gdb
gdb does not know default arguments:
gdb's pretty printers)
lldb
lldb works on a OS X/Mac
Rcpp
Rcpp
R API