How can I use this to debug Rcpp code?

jonlachmann commented 1 year ago

I have sometimes had issues with Rcpp code that causes segfaults etc. and have managed to solve it using this docker image. I am however wondering how it is intended to be used?

My approach has been as in this gist: https://gist.github.com/jonlachmann/890a51471e7b43b1b9eea58b9cf65eda which is quite cumbersome. Is there a more convenient way of doing it?

eddelbuettel commented 1 year ago

Maybe this is a better question for the rcpp-devel mailing list? It is pretty friendly, low-volume and the general place to ask about 'how to develop with Rcpp'. You need to subscribe in order to post (standard spam problem).

Your write-up does not strike me as 'cumbersome'. You work with a container, the container is set up to use the (instrumented) RD build of R, so you use that. (You write about needing devtools but do not show it being used; I also happen to work without it, as base R functions like install.packages() etc. do the job for me.) I am a bit surprised about the LD_PRELOAD but maybe it is needed.

The valgrind tool is also quite capable, and does not require R to be instrumented. I also recently found a bug of writing outside array bounds in C code because gcc-12 / g++-12 noticed it during compilation, without any (A)SAN mode, which was rather impressive.
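
As a rough sketch of the valgrind route: R's `-d` option starts R under a named debugger, so a script that reproduces the crash can be run under valgrind with a plain, non-instrumented R. The helper name `run_under_valgrind` and the script name `reproduce.R` are made up for illustration, and the valgrind option shown is only a starting point.

```shell
# Hypothetical helper: run an R script that triggers the crash under valgrind.
# Assumes R and valgrind are both installed and on the PATH.
run_under_valgrind() {
    script="$1"
    if [ ! -f "$script" ]; then
        echo "script not found: $script" >&2
        return 1
    fi
    # -d starts R under the given debugger; --vanilla skips profiles and
    # saved workspaces so the run is easier to reproduce.
    R -d "valgrind --track-origins=yes" --vanilla -f "$script"
}

# Usage (reproduce.R is a placeholder for a script calling the Rcpp code):
#   run_under_valgrind reproduce.R
```

Compiling the package with `-g` and a low optimization level usually makes the valgrind report easier to map back to source lines.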

Anyway, if your bug is fixed the container did its job so all is good?

jonlachmann commented 1 year ago

The main thing I am missing is probably documentation. I never use docker except for this, and am quite unfamiliar with address sanitizer etc., hence my gist to remember how I got it to work the last time. To me it is even unclear if this image is intended to be used this way?

Having a piece of Rcpp code that writes outside its memory seems like quite a common use case. And if this code can only be triggered to do so from inside its package, devtools etc. is needed to build the package and make the error happen.

I remember spending many hours figuring out I needed the LD_PRELOAD to get anything at all to work. The Makevars required to have the packages build in a way that allows ASAN to work also took a while to figure out.
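
For readers who have not seen these two pieces, here is a minimal sketch of what they typically look like with GCC. It is not the content of the gist: the flags are the usual AddressSanitizer ones, the Makevars is written to a temporary location and selected through R's `R_MAKEVARS_USER` variable rather than `~/.R/Makevars`, and `somepackage_1.2.3.tar.gz` is a placeholder.

```shell
# Sketch only: typical GCC AddressSanitizer flags, not the gist's contents.
makevars_dir="$(mktemp -d)"
makevars_file="$makevars_dir/Makevars"

cat > "$makevars_file" <<'EOF'
CFLAGS = -g -O1 -fsanitize=address -fno-omit-frame-pointer
CXXFLAGS = -g -O1 -fsanitize=address -fno-omit-frame-pointer
LDFLAGS = -fsanitize=address
EOF

# R_MAKEVARS_USER makes R read this file instead of ~/.R/Makevars.
export R_MAKEVARS_USER="$makevars_file"

# Preloading is only relevant when the R binary itself is NOT built with
# ASAN: the sanitizer runtime then has to be loaded before anything else.
# gcc -print-file-name reports where the compiler's libasan.so lives.
if command -v gcc >/dev/null 2>&1; then
    asan_lib="$(gcc -print-file-name=libasan.so)"
    # Printed rather than executed; the tarball name is a placeholder.
    echo "would run: LD_PRELOAD=$asan_lib R CMD INSTALL somepackage_1.2.3.tar.gz"
fi
```

Packages that request a specific C++ standard may pick up a different flags variable (for example `CXX17FLAGS`), so the set of variables above is an assumption that may need extending.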

Would perhaps a fork with a more stout build of this container with devtools etc. already installed be appropriate for my use case? I am at the moment running all the installations again as I did not remember to save the state of the docker with everything installed.

You are correct in that the container did its job; my goal here is to figure out if there is a more clever way of doing what I am doing, and also to spare others the many hours spent getting everything up and running.

eddelbuettel commented 1 year ago

Docker can be a little intimidating but is by now quite well documented in many tutorials available online. This repo is just one of many I provide with Docker as a tool; it would not be the place to repeat somewhat generic documentation for Docker.

In a nutshell, and while you will find this documented more extensively elsewhere, you can:

Create your own Dockerfile deriving from this one via the FROM line and then "merely" add another RUN line to call R or Rscript to install a set of R packages. (You could also use apt to get them premade for the Linux distribution the container is based on.) You then create a local container via docker build, and you could then upload it to a container registry if you wanted to.
Even quicker is to modify the Docker container while it is running (by installing packages etc): in another shell, do docker ps to see what identifier your current Docker session has and then use that in docker commit to store the modified container under a new tag.
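
The two options above could look roughly like this. It is a sketch under assumptions: the tag `local/r-devel-san-extra` and the package list are placeholders, the base image is taken to be `rocker/r-devel-san` on Docker Hub, and `RDscript` is assumed to be available next to `RD` in that image.

```shell
# Sketch: derive a local image from rocker/r-devel-san with extra packages.
# The tag "local/r-devel-san-extra" and the package list are placeholders.
build_dir="$(mktemp -d)"

cat > "$build_dir/Dockerfile" <<'EOF'
FROM rocker/r-devel-san

# Install extra R packages into the instrumented RD build.
# Some packages (devtools among them) may also need system libraries
# installed via apt-get first.
RUN RDscript -e 'install.packages(c("Rcpp", "devtools"), repos = "https://cloud.r-project.org")'
EOF

# Option 1: build the derived image from the Dockerfile written above.
build_extended_image() {
    docker build -t local/r-devel-san-extra "$build_dir"
}

# Option 2: snapshot a running, hand-modified container instead.
# In another shell:
#   docker ps                                        # note the CONTAINER ID
#   docker commit <container-id> local/r-devel-san-extra
```

Calling `build_extended_image` once and then starting sessions with `docker run --rm -ti local/r-devel-san-extra` avoids reinstalling the packages every time the container is needed.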

(I still do not believe devtools is "needed" to trigger anything as devtools is an optional package, but it may of course be integral to your workflow so that that is effectively equivalent for you. But I don't use devtools for package development which is why the package is not in the container.)

(I reiterate my recommendation for valgrind. There are lighterweight alternatives if you find Docker too heavy.)

jonlachmann commented 1 year ago

Yes, I am learning bit by bit!

You are probably correct in that I could adapt my workflow to not use devtools. I am getting curious as to how you are using this container to run the address sanitizer. Do you have a complete example? I guess I am just getting frustrated since there seems to be a well thought out idea as to how this should be used, but I am not getting it. You could say that this issue is just a request for a few lines of documentation on how this is supposed to be used.

Thanks, I think that last time I tried it I had limited success (may have been since it did not want to run on the computer I was on or similar).

eddelbuettel commented 1 year ago

Yes, it takes stamina. You'll get there.

I used to always just use R CMD INSTALL ... and friends but (many years ago) started writing myself some "little" helper scripts based on the littler package. See https://github.com/eddelbuettel/littler/tree/master/inst/examples, I use build.r and install.r (and variants) all the time ... Anyway.

I think when I was chasing bugs with ASAN I got them just by doing RD CMD check somepackage_1.2.3.tar.gz. But it's been a while.
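
A minimal sketch of that workflow, run from the host: the helper name `asan_check` is made up, the image is assumed to be `rocker/r-devel-san`, and the tarball is assumed to sit in the current directory (for example after `R CMD build somepackage`).

```shell
# Hypothetical helper: run RD CMD check on a source tarball inside the container.
# Assumes docker is installed and the tarball sits in the current directory.
asan_check() {
    tarball="$1"
    if [ ! -f "$tarball" ]; then
        echo "tarball not found: $tarball" >&2
        return 1
    fi
    # Mount the current directory at /work so the .Rcheck output directory
    # is written back to the host.
    docker run --rm -v "$(pwd)":/work -w /work rocker/r-devel-san \
        RD CMD check "$(basename "$tarball")"
}

# Usage (the tarball name is the placeholder from the comment above):
#   asan_check somepackage_1.2.3.tar.gz
```

Any sanitizer reports would be expected to show up in the check output for the examples and tests, so the package needs an example or test that actually exercises the faulty code path.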

Many of us also rely on the (much larger, but also much more complete, with more builds) container by Winston: https://github.com/wch/r-debug

jonlachmann commented 1 year ago

Thank you, this does indeed shine a bit more light on how this is intended to be used! If you want to make some of this information accessible as documentation for this image, that would be great!

I will look into how to make an extended build to avoid having to spend so much time every 6 months when I need it.

rocker-org / r-devel-san
