xtensor-stack / xtensor-r

R bindings for xtensor
BSD 3-Clause "New" or "Revised" License
86 stars 15 forks source link

Revamp R memory model #68

Closed wolfv closed 5 years ago

wolfv commented 5 years ago

This PR contains two changes:

Unfortunately, Rcpp chose the rather generic update() name for the method that is called on the CRTP base class after the SEXP has changed. But I think this does not (yet) collide with any method that we have in xtensor.

@DavisVaughan I hope this fixes the memory problems you were seeing! :)

DavisVaughan commented 5 years ago

do you need to add the new rcpp_extensions.hpp to the cmake? https://github.com/QuantStack/xtensor-r/blob/159905b547568b498bd9d165fc19fb016a2a7c02/CMakeLists.txt#L42

wolfv commented 5 years ago

yeah! good catch. These changes should help to bring xtensor-r to the next level ... lot's of small improvements all over the place.

And when taking a rarray / rtensor as an argument by reference, we're not doing a copy (anymore) so you an operate right on the buffer! Should be a nice speed up for many cases -- the previous behavior was a bug but solving it involved diving a little deeper into Rcpp. Your benchmarks helped to spot these issues!

Btw. any chance you know how to force the R compiler to put -O3 at the end? I can add CPPFLAGS (or CXXFLAGS) in the ~/.R/Makevars file but it only puts the flag at the beginning of the command ... so it compiles with -O2 for me (and only uses SSE, not AVX) so for me, doing some simple operation is currently slightly slower than pure R. We need to close that gap, hehe.

DavisVaughan commented 5 years ago

when taking a rarray / rtensor as an argument by reference, we're not doing a copy (anymore)

This is good! I'm pretty sure I was noticing this happening when doing some memory tests. The profmem package helped me there.

Regarding putting -O3 at the end. This actually took me a bit to figure out. So in rray I have some package specific flags set in src/Makevars. (I have them slightly wrong currently):

PKG_CPPFLAGS = -I../inst/include
CXX_STD = CXX14

Generally, I think I would then add this to ~/.R/Makevars.

CXXFLAGS = -O3

When I install the package, -O3 actually wasnt showing up for me at all. Finally I found a comment in this stackoverflow answer suggesting that because I am setting CXX_STD = CXX14, I need to use:

CXX14FLAGS = -O3

After doing this and restarting R, I only see -O3 when I install the package.

I don't know what you were doing, but that seemed to fix it for me.

wolfv commented 5 years ago

@DavisVaughan just to let you know: your tip worked!

With the new memory stuff + -O3 -march=native & xsimd, a addition is (almost) as fast as the R version. More complicated operations will have much better speedups, of course!

DavisVaughan commented 5 years ago

@wolfv happy to hear it! looking forward to getting back on this a bit later in the week and running some speed/memory benchmarks and also to finish up that other PR.

wolfv commented 5 years ago

yeah – unfortunately the other PR will need a rebase now ... hope that's alright.

DavisVaughan commented 5 years ago

not a problem!