Closed wolfv closed 5 years ago
do you need to add the new rcpp_extensions.hpp
to the cmake? https://github.com/QuantStack/xtensor-r/blob/159905b547568b498bd9d165fc19fb016a2a7c02/CMakeLists.txt#L42
yeah! good catch. These changes should help to bring xtensor-r to the next level ... lot's of small improvements all over the place.
And when taking a rarray / rtensor as an argument by reference, we're not doing a copy (anymore) so you an operate right on the buffer! Should be a nice speed up for many cases -- the previous behavior was a bug but solving it involved diving a little deeper into Rcpp. Your benchmarks helped to spot these issues!
Btw. any chance you know how to force the R compiler to put -O3
at the end? I can add CPPFLAGS (or CXXFLAGS) in the ~/.R/Makevars
file but it only puts the flag at the beginning of the command ... so it compiles with -O2
for me (and only uses SSE, not AVX) so for me, doing some simple operation is currently slightly slower than pure R. We need to close that gap, hehe.
when taking a rarray / rtensor as an argument by reference, we're not doing a copy (anymore)
This is good! I'm pretty sure I was noticing this happening when doing some memory tests. The profmem
package helped me there.
Regarding putting -O3
at the end. This actually took me a bit to figure out. So in rray
I have some package specific flags set in src/Makevars
. (I have them slightly wrong currently):
PKG_CPPFLAGS = -I../inst/include
CXX_STD = CXX14
Generally, I think I would then add this to ~/.R/Makevars
.
CXXFLAGS = -O3
When I install the package, -O3
actually wasnt showing up for me at all. Finally I found a comment in this stackoverflow answer suggesting that because I am setting CXX_STD = CXX14
, I need to use:
CXX14FLAGS = -O3
After doing this and restarting R, I only see -O3
when I install the package.
I don't know what you were doing, but that seemed to fix it for me.
@DavisVaughan just to let you know: your tip worked!
With the new memory stuff + -O3 -march=native & xsimd, a addition is (almost) as fast as the R version. More complicated operations will have much better speedups, of course!
@wolfv happy to hear it! looking forward to getting back on this a bit later in the week and running some speed/memory benchmarks and also to finish up that other PR.
yeah – unfortunately the other PR will need a rebase now ... hope that's alright.
not a problem!
This PR contains two changes:
r_cast<...>(...)
to case e.g. a numeric vector to integer vector. Note that this will emit a warning if the warning is not suppressed, because I think this can be quite expensive.Unfortunately, Rcpp chose the rather generic
update()
name for the method that is called on the CRTP base class after the SEXP has changed. But I think this does not (yet) collide with any method that we have in xtensor.@DavisVaughan I hope this fixes the memory problems you were seeing! :)