R-Lum / Luminescence

Development of the R package 'Luminescence'
http://r-lum.github.io/Luminescence/
GNU General Public License v3.0
15 stars 7 forks source link

Investigate switching from Rcpp to cpp11 #297

Open mcol opened 1 month ago

mcol commented 1 month ago

Package Rcpp is very convenient but also quite bloated. Perhaps using cpp11 can provide the same convenience but with a smaller footprint. We should try to convert some of our functions and see if it's a viable approach.

mcol commented 4 weeks ago

I've converted create_UID.cpp and the result is this:

#include <R_ext/Random.h>
#include <cpp11.hpp>
#include <time.h>

// [[cpp11::register]]
cpp11::strings create_UID() {

  //define variables
  time_t rawtime;
  struct tm * timeinfo;
  char timestamp [80];

  //set date + timestamp (code snippet taken from C++ reference page)
  time (&rawtime);
  timeinfo = localtime (&rawtime);
  strftime (timestamp,80,"%Y-%m-%d-%I:%M.",timeinfo);

  // get a 4-digit random number
  //according to the CRAN policy the standard C-function, rand(), even sufficient here, is not allowed
  GetRNGstate();
  unsigned int random = (unsigned int) (unif_rand() * 10000);
  PutRNGstate();

  //combine and return results
  sprintf(timestamp, "%s%04u", timestamp, random);
  return cpp11::writable::strings( { timestamp } );
}

Here I shortened the random number to 4 digits (which may be sufficient for our purposes) as cpp11 doesn't provide an automatic way of converting a float to a string as was done previously.

Overall it was not too hard, but it took me a bit of back and forth to get the string operations to compile (the docs barely talk about strings). However I'm still getting a couple of C++ warnings for the string concatenation:

   create_UID.cpp: In function ‘cpp11::strings create_UID()’:
   create_UID.cpp:34:25: warning: ‘%04u’ directive writing between 4 and 10 bytes into a region of size between 1 and 80 [-Wformat-overflow=]
      34 |   sprintf(timestamp, "%s%04u", timestamp, random);
         |                         ^~~~
   create_UID.cpp:34:10: note: ‘sprintf’ output between 5 and 90 bytes into a destination of size 80
      34 |   sprintf(timestamp, "%s%04u", timestamp, random);
         |   ~~~~~~~^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
   create_UID.cpp:34:10: warning: ‘sprintf’ argument 3 overlaps destination object ‘timestamp’ [-Wrestrict]
   create_UID.cpp:20:8: note: destination object referenced by ‘restrict’-qualified argument 1 was declared here
      20 |   char timestamp [80];
         |        ^~~~~~~~~

Major differences:

So overall it's hard to say from this if it's worth continuing. I'll probably have to try a more involved function to see.

RLumSK commented 4 weeks ago

Sorry, I don't have good advice here, I would need to sit down with the code myself; currently I am little bit under time constraints.

mcol commented 4 weeks ago

No worries, I was just summarising my current thoughts. As I mentioned, I'll get back to it at some point to see if it's worth carrying on with this.

mcol commented 4 days ago

This may be blocked by the performance implication for analyse_IRSAR.RF(): the src_analyse_IRSARRF_SRS() C++ function uses RcppArmadillo both for its arma::vec container and for RcppArmadillo::sample(), and changing these may bring performance penalties.

I had tried to use NumericVector() instead of arma::vec, but already this seemingly innocuous change caused a little slow down; replacing RcppArmadillo::sample() with Rcpp::sample() seemed pretty neutral, but for the move to cpp11, we would ideally move the sampling to R, and that may also cause a slowdown.

RLumSK commented 4 days ago

OK, then we keep it as it is for now.