djvanderlaan / lvec

Handling larger than memory vectors in R - core package
9 stars 1 forks source link

Unify classes (c++) #1

Open djvanderlaan opened 7 years ago

djvanderlaan commented 7 years ago

We currently have a cppr::logical and a cppr::boolean; a cppr::rstring and std::string; a cppr::integer and a int; a cppr::numeric and a double. Try to unify this: one of each type. This might also reduce the code in as_rvec for example.

For cppr::rstring this has been resolved: std::string is now used and cppr::rstring no longer exists.

eddelbuettel commented 6 years ago

How would you feel about unifying around what Rcpp does? We have close to ten years of bug fixes in.

djvanderlaan commented 6 years ago

@eddelbuettel You are fast. The package has only been on CRAN for a few hours.

Rcpp is a really nice package, but one of the disadvantages in my opinion is it's size (in stuff it implements). My own little header file fits into my head. I only need a fraction of what is implemented in Rcpp. A lightweight Rcpp would be nice. However, I am not against using Rcpp; I would indeed have the advantage or more stable code implemented by people much more versed in c++ than I am. I am a bit afraid that I would have to use things from Rcpp that are not very well documented and would have to dive into the internals of Rcpp. I am not really looking forward to that.

To be honest, I am not completely sure how to read your comment. So, I'll just ask. One could read it as 'Why are you being so headstrong and silly? I have this fantastic package that everyone is using, and you decide to write something yourself. Stupid you.' However also as 'I honestly feel that your package could improve, if you would use rcpp and should you need any pointers for that, I could help.' But it can, of course, also be both ;-)

eddelbuettel commented 6 years ago

Re 'fast': Well I built myself CRANberries (and on Twitter as @CRANberriesFeed) to keep on top of CRAN...

As for lvec, I like a lot of what I saw here, but the first (and so far only) file I opened looked like it replicated stuff:

#include "ldat.h"
#include "r_export.h"
#include <cstring>
#include <stdexcept>

extern "C" {
  SEXP as_lvec(SEXP rv) {
    CPPRTRY
    if (cppr::is<cppr::numeric>(rv)) {
    //... much more omitted

That may be suboptimal in the long run but is maybe small fries in the large. You clearly have stuff we're missing and that I thought about (ie rds serialization) and fast (mabe mmap-ed ?) disk access would be cool. I also think that many of your users will use Rcpp anyway -- it is used by about 35% of all packages on CRAN having compiled code, so we may as well help.

Then again, you use C++11 and Boost and we can't. But maybe, just maybe, we can align things and push things that are duplicated down to Rcpp, allowing lvec to be used on top not unlike other packages built on top? Just an innocent question so far :)

The big item really is that we are by construction very very close as we both communicate via R's SEXP types. In that sense, maybe we can do a little '2 + 2' trick and arrive at a little more than four :)

Lastly, installed here you take up 2.6mb. Which is exactly what the only other add-on of mine (with C++11 and external code) I looked at (RcppTOML, if you're curious) has. So ;-)