Closed ajinkya-k closed 2 weeks ago
Do you want to operate on the arrow table with libarrow or some custom C++ code (potentially using other libraries)?
I want to use custom C++ code that will need other libraries
Sorry for the late reply. I am not sure that's possible with cpp11 (which the arrow package uses), but that's not my specialty. I found this related issue: https://github.com/apache/arrow/issues/36274
So is this an inherent limitation of cpp11?
I don't really know, sorry. Maybe @jonkeane or @paleolimbot can chime in?
It seems like this should be possible @ajinkya-k, see https://github.com/apache/arrow/issues/36274#issuecomment-1607431346 and let us know if you think you could adapt that code to your use case. Also note the caveats in that thread.
Hi @amoeba, thanks for sharing the thread. As is clear in the thread, there is no guarantee of stability which means I cannot roll it up into a package. I was hoping there would be a more stable and permanent way to do this. If not, it might be worth putting in a feature request.
I think being able to access the exact same Arrow object from both `R` and `C++` would be very important for enabling more scalable Bayesian analyses that have to rely on `C++` code out of necessity. In some of the applications that I am thinking of, summary statistics of specific subsets of the data must be computed in `C++`. This can be achieved very efficiently using `filter` and `group_by + summarize` in `C++`. But in every iteration of the MCMC loop, the subset of units to be filtered on or grouped will differ. This is why the Arrow object must be available in `C++`.
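To make the workload concrete, here is a minimal sketch of a per-iteration "filter, then group and summarize" step. It uses only the C++ standard library as a stand-in and is not Arrow's compute API. The function name `grouped_sum` and the three column vectors are hypothetical, chosen for illustration only.

```cpp
#include <cassert>
#include <cstddef>
#include <map>
#include <vector>

// Hypothetical helper (not an Arrow function): sum `value` within each
// `group`, using only the rows where `keep[i]` is true. In an MCMC loop,
// `keep` and/or `group` would change from one iteration to the next.
std::map<int, double> grouped_sum(const std::vector<int>& group,
                                  const std::vector<double>& value,
                                  const std::vector<bool>& keep) {
  // All three columns must describe the same rows.
  assert(group.size() == value.size() && keep.size() == value.size());

  std::map<int, double> sums;
  for (std::size_t i = 0; i < value.size(); ++i) {
    if (keep[i]) {
      sums[group[i]] += value[i];  // operator[] value-initializes to 0.0
    }
  }
  return sums;
}
```

Calling `grouped_sum` once per iteration with a fresh `keep` mask mirrors the pattern described above: the data stay fixed while the subset changes.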
The examples given in https://github.com/apache/arrow/issues/36274 should be stable because they use the Arrow C Data Interface, with the help of the `nanoarrow` package, to pass the `arrow::Table` between C++ and R. My interpretation of @paleolimbot's comment was that it's specifically passing pointers to `arrow::Table`s that's not considered stable. But going through the C Data Interface is stable and is even the Arrow project's recommended way of doing this kind of thing.
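As a rough illustration of why this route is stable: the C Data Interface consists of plain C structs whose layout is fixed by the Arrow specification, so producer and consumer only have to agree on that layout, not on a particular libarrow build. The sketch below copies the two struct definitions from the specification and passes a single non-nullable float64 column from a toy producer to a toy consumer. `Float64Owner`, `export_float64`, and `sum_float64` are hypothetical names for this example. In a real setup the producer would be the R side and the consumer would typically use the import functions in `arrow/c/bridge.h`.

```cpp
#include <cassert>
#include <cstdint>
#include <cstring>
#include <utility>
#include <vector>

// Struct layouts as given in the Arrow C Data Interface specification.
#ifndef ARROW_C_DATA_INTERFACE
#define ARROW_C_DATA_INTERFACE
struct ArrowSchema {
  const char* format;
  const char* name;
  const char* metadata;
  int64_t flags;
  int64_t n_children;
  struct ArrowSchema** children;
  struct ArrowSchema* dictionary;
  void (*release)(struct ArrowSchema*);
  void* private_data;
};

struct ArrowArray {
  int64_t length;
  int64_t null_count;
  int64_t offset;
  int64_t n_buffers;
  int64_t n_children;
  const void** buffers;
  struct ArrowArray** children;
  struct ArrowArray* dictionary;
  void (*release)(struct ArrowArray*);
  void* private_data;
};
#endif

// Hypothetical producer-side state: owns the values and the buffer table.
struct Float64Owner {
  std::vector<double> values;
  const void* buffers[2];
};

static void release_schema(struct ArrowSchema* schema) {
  // format and name are string literals here, so there is nothing to free.
  schema->release = nullptr;  // marks the struct as released
}

static void release_array(struct ArrowArray* array) {
  delete static_cast<Float64Owner*>(array->private_data);
  array->release = nullptr;  // marks the struct as released
}

// Toy producer: export a non-nullable float64 column (format string "g").
void export_float64(std::vector<double> values, ArrowSchema* schema,
                    ArrowArray* array) {
  auto* owner = new Float64Owner{std::move(values), {nullptr, nullptr}};
  owner->buffers[0] = nullptr;               // no validity bitmap (no nulls)
  owner->buffers[1] = owner->values.data();  // data buffer

  *schema = ArrowSchema{"g", "x", nullptr, 0, 0,
                        nullptr, nullptr, release_schema, nullptr};
  *array = ArrowArray{static_cast<int64_t>(owner->values.size()), 0, 0, 2, 0,
                      owner->buffers, nullptr, nullptr, release_array, owner};
}

// Toy consumer: sum the column, then release both structs.
double sum_float64(ArrowSchema* schema, ArrowArray* array) {
  assert(schema->release != nullptr && array->release != nullptr);
  assert(std::strcmp(schema->format, "g") == 0);  // float64 only
  assert(array->null_count == 0);                 // sketch ignores nulls

  const double* data =
      static_cast<const double*>(array->buffers[1]) + array->offset;
  double total = 0.0;
  for (int64_t i = 0; i < array->length; ++i) total += data[i];

  array->release(array);    // consumer hands ownership back to the producer
  schema->release(schema);
  return total;
}
```

The only contract between the two functions is the struct layout and the release callbacks, which is what makes the interface independent of how either side was compiled. A full table would add child arrays (one per column) or use the stream variant of the interface, which this sketch leaves out.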
Thanks! I will give it a try
Hi @ajinkya-k, I'm going to close this for now but please feel free to re-open and/or comment here. I'm curious if you were able to get something to work.
Describe the usage question you have. Please include as many useful details as possible.
I was curious how I can pass arrow objects from `R` to `C++` (kind of like `R` vectors via `Rcpp::NumericVector`). Here's an example of what I am looking for:
Say I have a function `sample_post` in `R` that takes in an `arrow` table and some parameters:
For a little more concreteness, let's say `some_fn_cpp` is doing `group_by` summaries in each iteration of a loop. In `some_fn_rcpp`, what should be the type for the first argument that corresponds to `arrow_tbl`?
NOTE: I would prefer using `Rcpp` but not tied to it. I am okay using something else.
Component(s)
R