Open mtanski opened 6 years ago
This should be possible to implement. For an example of how we do this between Python/C++ and Java, see: https://github.com/apache/arrow/pull/2062
In principle yes, but in practice I don't know how many extra steps are required on top of what's already here to implement something like that.
In theory I think all that's needed would be for you to dump some data into Arrow format (e.g. calling `arrowformat` on some arrays and using `writepadded` into a buffer) and then use the appropriate protocol to communicate between the two programs (admittedly I don't know what this part looks like yet).
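To make the "write padded into a buffer" idea concrete, here is a minimal C++ sketch of what the producing side might do. It is not Arrow.jl's `writepadded` or any Arrow C++ API: `padded_length` and `write_padded` are hypothetical helpers, and the only fact taken from the Arrow format is that buffers are padded to a multiple of 8 bytes (64 is recommended).

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstring>
#include <vector>

// Arrow buffers are padded to a multiple of 8 bytes (64 is recommended).
constexpr std::size_t kAlignment = 8;

// Round n up to the next multiple of kAlignment.
inline std::size_t padded_length(std::size_t n) {
    return (n + kAlignment - 1) / kAlignment * kAlignment;
}

// Hypothetical helper (not a real Arrow API): append `nbytes` raw bytes to
// `out`, then zero-fill up to the padded length.
// Returns the number of bytes appended, padding included.
inline std::size_t write_padded(std::vector<std::uint8_t>& out,
                                const void* data, std::size_t nbytes) {
    const std::size_t start = out.size();
    const std::size_t total = padded_length(nbytes);
    out.resize(start + total, 0);  // newly added bytes are zero-initialised
    if (nbytes > 0) {
        std::memcpy(out.data() + start, data, nbytes);
    }
    return total;
}
```

Writing three `int32_t` values (12 bytes) this way appends 16 bytes: the 12 data bytes followed by 4 zero bytes of padding. A real exchange would also need the metadata that tells the reader where each buffer starts, which this sketch leaves out.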
Help is welcome. Otherwise, stay tuned: there's likely to be some movement on this package in the coming weeks as we try to get it compliant with the main Arrow repo.
How about zero copy in the same address space? Transfer (or better yet, borrow) a C++ Arrow table and use it in Julia. My use case is sharing data between C++ and Julia, where the Julia code would either be called in a callback (the borrow case) or would use the result of the operation (consume, but still 0 copy).
My understanding of how that would work is basically the following:
1. Get the data as a `Vector{UInt8}` (could be an `IOBuffer` that contains one). It would be up to whatever you use to get that array to make sure this is 0 copy. Unfortunately at the moment I'm totally ignorant about what would be used to perform this initial step, but hopefully it's something simple.
2. Create `ArrowVector` objects on top of that buffer using the `Locate` interface (see README). It may be that there is some sort of standard format and metadata for IPC (in fact I think there was at least some of that); that's something that still should be implemented in Arrow.jl and isn't yet. In any case, creating the `ArrowVector` objects will not do any copying.
3. You now have `ArrowVector` objects which you can read from however you want. The semantics are the same as `Array`. So, if you have an `ArrowVector` `v` and do `v[idx]`, this will create a copy for the indices `idx`. If you do `view(v, idx)` or `@view v[idx]`, this will create a view so that there is no copying.

Sorry I can't be of more help. Certainly this is not enough to get something really polished, but perhaps it's enough for a rough implementation? Again, having something really polished for this depends to a large degree on the standardization of data layouts, since the Arrow format is quite general. (I need to go back and review the IPC stuff though, there's probably something.)
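The copy-versus-view distinction described above can be illustrated from the C++ side as well. The sketch below is only an analogy under stated assumptions: `BufferView` is a hypothetical non-owning type invented for this example, not part of Arrow C++ or Arrow.jl. Slicing it does pointer arithmetic only (like `view(v, idx)`), while `copy()` materialises new storage (like `v[idx]`).

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical non-owning view: a typed window onto memory owned elsewhere.
// The owner must outlive every view that points into it.
template <typename T>
struct BufferView {
    const T* data;
    std::size_t length;

    // Element access reads straight from the owner's memory.
    const T& operator[](std::size_t i) const { return data[i]; }

    // Sub-view of n elements starting at offset: no copy, only pointer arithmetic.
    BufferView slice(std::size_t offset, std::size_t n) const {
        return {data + offset, n};
    }

    // Materialise an independent, owning copy of the viewed elements.
    std::vector<T> copy() const {
        return std::vector<T>(data, data + length);
    }
};
```

With an owning `std::vector<std::int32_t>` and a `BufferView` over it, a later write to the owner is visible through the view and through any slice of it, but not through a vector returned earlier by `copy()`. That is the same trade-off as borrowing versus consuming data across the C++/Julia boundary: the borrowed form is free but ties the reader to the owner's lifetime.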
I'm interested in passing tabular data between C++ and Julia. Is it possible to do this in the same address space using the Julia Arrow and C++ Arrow libraries?