apache / arrow

Apache Arrow is a multi-language toolbox for accelerated data interchange and in-memory processing
https://arrow.apache.org/
Apache License 2.0

[Java/Python] in-process vector sharing from Java to Python #18209

Open asfimport opened 6 years ago

asfimport commented 6 years ago

Currently, all applications of Arrow seem to use the IPC capabilities to move data between a Java process and a Python process. While this requires no serialization, it is not zero-copy. By taking the address and offset, we can already create Python buffers from Java buffers: https://github.com/apache/arrow/pull/1693. This is still a very low-level interface, and we should provide the user with:
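To illustrate the address-and-offset idea mentioned above, here is a minimal sketch that uses only the Python standard library. It is not the code from the linked pull request: the `backing` array is a stand-in for memory that a Java process would own, and `view_int32` is a hypothetical helper name. In pyarrow itself, `pyarrow.foreign_buffer(address, size, base)` appears to play the corresponding role, but that is an assumption about how it would be used here.

```python
import ctypes

# Stand-in for a Java-owned off-heap allocation (assumption for this sketch):
# 8 bytes of unrelated data followed by four int32 values.
backing = (ctypes.c_uint8 * 24)()
address = ctypes.addressof(backing)  # what the Java side would report
offset = 8                           # byte offset of the vector data
length = 4                           # number of int32 values

# The "producer" writes values directly into that memory.
src = (ctypes.c_int32 * length).from_address(address + offset)
src[:] = [10, 20, 30, 40]


def view_int32(address, offset, length):
    """Wrap existing memory as an int32 memoryview without copying.

    Hypothetical helper. The caller must keep the underlying memory alive
    for as long as the returned view is in use.
    """
    array = (ctypes.c_int32 * length).from_address(address + offset)
    # Cast through bytes so the view has the native 'i' format.
    return memoryview(array).cast("B").cast("i")


view = view_int32(address, offset, length)
values = view.tolist()

# A later write through the producer is visible through the view,
# which shows that no copy was made.
src[0] = 99
first_after_write = view[0]
```

The view shares memory with the producer, so lifetime management becomes the consumer's responsibility, which is part of why this counts as a low-level interface.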

Reporter: Uwe Korn / @xhochy

Related issues:

Note: This issue was originally created as ARROW-2249. Please see the migration documentation for further details.

asfimport commented 5 years ago

Wes McKinney / @wesm: @xhochy you have made a lot of progress here, but a good amount of work still remains? I will move to 0.13 for now

asfimport commented 3 years ago

Antoine Pitrou / @pitrou: Ideally, the C data interface would make this nearly seamless and avoid maintaining a specific Python-Java bridge, but it must first be implemented by the Java Arrow implementation: ARROW-12965

asfimport commented 2 years ago

Todd Farmer / @toddfarmer: This issue was last updated over 90 days ago, which may be an indication it is no longer being actively worked. To better reflect the current state, the issue is being unassigned. Please feel free to re-take assignment of the issue if it is being actively worked, or if you plan to start that work soon.