vmzakharov / dataframe-ec

A tabular data structure (aka a data frame) based on the Eclipse Collections framework
MIT License

Please support Apache Arrow Frame compatibility #25

Open thesteve0 opened 2 weeks ago

thesteve0 commented 2 weeks ago

Adding support for Apache Arrow would greatly improve interoperability between the tools used by application developers and data scientists. https://arrow.apache.org/

It would make sharing data structures between dataframe-ec and many of the other dataframe-like frameworks much more efficient. It would also allow Python users to move their work to Java for the more memory-intensive operations.

There is already a Java library for Arrow. https://arrow.apache.org/docs/java/ https://central.sonatype.com/search?q=g:org.apache.arrow%20%20v:17.0.0&smo=true

Thanks for the great work!

vmzakharov commented 1 week ago

Thank you for opening the issue. Will look into this. Note that it will be a separate project.