Open jamesblackburn opened 8 years ago
Wes McKinney has been working on Arrow (https://arrow.apache.org/, https://github.com/apache/arrow) as a DataFrame serialization and interoperability layer. Background: https://blog.cloudera.com/blog/2016/02/introducing-apache-arrow-a-fast-interoperable-in-memory-columnar-data-structure-standard/ and http://wesmckinney.com/blog/feather-and-apache-arrow/
Arrow has seen a fair bit of buy-in as a common data layer from the wider data science community, including interop with Spark, Pandas, Drill and Impala, and with Cassandra, HBase and others on the storage side.
Due to its uptake, Arrow also became an Apache Top-Level Project directly, bypassing the incubator: http://www.theregister.co.uk/2016/02/17/apache_arrow_toplevel_project/
Making arctic Arrow-compatible may make it easier to integrate arctic with downstream data processing systems.
I'm curious, do you foresee this affecting the storage spec? Specifically,
DataFrame