apache / mxnet

Lightweight, Portable, Flexible Distributed/Mobile Deep Learning with Dynamic, Mutation-aware Dataflow Dep Scheduler; for Python, R, Julia, Scala, Go, Javascript and more
https://mxnet.apache.org
Apache License 2.0

interop with Apache Arrow #14509

Open KiaraGrouwstra opened 5 years ago

KiaraGrouwstra commented 5 years ago

The original MXNet proposal suggested:

Other Apache projects that are potentially complementary:

Apache Arrow - read data in Apache Arrow's internal format from MXNet, that would allow users to run ETL/preprocessing in Spark, save the results in Arrow's format and then run DL algorithms on it.

I didn't see an existing thread on Apache Arrow here, so figured I'd create one.

While the above only mentions reading data out of MXNet in Arrow format, there may also be value in reading data in from Arrow.

mxnet-label-bot commented 5 years ago

Hey, this is the MXNet Label Bot. Thank you for submitting the issue! I will try and suggest some labels so that the appropriate MXNet community members can help resolve it. Here are my recommended labels: Feature

ChaiBapchya commented 5 years ago

@mxnet-label-bot add [Feature request]

eric-haibin-lin commented 5 years ago

@tycho01 if you are familiar with Apache Arrow, contributions are welcome! We can probably start with a gluon.data.Dataset class that reads a file in Apache Arrow format.