sunchao / parquet-rs

Apache Parquet implementation in Rust
Apache License 2.0
Add Arrow Support #186

Open sunchao opened 5 years ago

sunchao commented 5 years ago

This is the umbrella ticket to track adding Apache Arrow support. Tasks:

liurenjie1024 commented 5 years ago

I think the next tasks will be:

sunchao commented 5 years ago

Thanks @liurenjie1024. Updated the description with some potential tasks.

sadikovi commented 5 years ago

I suggest adding an item to update the existing doc to reflect the addition of arrow reader/writer.

andygrove commented 5 years ago

DataFusion has code for loading parquet into arrow ... might be worth looking at

sunchao commented 5 years ago

@sadikovi Thanks - added. @andygrove cool - will take a look.

liurenjie1024 commented 5 years ago

@andygrove Yes, I'll take that as a reference. I'll also look at the C++ implementation of the Arrow adapter for Parquet.

andygrove commented 5 years ago

I am very interested in this. I am wondering if we can add a generic reader trait to the main arrow project and then have an implementation in parquet-rs.

I have a CSV reader for arrow that could be published as a separate crate and implement the same trait.

andygrove commented 5 years ago

Actually, maybe this is as simple as implementing Iterator<Item = Arc<RecordBatch>>.
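For illustration, here is a minimal sketch of what that iterator-based idea could look like. It is not code from parquet-rs or arrow: `RecordBatch` below is a simplified stand-in for arrow's type (a single column of `i64` values), and `ChunkedReader` and `count_rows` are hypothetical names invented for this example.

```rust
use std::sync::Arc;

/// Simplified stand-in for arrow's `RecordBatch` (assumption: the real type
/// holds a schema and several typed columns; here it is one `i64` column).
#[derive(Debug, PartialEq)]
pub struct RecordBatch {
    pub values: Vec<i64>,
}

impl RecordBatch {
    pub fn num_rows(&self) -> usize {
        self.values.len()
    }
}

/// Hypothetical data source that hands out its rows in fixed-size batches.
/// A Parquet or CSV reader would decode from a file here instead.
pub struct ChunkedReader {
    rows: Vec<i64>,
    batch_size: usize,
    pos: usize,
}

impl ChunkedReader {
    pub fn new(rows: Vec<i64>, batch_size: usize) -> Self {
        assert!(batch_size > 0, "batch_size must be positive");
        ChunkedReader { rows, batch_size, pos: 0 }
    }
}

impl Iterator for ChunkedReader {
    type Item = Arc<RecordBatch>;

    /// Returns the next batch, or `None` once every row has been emitted.
    fn next(&mut self) -> Option<Self::Item> {
        if self.pos >= self.rows.len() {
            return None;
        }
        let end = (self.pos + self.batch_size).min(self.rows.len());
        let batch = RecordBatch {
            values: self.rows[self.pos..end].to_vec(),
        };
        self.pos = end;
        Some(Arc::new(batch))
    }
}

/// Consumer written only against the iterator bound, so any reader that
/// yields `Arc<RecordBatch>` can be passed in.
pub fn count_rows<I>(reader: I) -> usize
where
    I: Iterator<Item = Arc<RecordBatch>>,
{
    reader.map(|batch| batch.num_rows()).sum()
}
```

With this shape, `count_rows(ChunkedReader::new(vec![1, 2, 3, 4, 5], 2))` walks three batches of 2, 2 and 1 rows. A real reader would probably need to yield a `Result` per batch so that I/O and decoding errors can be reported, which this sketch leaves out.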