jorgecarleitao / arrow2

Transmute-free Rust library to work with the Arrow format
Apache License 2.0

Recommendation for implementing async file sinks #1438

Open chitralverma opened 1 year ago

chitralverma commented 1 year ago

Following https://github.com/jorgecarleitao/arrow2/issues/995, I was able to implement Parquet and IPC writers that use the object_store crate to write a Polars DataFrame to any store (local or cloud). This works because arrow2 exposes an async FileSink for both the Parquet and IPC formats.

For the remaining formats (CSV, NDJSON, ORC, and Avro), can you please suggest an approach for implementing an async writer that satisfies the trait bounds AsyncWrite + Unpin + Send?

I'm happy to raise a PR for these formats if I can receive guidance on this or find a way to re-purpose the existing non-async writers.
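
To make the re-purposing idea concrete, here is a rough sketch of the synchronous half of one possible approach: run an existing `std::io::Write`-based writer against an in-memory buffer, one batch at a time, so that the resulting bytes can be handed to an async writer afterwards. It uses only the standard library. `encode_to_buffer` and `write_csv_rows` are hypothetical names made up for illustration, not arrow2 APIs; `write_csv_rows` merely stands in for whichever non-async writer would be reused.

```rust
use std::io::{self, Write};

/// Runs a synchronous serializer against an in-memory buffer and returns
/// the bytes it produced. (Hypothetical helper, not part of arrow2.)
pub fn encode_to_buffer<F>(encode: F) -> io::Result<Vec<u8>>
where
    F: FnOnce(&mut Vec<u8>) -> io::Result<()>,
{
    let mut buf = Vec::new();
    encode(&mut buf)?;
    Ok(buf)
}

/// Stand-in for an existing non-async format writer: it only needs
/// `std::io::Write`, so it can target a `Vec<u8>` just as well as a file.
/// (Hypothetical; does no quoting or escaping.)
pub fn write_csv_rows<W: Write>(writer: &mut W, rows: &[Vec<String>]) -> io::Result<()> {
    for row in rows {
        writeln!(writer, "{}", row.join(","))?;
    }
    Ok(())
}

// In async code, the returned bytes would then be forwarded to the sink,
// for example with `write_all` from `futures::io::AsyncWriteExt`:
//
//     let bytes = encode_to_buffer(|buf| write_csv_rows(buf, &rows))?;
//     sink.write_all(&bytes).await?;
```

The trade-off with this approach is that each batch is fully buffered in memory before it is sent, and the serialization itself still runs synchronously on the async task, so batches would need to stay reasonably small.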