apache / arrow

Apache Arrow is a multi-language toolbox for accelerated data interchange and in-memory processing
https://arrow.apache.org/
Apache License 2.0

[Python] Expose streaming JSON reader #14932

Open pitrou opened 1 year ago

pitrou commented 1 year ago

Describe the enhancement requested

https://github.com/apache/arrow/pull/14355 added a JSON streaming reader on the C++ side. We should expose a Python binding to it, like we did for CSV.

Component(s)

Python

akshaysu12 commented 1 year ago

I'd like to give this a shot, if possible. I've been working on it a bit over the holidays, following the commit that added the Python bindings for the streaming CSV reader.

pitrou commented 1 year ago

@akshaysu12 Yes, please do!

akshaysu12 commented 1 year ago

@pitrou Sorry for the delay! I added a draft PR here: https://github.com/apache/arrow/pull/33761

It's missing documentation, but I was hoping someone could take a look to make sure I'm going about this the right way, since I'm new to the project.

pitrou commented 1 year ago

Thanks @akshaysu12 for notifying me! I've cc'ed the relevant people on your PR.