apache / arrow

Apache Arrow is a multi-language toolbox for accelerated data interchange and in-memory processing
https://arrow.apache.org/
Apache License 2.0

[C++][Dataset] Implement Dataset for reading JSON format #30588

Open asfimport opened 2 years ago

asfimport commented 2 years ago

Arrow C++ can already read individual JSON files, but the Dataset API cannot yet read JSON-format datasets.

Reporter: Will Jones / @wjones127
Assignee: Ben Harkins / @benibus

Subtasks:

Note: This issue was originally created as ARROW-15075. Please see the migration documentation for further details.

asfimport commented 2 years ago

Edward Visel / @alistaire47: After this is implemented, the Google Quickdraw dataset would be a nice, freely available ndjson dataset to use for benchmarking and demos.
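
For context, ndjson stores one JSON object per line, and a "dataset" here means a collection of such files read as one logical stream of records. The sketch below illustrates that idea using only the Python standard library. It is not Arrow's API and says nothing about how the C++ implementation would work. The function names, the `*.ndjson` glob pattern, and the sample fields (`word`, `recognized`) are all hypothetical, chosen only for illustration.

```python
import json
import tempfile
from pathlib import Path


def iter_ndjson_file(path):
    """Yield one dict per non-empty line of a single ndjson file."""
    with open(path, encoding="utf-8") as f:
        for line in f:
            line = line.strip()
            if line:
                yield json.loads(line)


def iter_ndjson_dataset(directory, pattern="*.ndjson"):
    """Yield records from every matching file, visiting files in sorted order."""
    for path in sorted(Path(directory).glob(pattern)):
        yield from iter_ndjson_file(path)


def demo():
    """Write two small made-up ndjson files and read them back as one stream."""
    with tempfile.TemporaryDirectory() as tmp:
        (Path(tmp) / "part-0.ndjson").write_text(
            '{"word": "cat", "recognized": true}\n'
            '{"word": "dog", "recognized": false}\n',
            encoding="utf-8",
        )
        (Path(tmp) / "part-1.ndjson").write_text(
            '{"word": "owl", "recognized": true}\n',
            encoding="utf-8",
        )
        # Materialize the records before the temporary directory is removed.
        return list(iter_ndjson_dataset(tmp))


if __name__ == "__main__":
    for record in demo():
        print(record)
```

Reading file by file like this is roughly what single-file support already allows; a dataset implementation would additionally handle concerns such as file discovery and a unified schema, which this sketch does not attempt.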

asfimport commented 2 years ago

Antoine Pitrou / @pitrou: cc @benibus