Open danthegoodman1 opened 1 year ago
Being able to select from pyarrow tables without copying, and to get results back as pyarrow tables without copying, would be hugely beneficial for building low-latency ETL and other data processing pipelines.
Streaming support in particular would be a big win: https://duckdb.org/2021/12/03/duck-arrow.html#streaming-data-fromto-arrow. It would greatly reduce the memory needed for queries and for post-processing the data.
A faster path for querying an Arrow table landed in v2.0.0b1. Example: https://github.com/chdb-io/chdb/blob/main/tests/test_query_py.py#L94