I want to query hive-partitioned parquet files in S3 using the python DataFusion client. Currently, pyarrow datasets are already supported, but I've found performance to be lacking. Instead, I'd rather use `object_store`. The python bindings already support creating object stores, so we only need to expose `register_listing_table` for this.