Open a10y opened 6 days ago
Attaching two zips, each with TRACE-level logs from executing all TPC-H queries (except q15) using the Vortex DataFusion provider.
s3express_vortex.zip s3_vortex.zip
Some interesting bits:
Total number of IOs to perform each query:
Total time to execute each query (not including table registration):
S3 normal:
S3 Express One:
I'm using this PR as a space to collect some info about running the TPC-H queries against object storage. Goals are to compare
Against storage backends
Changes
This PR creates a new binary that runs every TPC-H query while logging IOs in our objectstore reader, allowing us to examine both request sizes and request counts for each query.
Parquet and Vortex are each selectable, and the bucket is also configurable.
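As a rough illustration of the IO logging described above, here is a minimal sketch of a reader wrapper that records the size and count of every request. It is not the code in this PR: the `RangeReader` trait, `LoggingReader`, `IoStats`, and `InMemory` are hypothetical names made up for the example, and it uses only the standard library rather than the real object store reader.

```rust
use std::cell::RefCell;
use std::io;
use std::ops::Range;

/// Hypothetical stand-in for an object store reader: fetches one byte range.
pub trait RangeReader {
    fn read_range(&self, range: Range<u64>) -> io::Result<Vec<u8>>;
}

/// Running totals for the requests issued through a `LoggingReader`.
#[derive(Debug, Default, Clone, PartialEq)]
pub struct IoStats {
    pub request_count: u64,
    pub total_bytes: u64,
    pub request_sizes: Vec<u64>,
}

/// Wraps any reader and records every request before forwarding it.
pub struct LoggingReader<R> {
    inner: R,
    stats: RefCell<IoStats>,
}

impl<R: RangeReader> LoggingReader<R> {
    pub fn new(inner: R) -> Self {
        Self { inner, stats: RefCell::new(IoStats::default()) }
    }

    /// Snapshot of the counters, e.g. taken after each query finishes.
    pub fn stats(&self) -> IoStats {
        self.stats.borrow().clone()
    }
}

impl<R: RangeReader> RangeReader for LoggingReader<R> {
    fn read_range(&self, range: Range<u64>) -> io::Result<Vec<u8>> {
        // Record the requested size, not the returned size, so failed
        // requests are still counted.
        let requested = range.end.saturating_sub(range.start);
        {
            let mut stats = self.stats.borrow_mut();
            stats.request_count += 1;
            stats.total_bytes += requested;
            stats.request_sizes.push(requested);
        }
        // A real implementation would use a logging framework at TRACE level.
        eprintln!("IO request: offset={} len={}", range.start, requested);
        self.inner.read_range(range)
    }
}

/// In-memory reader used only to exercise the wrapper.
pub struct InMemory(pub Vec<u8>);

impl RangeReader for InMemory {
    fn read_range(&self, range: Range<u64>) -> io::Result<Vec<u8>> {
        let (start, end) = (range.start as usize, range.end as usize);
        self.0
            .get(start..end)
            .map(|bytes| bytes.to_vec())
            .ok_or_else(|| io::Error::new(io::ErrorKind::UnexpectedEof, "range out of bounds"))
    }
}
```

Wrapping the reader once and snapshotting `stats()` after each query would give per-query request counts and sizes without touching the query engine itself.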
To run the test that uses S3 Express One Zone, you need to set AWS_S3_EXPRESS=true in your .env or directly in your shell environment.