Open eddyxu opened 11 months ago
TPCH, assuming a system is solid, is primarily a benchmark of the hash-join implementation. It's designed to be more compute-intensive and less scan-intensive, so I don't know that it is all that interesting a measurement for file formats.
Some of the TPCH queries have no joins, Queries 1 and 6 in particular. These have been historically slow when comparing Parquet vs Lance. Parquet outperformed Lance by orders of magnitude.
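To make the "no joins" point concrete, here is a rough sketch of what Query 6 looks like: a single-table scan with range filters and one aggregate. It uses Python's built-in `sqlite3` purely as a stand-in engine (not Lance or Parquet), a trimmed-down `lineitem` schema, and made-up sample rows, so treat `run_q6` and `SAMPLE_ROWS` as hypothetical names for illustration only. The filter constants are the commonly used default substitution values for Q6.

```python
import sqlite3

# Shape of TPC-H Q6: one table, range filters, one SUM. No join anywhere,
# so run time is dominated by how fast the columns can be scanned and filtered.
Q6_SQL = """
SELECT SUM(l_extendedprice * l_discount) AS revenue
FROM lineitem
WHERE l_shipdate >= '1994-01-01'
  AND l_shipdate < '1995-01-01'
  AND l_discount BETWEEN 0.05 AND 0.07
  AND l_quantity < 24
"""

# Hypothetical sample rows (not real TPC-H data):
# (l_quantity, l_extendedprice, l_discount, l_shipdate)
SAMPLE_ROWS = [
    (10, 1000.0, 0.06, "1994-03-15"),  # passes every filter
    (30, 2000.0, 0.06, "1994-06-01"),  # rejected: quantity >= 24
    (5, 500.0, 0.10, "1994-07-01"),    # rejected: discount out of range
    (5, 500.0, 0.06, "1995-01-01"),    # rejected: ship date out of range
    (20, 100.0, 0.05, "1994-12-31"),   # passes every filter
]


def run_q6(rows):
    """Load rows into an in-memory lineitem table and return Q6 revenue."""
    conn = sqlite3.connect(":memory:")
    try:
        # Only the four columns Q6 touches; dates are ISO strings so that
        # plain string comparison orders them correctly.
        conn.execute(
            "CREATE TABLE lineitem ("
            "l_quantity REAL, l_extendedprice REAL, "
            "l_discount REAL, l_shipdate TEXT)"
        )
        conn.executemany("INSERT INTO lineitem VALUES (?, ?, ?, ?)", rows)
        (revenue,) = conn.execute(Q6_SQL).fetchone()
        # SUM over zero matching rows yields NULL, which maps to None.
        return revenue if revenue is not None else 0.0
    finally:
        conn.close()


if __name__ == "__main__":
    print(run_q6(SAMPLE_ROWS))
```

Since the query only reads four columns of one table and does nothing but filter and sum, a file format's scan and predicate-evaluation path is most of what gets measured, which is why Q1 and Q6 are the interesting ones here.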
Benchmark TPCH performance against Parquet / ORC.