Open alamb opened 5 months ago
I wonder if this is really slower or it is just noise.
Note that the benchmark runs on c6a.4xlarge and EBS (gp2), both of which can contribute to variations in performance (e.g. load from other users).
I wondered the same thing but @kmitchener seems to have been able to reproduce the difference reliably https://github.com/apache/arrow-datafusion/issues/8789#issuecomment-1883645578 🤔
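One way to check the "real slowdown vs. noise" question is to repeat each query several times per version on the same instance and compare the two sets of timings. Below is a minimal sketch of that idea, not what the ClickBench scripts actually do: the helper names `relative_slowdown` and `permutation_p_value` are hypothetical, and the timings are made-up placeholder numbers, not measurements from this issue.

```python
import random
import statistics


def relative_slowdown(old, new):
    """Relative change in median runtime; positive means `new` is slower."""
    old_median = statistics.median(old)
    return (statistics.median(new) - old_median) / old_median


def permutation_p_value(old, new, trials=10_000, seed=0):
    """One-sided permutation test on the difference of medians.

    Estimates how often a random relabelling of the pooled timings gives a
    median difference at least as large as the observed one.
    """
    rng = random.Random(seed)
    observed = statistics.median(new) - statistics.median(old)
    pooled = list(old) + list(new)
    n_old = len(old)
    hits = 0
    for _ in range(trials):
        rng.shuffle(pooled)
        diff = statistics.median(pooled[n_old:]) - statistics.median(pooled[:n_old])
        if diff >= observed:
            hits += 1
    # Add-one correction so the estimate is never exactly zero.
    return (hits + 1) / (trials + 1)


# Hypothetical timings in seconds for one query, NOT real ClickBench results.
v33 = [1.02, 0.99, 1.01, 1.00, 1.03, 0.98, 1.01, 1.00]
v34 = [1.08, 1.06, 1.09, 1.07, 1.10, 1.05, 1.08, 1.07]

print(f"median slowdown: {relative_slowdown(v33, v34):.1%}")
print(f"p-value: {permutation_p_value(v33, v34):.4f}")
```

A small p-value here would suggest the difference is unlikely to come from run-to-run variation alone. It would not rule out systematic effects such as a noisy neighbour during one of the two runs, so interleaving the runs of the two versions is probably worthwhile as well.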
Update here is that we see the same small slowdown in version 36.
I was thinking perhaps it could be due to the overhead of reading/parsing per-file metadata. More details here: https://github.com/apache/arrow-datafusion/issues/9404#issuecomment-1986804684
Describe the bug
As part of https://github.com/apache/arrow-datafusion/issues/8789, @kmitchener ran the ClickBench benchmark using DataFusion 34.0.0. Compared to DataFusion 33.0.0, the queries appear to run slightly slower.
I would like to know why the benchmark shows it running slightly slower.
To Reproduce
He ran the v33 benchmarks on the same instance and modified the benchmark page so that it displays both 33 and 34 at the same time, so you can compare the runs:
![image](https://github.com/apache/arrow-datafusion/assets/692497/130a7257-806a-4156-ba91-0a944c1467a6)
You can grab that from https://github.com/kmitchener/ClickBench/blob/new-run-of-datafusion-33/index.html
Expected behavior
Each release should be as good as or better than the last.
Additional context
No response