coiled / benchmarks

BSD 3-Clause "New" or "Revised" License

Benchmark plan generation and optimization on TPC-H queries #1326

Closed: hendrikmakait closed this 9 months ago

hendrikmakait commented 9 months ago

Closes #1325

Blocked by and includes #1324

hendrikmakait commented 9 months ago

Benchmark results look odd: Q18 takes 22-25 seconds to complete but 80 seconds to optimize.

Screenshot 2024-02-01 at 10 44 31

EDIT: This may have been caused by my using a LocalCluster: the query sets an index, which requires some pre-computation.

phofl commented 9 months ago

> EDIT: This may have been caused by my using a LocalCluster: the query sets an index, which requires some pre-computation.

Correct, that's where I would look if optimization time is off. Why is this necessary, though? It seems like we are shooting ourselves in the foot with the set_index call. We don't need it as far as I can see; using table.merge(qnt_over_300, left_on="l_orderkey", right_index=True) should work.
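For illustration, here is a minimal pandas sketch of the merge suggested above. The toy data, the `total_quantity` column name, and the way `qnt_over_300` is built (a groupby on `l_orderkey`, which leaves the key in the index) are assumptions and are not taken from the benchmark code; only the `merge` call mirrors the comment.

```python
import pandas as pd

# Hypothetical stand-in for the joined table used in Q18.
table = pd.DataFrame(
    {
        "l_orderkey": [1, 1, 2, 3, 3, 3],
        "l_quantity": [200, 150, 100, 120, 130, 90],
    }
)

# Assumed shape of qnt_over_300: per-order quantity totals, indexed by
# l_orderkey as a result of the groupby, filtered to totals above 300.
totals = table.groupby("l_orderkey").agg(total_quantity=("l_quantity", "sum"))
qnt_over_300 = totals[totals["total_quantity"] > 300]

# Join the key column on the left against the index on the right,
# so no set_index call on `table` is needed.
result = table.merge(qnt_over_300, left_on="l_orderkey", right_index=True)
```

In this sketch the left side keeps `l_orderkey` as a regular column, so no index has to be set on it before the join.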

Another thought: We should never use query; there is literally no case where I would recommend using it.
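The comment does not name a replacement for query. One common alternative in pandas is boolean-mask indexing, sketched below on hypothetical data; the column names are assumptions, and this is not necessarily what the PR ended up using.

```python
import pandas as pd

# Hypothetical per-order totals; not taken from the benchmark code.
df = pd.DataFrame(
    {
        "l_orderkey": [1, 2, 3],
        "total_quantity": [350, 100, 340],
    }
)

# String-based filter, evaluated at runtime from the expression text.
via_query = df.query("total_quantity > 300")

# Equivalent boolean-mask filter, written as an ordinary column comparison.
via_mask = df[df["total_quantity"] > 300]
```

Both filters select the same rows here; the mask form expresses the condition as a plain column expression rather than a string.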

hendrikmakait commented 9 months ago

This looks better:

Screenshot 2024-02-01 at 12 26 08