cmu-db / optd

CMU-DB's Cascades optimizer framework
https://cmu-db.github.io/optd/
MIT License

fix: 1.9x improvement to median q-error by fixing multi-equality join selectivity #171

Closed wangpatrick57 closed 6 months ago

wangpatrick57 commented 6 months ago

Summary: Previously, we computed multi-equality join selectivity by building an MST of the join graph. That is incorrect: for N nodes joined by equality, the correct method is to take the N-1 nodes with the highest n-distinct values.
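To make the summary concrete, here is a minimal Python sketch of the "N-1 highest n-distinct" rule. It is not the optd implementation (which is written in Rust); the function name `multi_equality_selectivity` is hypothetical, and the sketch assumes that each selected node contributes a factor of 1 / n-distinct to the selectivity, which the summary does not spell out.

```python
from math import prod


def multi_equality_selectivity(ndistincts):
    """Estimate selectivity for N columns that are all joined by equality.

    `ndistincts` holds one n-distinct value per column (node).
    Assumption: each of the N-1 columns with the highest n-distinct
    contributes a factor of 1 / n-distinct.
    """
    if any(nd <= 0 for nd in ndistincts):
        raise ValueError("n-distinct values must be positive")
    if len(ndistincts) < 2:
        # No equality predicate to apply with fewer than two columns.
        return 1.0
    # Sort descending and drop the last (smallest) value,
    # which leaves the N-1 highest n-distinct values.
    top = sorted(ndistincts, reverse=True)[:-1]
    return 1.0 / prod(top)


# Hypothetical columns with 10, 100, and 1000 distinct values:
# the two highest are 1000 and 100, so the result is 1 / (1000 * 100).
sel = multi_equality_selectivity([10, 100, 1000])
```

With only two columns this reduces to the familiar 1 / max(n-distinct) estimate for a single equality join.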

Demo: With this fix, we beat Postgres on both median and p90 q-error for the first time. Overall, it improves our median q-error by 1.9x, p90 q-error by 3.4x, p95 q-error by 42.1x, and p99 q-error by 2.6x, and we now beat Postgres on 9 queries where we previously did not.
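For readers unfamiliar with the metric, the sketch below uses the common definition of q-error: the larger of estimated/actual and actual/estimated, so that 1.0 is a perfect estimate. The helper name `q_error` and the sample cardinalities are hypothetical and are not taken from the benchmark run above.

```python
from statistics import median


def q_error(estimated, actual):
    """Return the q-error of a cardinality estimate (always >= 1.0)."""
    if estimated <= 0 or actual <= 0:
        raise ValueError("cardinalities must be positive")
    return max(estimated / actual, actual / estimated)


# Hypothetical (estimated, actual) cardinalities for three queries.
pairs = [(10, 100), (200, 100), (50, 50)]
errors = [q_error(est, act) for est, act in pairs]
median_q_error = median(errors)
```

Because the metric is symmetric, underestimating by 10x and overestimating by 10x are penalized equally.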

Before (after changing DEFAULT_PRECISION and DEFAULT_K_TO_TRACK but before the multi-equality fix): [Screenshot 2024-04-27 at 16 00 18]

After: [Screenshot 2024-04-27 at 20 02 00]

Details: