Open Rachelint opened 3 weeks ago
I am working on a poc #12947 for checking the improvement of this idea.
I found it actually now a trivial change.
The challenge is that, we should refactor codes to support vectorize append firstly.
(otherwise, we need to add a branch about which append function we should call for each row's appending, and as a reuslt, we remove a row level branch and add another...)
I am trying it.
This is the number get from my local:
┏━━━━━━━━━━━━━━┳━━━━━━━━━━━━┳━━━━━━━━━━━━━━━━━━━━━━━━┳━━━━━━━━━━━━━━━┓
┃ Query ┃ main ┃ vectorize-append-value ┃ Change ┃
┡━━━━━━━━━━━━━━╇━━━━━━━━━━━━╇━━━━━━━━━━━━━━━━━━━━━━━━╇━━━━━━━━━━━━━━━┩
│ QQuery 0 │ 0.69ms │ 0.67ms │ no change │
│ QQuery 1 │ 67.63ms │ 67.32ms │ no change │
│ QQuery 2 │ 162.49ms │ 164.49ms │ no change │
│ QQuery 3 │ 181.03ms │ 184.10ms │ no change │
│ QQuery 4 │ 1620.87ms │ 1601.20ms │ no change │
│ QQuery 5 │ 1533.80ms │ 1536.84ms │ no change │
│ QQuery 6 │ 59.69ms │ 61.10ms │ no change │
│ QQuery 7 │ 76.13ms │ 76.61ms │ no change │
│ QQuery 8 │ 2021.02ms │ 2143.10ms │ 1.06x slower │
│ QQuery 9 │ 1926.36ms │ 1929.56ms │ no change │
│ QQuery 10 │ 526.58ms │ 520.71ms │ no change │
│ QQuery 11 │ 580.63ms │ 595.39ms │ no change │
│ QQuery 12 │ 1839.30ms │ 1821.41ms │ no change │
│ QQuery 13 │ 2994.22ms │ 2903.39ms │ no change │
│ QQuery 14 │ 2082.50ms │ 2030.18ms │ no change │
│ QQuery 15 │ 1890.10ms │ 1859.78ms │ no change │
│ QQuery 16 │ 4108.74ms │ 4145.29ms │ no change │
│ QQuery 17 │ 3640.05ms │ 3663.88ms │ no change │
│ QQuery 18 │ 8300.44ms │ 8089.79ms │ no change │
│ QQuery 19 │ 143.29ms │ 143.35ms │ no change │
│ QQuery 20 │ 3250.19ms │ 3218.41ms │ no change │
│ QQuery 21 │ 3909.34ms │ 3947.90ms │ no change │
│ QQuery 22 │ 9146.31ms │ 9170.28ms │ no change │
│ QQuery 23 │ 23824.79ms │ 23844.20ms │ no change │
│ QQuery 24 │ 1120.92ms │ 1125.08ms │ no change │
│ QQuery 25 │ 995.13ms │ 991.87ms │ no change │
│ QQuery 26 │ 1320.37ms │ 1298.79ms │ no change │
│ QQuery 27 │ 4680.98ms │ 4673.34ms │ no change │
│ QQuery 28 │ 23892.37ms │ 23543.21ms │ no change │
│ QQuery 29 │ 904.17ms │ 883.88ms │ no change │
│ QQuery 30 │ 1832.41ms │ 1798.48ms │ no change │
│ QQuery 31 │ 2021.98ms │ 1958.83ms │ no change │
│ QQuery 32 │ 7319.30ms │ 7567.96ms │ no change │
│ QQuery 33 │ 9575.96ms │ 9609.05ms │ no change │
│ QQuery 34 │ 9712.54ms │ 9647.64ms │ no change │
│ QQuery 35 │ 2772.59ms │ 2904.98ms │ no change │
│ QQuery 36 │ 248.24ms │ 251.74ms │ no change │
│ QQuery 37 │ 155.42ms │ 150.51ms │ no change │
│ QQuery 38 │ 153.19ms │ 152.12ms │ no change │
│ QQuery 39 │ 642.92ms │ 587.60ms │ +1.09x faster │
│ QQuery 40 │ 57.70ms │ 60.29ms │ no change │
│ QQuery 41 │ 53.35ms │ 56.26ms │ 1.05x slower │
│ QQuery 42 │ 66.18ms │ 66.08ms │ no change │
└──────────────┴────────────┴────────────────────────┴───────────────┘
Makes sense to vectorize appends at the same time, I think this might have even more impact than just omitting null checks. Removing the null check, but adding another branch will likely not yield any benefit.
Makes sense to vectorize appends at the same time, I think this might have even more impact than just omitting null checks. Removing the null check, but adding another branch will likely not yield any benefit.
I wonder should we vectorize eq together.
I impl a simple version only vectorize the append operation #12996
The benchmark number is generated from it.
But it introduce another branch in the equal_to
stage, which I think maybe switching to the all vectorize approach like duckdb
can eliminate this new branch.
Is your feature request related to a problem or challenge?
As mentioned by @Dandandan in https://github.com/apache/datafusion/pull/12809#discussion_r1801678804
Some null checks are actullay unnecessary for arrays containing no nulls (basically we can just use
null_count
to check it).Describe the solution you'd like
As mentioned above, the simple way is using
null_count
in array to make it.Furtherly, I found we indeed check which rows are nulls in
create_hashes
. I think maybe we can reuse this result?Describe alternatives you've considered
No response
Additional context
No response