apache / datafusion

Apache DataFusion SQL Query Engine
https://datafusion.apache.org/
Apache License 2.0
6.26k stars 1.18k forks source link

Remove unnecessary null checks in `GroupColumn`s #12944

Open Rachelint opened 3 weeks ago

Rachelint commented 3 weeks ago

Is your feature request related to a problem or challenge?

As mentioned by @Dandandan in https://github.com/apache/datafusion/pull/12809#discussion_r1801678804

Some null checks are actullay unnecessary for arrays containing no nulls (basically we can just use null_count to check it).

Describe the solution you'd like

As mentioned above, the simple way is using null_count in array to make it.

Furtherly, I found we indeed check which rows are nulls in create_hashes. I think maybe we can reuse this result?

Describe alternatives you've considered

No response

Additional context

No response

Rachelint commented 3 weeks ago

I am working on a poc #12947 for checking the improvement of this idea.

Rachelint commented 3 weeks ago

I found it actually now a trivial change.

The challenge is that, we should refactor codes to support vectorize append firstly.

(otherwise, we need to add a branch about which append function we should call for each row's appending, and as a reuslt, we remove a row level branch and add another...)

I am trying it.

Rachelint commented 3 weeks ago

This is the number get from my local:

┏━━━━━━━━━━━━━━┳━━━━━━━━━━━━┳━━━━━━━━━━━━━━━━━━━━━━━━┳━━━━━━━━━━━━━━━┓
┃ Query        ┃       main ┃ vectorize-append-value ┃        Change ┃
┡━━━━━━━━━━━━━━╇━━━━━━━━━━━━╇━━━━━━━━━━━━━━━━━━━━━━━━╇━━━━━━━━━━━━━━━┩
│ QQuery 0     │     0.69ms │                 0.67ms │     no change │
│ QQuery 1     │    67.63ms │                67.32ms │     no change │
│ QQuery 2     │   162.49ms │               164.49ms │     no change │
│ QQuery 3     │   181.03ms │               184.10ms │     no change │
│ QQuery 4     │  1620.87ms │              1601.20ms │     no change │
│ QQuery 5     │  1533.80ms │              1536.84ms │     no change │
│ QQuery 6     │    59.69ms │                61.10ms │     no change │
│ QQuery 7     │    76.13ms │                76.61ms │     no change │
│ QQuery 8     │  2021.02ms │              2143.10ms │  1.06x slower │
│ QQuery 9     │  1926.36ms │              1929.56ms │     no change │
│ QQuery 10    │   526.58ms │               520.71ms │     no change │
│ QQuery 11    │   580.63ms │               595.39ms │     no change │
│ QQuery 12    │  1839.30ms │              1821.41ms │     no change │
│ QQuery 13    │  2994.22ms │              2903.39ms │     no change │
│ QQuery 14    │  2082.50ms │              2030.18ms │     no change │
│ QQuery 15    │  1890.10ms │              1859.78ms │     no change │
│ QQuery 16    │  4108.74ms │              4145.29ms │     no change │
│ QQuery 17    │  3640.05ms │              3663.88ms │     no change │
│ QQuery 18    │  8300.44ms │              8089.79ms │     no change │
│ QQuery 19    │   143.29ms │               143.35ms │     no change │
│ QQuery 20    │  3250.19ms │              3218.41ms │     no change │
│ QQuery 21    │  3909.34ms │              3947.90ms │     no change │
│ QQuery 22    │  9146.31ms │              9170.28ms │     no change │
│ QQuery 23    │ 23824.79ms │             23844.20ms │     no change │
│ QQuery 24    │  1120.92ms │              1125.08ms │     no change │
│ QQuery 25    │   995.13ms │               991.87ms │     no change │
│ QQuery 26    │  1320.37ms │              1298.79ms │     no change │
│ QQuery 27    │  4680.98ms │              4673.34ms │     no change │
│ QQuery 28    │ 23892.37ms │             23543.21ms │     no change │
│ QQuery 29    │   904.17ms │               883.88ms │     no change │
│ QQuery 30    │  1832.41ms │              1798.48ms │     no change │
│ QQuery 31    │  2021.98ms │              1958.83ms │     no change │
│ QQuery 32    │  7319.30ms │              7567.96ms │     no change │
│ QQuery 33    │  9575.96ms │              9609.05ms │     no change │
│ QQuery 34    │  9712.54ms │              9647.64ms │     no change │
│ QQuery 35    │  2772.59ms │              2904.98ms │     no change │
│ QQuery 36    │   248.24ms │               251.74ms │     no change │
│ QQuery 37    │   155.42ms │               150.51ms │     no change │
│ QQuery 38    │   153.19ms │               152.12ms │     no change │
│ QQuery 39    │   642.92ms │               587.60ms │ +1.09x faster │
│ QQuery 40    │    57.70ms │                60.29ms │     no change │
│ QQuery 41    │    53.35ms │                56.26ms │  1.05x slower │
│ QQuery 42    │    66.18ms │                66.08ms │     no change │
└──────────────┴────────────┴────────────────────────┴───────────────┘
Dandandan commented 3 weeks ago

Makes sense to vectorize appends at the same time, I think this might have even more impact than just omitting null checks. Removing the null check, but adding another branch will likely not yield any benefit.

Rachelint commented 3 weeks ago

Makes sense to vectorize appends at the same time, I think this might have even more impact than just omitting null checks. Removing the null check, but adding another branch will likely not yield any benefit.

I wonder should we vectorize eq together.

I impl a simple version only vectorize the append operation #12996
The benchmark number is generated from it.

But it introduce another branch in the equal_to stage, which I think maybe switching to the all vectorize approach like duckdb can eliminate this new branch.