apache / datafusion

Apache DataFusion SQL Query Engine
https://datafusion.apache.org/
Apache License 2.0

Implement special Groups for StringViews #12771

Open alamb opened 1 day ago

alamb commented 1 day ago

Is your feature request related to a problem or challenge?

Part of https://github.com/apache/datafusion/issues/12680

In https://github.com/apache/datafusion/pull/12269 @jayzhan211 made significant improvements to how group values are stored in multi-column aggregations. That approach requires a specialized implementation for each column type.

His initial PR has implementations for PrimitiveArray and String/Binary. However, it does not have a specialization for StringView.

As a result, queries that group on multiple columns are now even faster for String/Binary columns, but StringView columns do not get the same speedup. This shows up as some ClickBench queries being effectively slower when they are run with StringView.

For example, this query is 10% slower with StringView:

SELECT "SearchEngineID", "SearchPhrase", COUNT(*) AS c FROM 'hits.parquet' WHERE "SearchPhrase" <> '' GROUP BY "SearchEngineID", "SearchPhrase" ORDER BY c DESC LIMIT 10;

Describe the solution you'd like

I would like to make this (and similar) queries faster when StringView is enabled:

SELECT "SearchEngineID", "SearchPhrase", COUNT(*) AS c FROM 'hits.parquet' WHERE "SearchPhrase" <> '' GROUP BY "SearchEngineID", "SearchPhrase" ORDER BY c DESC LIMIT 10;

Note that this query groups by two columns.

Here is how to reproduce the issue:

Step 1: Get hits.parquet using bench.sh:

cd benchmarks
./bench.sh data clickbench_1

Step 2: Prepare a script (q.sql) with the reproducer query:

set datafusion.execution.parquet.schema_force_view_types = true;

SELECT "SearchEngineID", "SearchPhrase", COUNT(*) AS c FROM 'hits.parquet' WHERE "SearchPhrase" <> '' GROUP BY "SearchEngineID", "SearchPhrase" ORDER BY c DESC LIMIT 10;

Step 3: Run the query:

(venv) andrewlamb@Andrews-MacBook-Pro-2:~/Downloads$ datafusion-cli -f q.sql

Describe alternatives you've considered

I suggest implementing something like ByteViewGroupValueBuilder, following the model of ByteGroupValueBuilder:

https://github.com/apache/datafusion/blob/6f8c74ca873fe015b2e7ff4eeb2f76bf21ae9d0e/datafusion/physical-plan/src/aggregates/group_values/group_column.rs#L177

The in-progress values would be u128 views plus some data buffers (maybe 2MB?).

Implementing equal_to can take advantage of the inlined-prefix optimization: compare the prefix inlined in the u128 first, and only check the value in the buffer if the prefixes are equal.
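To make the idea concrete, here is a minimal, dependency-free sketch of what such a builder and its equal_to could look like. It is not the DataFusion implementation: `ViewGroupValues`, `append`, `long_value`, `BLOCK_SIZE`, and `MAX_INLINE` are hypothetical names, and a second `ViewGroupValues` stands in for the incoming StringViewArray. The only thing it assumes is the Arrow view layout: the length lives in the low 4 bytes, values of up to 12 bytes are stored inline, and longer values keep a 4-byte prefix plus a buffer index and an offset.

```rust
/// Target size of each data buffer (assumption: the "maybe 2MB" floated above).
const BLOCK_SIZE: usize = 2 * 1024 * 1024;
/// Values of up to this many bytes are stored inline in the 16-byte view.
const MAX_INLINE: usize = 12;

/// Hypothetical accumulator for the group values of one StringView column.
#[derive(Default)]
struct ViewGroupValues {
    /// One 16-byte view per group value.
    views: Vec<u128>,
    /// Data buffers holding the bytes of values longer than MAX_INLINE.
    buffers: Vec<Vec<u8>>,
}

impl ViewGroupValues {
    /// Append a value, inlining it when it is short enough.
    fn append(&mut self, value: &[u8]) {
        let mut bytes = [0u8; 16];
        bytes[0..4].copy_from_slice(&(value.len() as u32).to_le_bytes());
        if value.len() <= MAX_INLINE {
            // Short value: data lives in the view, zero padded.
            bytes[4..4 + value.len()].copy_from_slice(value);
        } else {
            // Start a new buffer if there is none or the current one would overflow.
            let needs_new = match self.buffers.last() {
                Some(b) => b.len() + value.len() > BLOCK_SIZE,
                None => true,
            };
            if needs_new {
                self.buffers
                    .push(Vec::with_capacity(BLOCK_SIZE.max(value.len())));
            }
            let buffer_index = (self.buffers.len() - 1) as u32;
            let buffer = self.buffers.last_mut().unwrap();
            let offset = buffer.len() as u32;
            buffer.extend_from_slice(value);
            // Long value: 4-byte prefix, then buffer index and offset.
            bytes[4..8].copy_from_slice(&value[..4]);
            bytes[8..12].copy_from_slice(&buffer_index.to_le_bytes());
            bytes[12..16].copy_from_slice(&offset.to_le_bytes());
        }
        self.views.push(u128::from_le_bytes(bytes));
    }

    /// Bytes of a long (non-inlined) value referenced by `view`.
    fn long_value(&self, view: u128) -> &[u8] {
        let len = view as u32 as usize;
        let buffer_index = (view >> 64) as u32 as usize;
        let offset = (view >> 96) as u32 as usize;
        &self.buffers[buffer_index][offset..offset + len]
    }

    /// Compare stored row `lhs_row` with row `rhs_row` of `input`.
    fn equal_to(&self, lhs_row: usize, input: &ViewGroupValues, rhs_row: usize) -> bool {
        let (lhs, rhs) = (self.views[lhs_row], input.views[rhs_row]);
        // The low 64 bits hold the length and the first 4 bytes in both
        // layouts, so one integer comparison rejects most mismatches.
        if lhs as u64 != rhs as u64 {
            return false;
        }
        if (lhs as u32 as usize) <= MAX_INLINE {
            // Fully inlined: the views themselves are the values.
            return lhs == rhs;
        }
        // Same length and prefix: only now touch the data buffers.
        self.long_value(lhs) == input.long_value(rhs)
    }
}
```

In this sketch, a mismatch in length or in the first four bytes is rejected with a single integer comparison, and values of 12 bytes or fewer never touch a buffer at all. Only long values that share both length and prefix pay for a full byte comparison.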

Additional context

No response

alamb commented 1 day ago

BTW here are the flamegraphs:

[Images: flamegraph-string, flamegraph-stringview; Screenshot 2024-10-05 at 9 26 48 AM]

alamb commented 1 day ago

FYI @Rachelint / @jayzhan211 -- this might be an interesting project

Rachelint commented 1 day ago

This is really interesting; I am willing to help push it forward.

Rachelint commented 1 day ago

take

alamb commented 14 hours ago

I think it will be quite a cool optimization -- specifically, checking for equal values can likely be optimized using the inlined prefix.