Open rupamk opened 5 years ago
Interesting idea.
Explode Operator
There is similar to GroupIdOperator
operator in Presto. It is used to generate input for grouping set aggregation. We also multiplicate pages there (for each grouping set).
a. Hint framework to get inputs from user (can be extended beyond Skew Join Implementation)
I'm not sure that would be the way to go. We rather try to avoid user hinting in Presto because it creates legacy syntax which is hard to remove later on. It would be great if CBO could help in such case automatically.
Would it be possible if for your benchmarks you also attached results with BROADCAST join enabled?
Would it be possible if for your benchmarks you also attached results with BROADCAST join enabled? Yes I will update the Doc.
I'm not sure that would be the way to go. We rather try to avoid user hinting in Presto because it creates legacy syntax which is hard to remove later on. It would be great if CBO could help in such case automatically.
One more question, how is CBO going to help here, since we want explicitly the skewed keys?
One more question, how is CBO going to help here, since we want explicitly the skewed keys?
We could improve CBO so that it explicitly keeps track of outlier (skewed) values.
If we could know which keys are skewed at planning time broadcast join the skewed keys union partition join the rest is the best solution. It's very hard to know skewed keys though.
We could improve CBO so that it explicitly keeps track of outlier (skewed) values.
We can plug in CBO too but the execution part detailed in this proposal would still be needed. So I plan to start with this execution work, the better ways to provide input to it via CBO could follow
Would it be possible if for your benchmarks you also attached results with BROADCAST join enabled?
updated the results in the doc as well as here.
We can plug in CBO too but the execution part detailed in this proposal would still be needed. So I plan to start with this execution work,
We don't want this code to be dead by any means. @martint @findepi @kokosing Do you have some other ideas (apart from hints and improving CBO) for passing outlier values to this optimization?
Unfortunately we cannot dynamically detect outliers at this point.
We don't want this code to be dead by any means. @martint @findepi @kokosing Do you have some other ideas (apart from hints and improving CBO) for passing outlier values to this optimization?
Maybe we could model this a some session property for now? Like:
SET SESSION skewed_columns = ['catalogA.schemaA.tableA.columnA', 'catalogA.schemaA.tableB.columnB'];
So effectively this is a hint, however then in future we could then try to automatically detect such columns and then remove this flag or leave it. Notice that, it does not require us to introduce SQL syntax "enhancements" to support hints.
Alternatively we could store skew info as column property
@rupamk i think Hive/HMS has some means to "annotate" skewed columns. We could piggy back on this.
Maybe we could model this a some session property for now? Like:
Yes, that's a good way to approach the problem. We do something similar for distributed geospatial joins and grouped execution. We can have the functionality in with a way for users to benefit from it "manually" and then over time make Presto smarter (by improving the optimizer, etc), to do it automatically.
Alternatively we could store skew info as column property
Not every connector support column properties, but all connectors support skewness ;)
Any update on skew join support to the CBO + the hive connector? Would be really helpful for scenarios where you are joining on a column with lots of nulls.
A "manual" approach with a session property (or something) would also be appreciated since having the CBO optimizer support will probably take much longer to implement.
@stagraqubole @vijaymann
I have a PR to implement a scoped down version of this https://github.com/trinodb/trino/pull/13401
Consider following query.
SELECT * from B JOIN A WHERE B.x = A.x
Consider a case, where the Table B data for column x is unevenly distributed and skewed for a specific key. A skewed distribution of keys manifests itself in skewed distribution of workload. As a result, some of the nodes joining the most popular keys will perform heavy computation compared to others. Handling the skewed key becomes a bottleneck for the entire JOIN stage which effectively delays the overall execution of the Hash Join operation.
Problem: How can we evenly distribute the workload for a JOIN (INNER) operation with skewed keys ? ”
The above query can be rewritten to redistribute the workload as:
1.
Drawback: Lots of replication for the non-skewed keys.
For evaluation, two tables are created (Table definition in Appendix 2, https://docs.google.com/document/d/1gmzz8TsuVl9W2VK5sdorUw_sN1W1gWzw_oL4QKgAETM/edit?usp=sharing). Bigger Table has two fields, one of which is skewed on one key, the skewfactor is varied as shown below. The configuration for the cluster is as follows:
Cluster Configs: Master Node Type: m3.2xlarge - 8 cores, 30GiB mem Worker Node Type: m3.2xlarge - 8 cores, 30GiB mem Min/Max Workers: 5 (On Demand Nodes)
Drawback: Appending the new column, there are multiple TableScans for the same table which can introduce latency.
Highlights from the doc: a. Hint framework to get inputs from user (can be extended beyond Skew Join Implementation) b. Explode Operator (used here for replicating the smaller table, can be extended beyond Skew Join Implementation)