cockroachdb / cockroach

CockroachDB — the cloud native, distributed SQL database designed for high availability, effortless scale, and control over data placement.
https://www.cockroachlabs.com
Other
30.18k stars 3.82k forks source link

Question of the Performance on TPC-H Benchmark Query 2 #134803

Open bajinsheng opened 2 weeks ago

bajinsheng commented 2 weeks ago

Describe the problem

For the query 2 in TPC-H benchmark:

select
    s_acctbal,
    s_name,
    n_name,
    p_partkey,
    p_mfgr,
    s_address,
    s_phone,
    s_comment
from
    PART,
    SUPPLIER,
    PARTSUPP,
    NATION,
    REGION
where
    p_partkey = ps_partkey
    and s_suppkey = ps_suppkey
    and p_size = 30
    and p_type like '%STEEL'
    and s_nationkey = n_nationkey
    and n_regionkey = r_regionkey
    and r_name = 'ASIA'
    and ps_supplycost = (
        select
            min(ps_supplycost)
        from
            PARTSUPP,
            SUPPLIER,
            NATION,
            REGION
        where
            p_partkey = ps_partkey
            and s_suppkey = ps_suppkey
            and s_nationkey = n_nationkey
            and n_regionkey = r_regionkey
            and r_name = 'ASIA'
    )
order by
    s_acctbal desc,
    n_name,
    s_name,
    p_partkey
limit
    100;

Its execution time is 11.3s, as shown in its query plan plan_before.txt

I found that disabling the following IF code block brings a significant performance improvement:

diff --git a/pkg/sql/opt/norm/join_funcs.go b/pkg/sql/opt/norm/join_funcs.go
index 67bc2a1ddd6..af31db89ecb 100644
--- a/pkg/sql/opt/norm/join_funcs.go
+++ b/pkg/sql/opt/norm/join_funcs.go
@@ -519,9 +519,6 @@ func (c *CustomFuncs) CanExtractJoinComparison(

        // Disallow cases when one side has a correlated subquery.
        // TODO(radu): investigate relaxing this.
-       if leftProps.HasCorrelatedSubquery || rightProps.HasCorrelatedSubquery {
-               return false
-       }

        if leftProps.OuterCols.Empty() || rightProps.OuterCols.Empty() {
                // It's possible for one side to have no outer cols and still not be a

Its execution time is reduced to 0.471s, as shown in the new query plan plan_after.txt

I wonder whether we can relax this IF condition to enable the second query plan in default, as it is more efficient than the first one.

To Reproduce

cockroach start-single-node --insecure --store=/app/data --listen-addr=0.0.0.0:36257 --sql-addr=0.0.0.0:26257
cockroach workload init tpch

Then execute the above query.

Environment:

Jira issue: CRDB-44256

blathers-crl[bot] commented 2 weeks ago

Hi @bajinsheng, please add branch-* labels to identify which branch(es) this C-bug affects.

:owl: Hoot! I am a Blathers, a bot for CockroachDB. My owner is dev-inf.

blathers-crl[bot] commented 2 weeks ago

Hello, I am Blathers. I am here to help you get the issue triaged.

Hoot - a bug! Though bugs are the bane of my existence, rest assured the wretched thing will get the best of care here.

I have CC'd a few people who may be able to assist you:

If we have not gotten back to your issue within a few business days, you can try the following:

:owl: Hoot! I am a Blathers, a bot for CockroachDB. My owner is dev-inf.

mw5h commented 6 days ago

Thanks for the report! This is an interesting finding!