pingcap / go-tpc

A toolbox to benchmark TPC workloads in Go
Apache License 2.0
178 stars 94 forks source link

tpch: add query tuning configs for tpch #179

Closed zanmato1984 closed 2 months ago

zanmato1984 commented 2 months ago

Set some session variables known to be effective for tpch before each tpch query to tune performance. And also add flags to turn it on/off and override the default variables.

      --enable-query-tuning         Enable query tuning by setting specified session variables (default true)

tidb_default_string_match_selectivity=0.1: For optimal join order, esp. q9. tidb_opt_join_reorder_threshold=60: For general optimal join order. tidb_prefer_broadcast_join_by_exchange_data_size=ON: For better join type between broadcast join and partition join.

CLAassistant commented 2 months ago

CLA assistant check
All committers have signed the CLA.

Yui-Song commented 2 months ago
zanmato1984 commented 2 months ago
  • Go-tpc already has a global flag to set session variables. It would be better if you could leverage it to set those variables.
--conn-params string    session variables, e.g. for TiDB --conn-params tidb_isolation_read_engines='tiflash', For PostgreSQL: --conn-params sslmode=disable

I didn't realize that there exists such a flag. And yes, we should definitely do so. Thanks for pointing this out. I will update soon.

  • And how about checking whether the user is using TiDB and then deciding to set those session variables or not? The user may get errors if they are testing other MySQL-compatible databases when go-tpc tries to set those tidb dedicated variables.

Is there an existing handy utility that I can use to differentiate tidb from other mysql-compatible databases?

EDIT: And also I found there are already tidb-specific variables being used in, e.g.: https://github.com/pingcap/go-tpc/blob/50e155e9dde0c13ae87833c767811a255eec09f5/tpch/workload.go#L175-L180 Possibly meaning that current "mysql" can only be tidb?

zanmato1984 commented 2 months ago
  • Go-tpc already has a global flag to set session variables. It would be better if you could leverage it to set those variables.
--conn-params string    session variables, e.g. for TiDB --conn-params tidb_isolation_read_engines='tiflash', For PostgreSQL: --conn-params sslmode=disable
  • And how about checking whether the user is using TiDB and then deciding to set those session variables or not? The user may get errors if they are testing other MySQL-compatible databases when go-tpc tries to set those tidb dedicated variables.

I've updated the PR by leveraging existing connection parameters rather than explicitly setting session variables via SQL.

About your second concern, the best I can do so far is to check for "mysql" only because there is no existing way to differentiate tidb from other possible mysql vendors, which is also affecting other tidb-specific logic.

@Yui-Song Could you please take a look? Thanks.