zettadb / kunlun

KunlunBase is a distributed relational database management system (RDBMS) with complete NewSQL capabilities and robust transaction ACID guarantees, and it is compatible with standard SQL. Applications that use PostgreSQL or MySQL can work with KunlunBase as-is, without any code change or rebuild, because KunlunBase supports both the PostgreSQL and MySQL connection protocols and DML SQL grammars. MySQL DBAs can quickly get to work on a KunlunBase cluster because KunlunBase uses MySQL as its storage nodes. KunlunBase scales out elastically as needed and guarantees transaction ACID properties under error conditions. It fully passes the TPC-C, TPC-H and TPC-DS test suites, so it supports OLAP workloads as well as OLTP workloads. Application developers can use KunlunBase to build IT systems that handle terabytes of data without implementing data sharding, distributed transaction processing, distributed query processing, crash safety, high availability, strong consistency, or horizontal scalability themselves; KunlunBase provides all of these features. KunlunBase also provides powerful, user-friendly cluster management, monitoring and provisioning features, and can be readily used as DBaaS.
http://www.kunlunbase.com
Apache License 2.0

Build parameterized remote execution plans based on indexes #770

Open jd-zhang opened 2 years ago

jd-zhang commented 2 years ago

Issue migrated from trac ticket # 826

component: computing nodes | priority: major

2022-06-15 16:08:38: smith created the issue


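Background: The original implementation did not use the indexes on a table to build parameterized execution plans for that table. As a result, even a very simple multi-table JOIN (for example, one that specifies the primary key of one table and whose join conditions are all equality conditions on indexed columns) loaded all of the tables' data into the computing node for execution.

Implementation: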

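1. Based on each table's indexes and the equality conditions in the SQL, build parameterized/non-parameterized index plans for each table.
2. OR conditions are not considered for now; the various index merge strategies of MySQL will need to be emulated later.
3. Cost calculation adjustment: an extra "penalty" value is added for every interaction between the CN and a shard, so that nestloop join is not chosen when the data volume is too large.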
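A rough sketch of how points 1 and 3 could fit together, written in Python for illustration only. This is not the KunlunBase implementation: the `Table` class, the function names, the `?` placeholder syntax and the `ROUND_TRIP_PENALTY` value are all hypothetical, and the cost formulas are deliberately simplified.

```python
from dataclasses import dataclass
from typing import List, Optional, Set

# Hypothetical constant: extra cost charged for every CN <-> shard interaction.
ROUND_TRIP_PENALTY = 10.0


@dataclass
class Table:
    name: str
    columns: List[str]
    indexes: List[List[str]]  # each index is an ordered list of column names


def usable_index_prefix(table: Table, eq_columns: Set[str]) -> Optional[List[str]]:
    """Return the longest leading index prefix fully covered by equality conditions."""
    best: Optional[List[str]] = None
    for index in table.indexes:
        matched = 0
        for col in index:
            if col not in eq_columns:
                break
            matched += 1
        if matched > (len(best) if best else 0):
            best = index[:matched]
    return best


def parameterized_remote_sql(table: Table, eq_columns: Set[str]) -> str:
    """Build remote SQL with one placeholder per matched index column.

    Falls back to an unfiltered scan when no index can be used.
    """
    cols = ", ".join(f"{table.name}.{c}" for c in table.columns)
    sql = f"SELECT {cols} FROM {table.name}"
    prefix = usable_index_prefix(table, eq_columns)
    if prefix:
        conds = " AND ".join(f"{table.name}.{c} = ?" for c in prefix)
        sql += f" WHERE {conds}"
    return sql


def nestloop_cost(outer_rows: int, probe_cost: float) -> float:
    """Each outer row triggers one parameterized probe, i.e. one round trip."""
    return outer_rows * (probe_cost + ROUND_TRIP_PENALTY)


def choose_join(outer_rows: int, probe_cost: float, full_scan_cost: float) -> str:
    """Prefer the parameterized nestloop only while it beats a single full fetch."""
    full_fetch = full_scan_cost + ROUND_TRIP_PENALTY
    return "nestloop" if nestloop_cost(outer_rows, probe_cost) < full_fetch else "full_fetch"
```

With these assumptions, a small outer side keeps the parameterized nestloop attractive, while a large outer side accumulates enough per-interaction penalty that fetching the inner table once becomes cheaper.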

The following example shows how the presence or absence of an index affects the execution cost.

abc=# explain select * from t1 where t1.a=1;
                                           QUERY PLAN
-------------------------------------------------------------------------------------------------
 RemotePlan  (cost=21.27..21.27 rows=10 width=258)
   Shard: 1      Remote SQL: SELECT t1.a,t1.b,t1.c,t1.d FROM  abc_$$_public.t1  WHERE (t1.a = 1)
(2 rows)

abc=# drop index t1_a_idx;
DROP INDEX
abc=# explain select * from t1 where t1.a=1;
                                           QUERY PLAN
-------------------------------------------------------------------------------------------------
 RemotePlan  (cost=2316.96..2316.96 rows=10 width=258)
   Shard: 1      Remote SQL: SELECT t1.a,t1.b,t1.c,t1.d FROM  abc_$$_public.t1  WHERE (t1.a = 1)
(2 rows)
jd-zhang commented 2 years ago

2022-06-15 16:13:14: smith changed owner from david to smith

jd-zhang commented 2 years ago

2022-06-15 16:13:14: smith changed status from new to accepted