zettadb / kunlun

KunlunBase is a distributed relational database management system (RDBMS) with complete NewSQL capabilities and robust ACID transaction guarantees, and it is compatible with standard SQL. Applications that use PostgreSQL or MySQL can work with KunlunBase as-is, without any code change or rebuild, because KunlunBase supports both the PostgreSQL and MySQL connection protocols and DML SQL grammars. MySQL DBAs can quickly get to work on a KunlunBase cluster because KunlunBase uses MySQL for its storage nodes. KunlunBase scales out elastically as needed and guarantees transaction ACID properties under error conditions. It fully passes the TPC-C, TPC-H and TPC-DS test suites, so it supports not only OLTP workloads but also OLAP workloads. Application developers can use KunlunBase to build IT systems that handle terabytes of data without any effort on their part to implement data sharding, distributed transaction processing, distributed query processing, crash safety, high availability, strong consistency or horizontal scalability; KunlunBase provides all of these features. KunlunBase also offers powerful, user-friendly cluster management, monitoring and provisioning features, and can readily be used as a DBaaS.
http://www.kunlunbase.com
Apache License 2.0

comp_node_id not updated for newly installed comp nodes until restart #667

Open jd-zhang opened 2 years ago

jd-zhang commented 2 years ago

Issue migrated from trac ticket # 503

component: computing nodes | priority: minor

2022-03-16 11:27:37: zhangjindong@zettadb.com created the issue


After adding a new comp node to the cluster and waiting a while, both the comp_node_id variable and pg_cluster_meta report the same value:

postgres@9019b504d3a8:/kunlun/kunlun-server-0.9.1/scripts$ psql postgres://abc:abc@localhost:5406/postgres
psql (11.5)
Type "help" for help.

postgres=# show comp_node_id;
 comp_node_id
--------------
 6
(1 row)
postgres=# select * from pg_cluster_meta;
 comp_node_id | cluster_id | cluster_master_id | ha_mode | cluster_name | comp_node_name
--------------+------------+-------------------+---------+--------------+----------------
            6 |          1 |                 1 |       1 | clust1       | comp6
(1 row)

But the log always shows the following error:

2022-03-16 10:42:16.032 CST [53306] ERROR:  Kunlun-db: cache lookup failed for cluster meta (pg_cluster_meta) by computing node id(comp_node_id) 7
2022-03-16 10:42:16.032 CST [53306] HINT:  comp_node_id variable must equal to pg_cluster_meta's single row's comp_node_id field.
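Note that the error above names comp_node_id 7, while `show comp_node_id` and pg_cluster_meta both report 6. A minimal sketch of how that mismatch could be spotted automatically when scanning the log is given below. The function names `looked_up_comp_node_id` and `is_mismatch` are hypothetical helpers written for this report, not part of KunlunBase, and the pattern assumes the message wording stays exactly as quoted above.

```python
import re

# Matches the "cache lookup failed" message exactly as quoted in this report.
# Assumption: the wording of the message does not vary between versions.
LOOKUP_FAILED = re.compile(
    r"cache lookup failed for cluster meta \(pg_cluster_meta\) "
    r"by computing node id\(comp_node_id\) (\d+)"
)


def looked_up_comp_node_id(log_line):
    """Return the comp_node_id named in the error line, or None if the line
    is not a 'cache lookup failed' message."""
    match = LOOKUP_FAILED.search(log_line)
    return int(match.group(1)) if match else None


def is_mismatch(log_line, shown_comp_node_id):
    """True when the log line names a comp_node_id that differs from the
    value reported by `show comp_node_id`."""
    looked_up = looked_up_comp_node_id(log_line)
    return looked_up is not None and looked_up != shown_comp_node_id


if __name__ == "__main__":
    line = (
        "2022-03-16 10:42:16.032 CST [53306] ERROR:  Kunlun-db: cache lookup "
        "failed for cluster meta (pg_cluster_meta) by computing node "
        "id(comp_node_id) 7"
    )
    print(looked_up_comp_node_id(line), is_mismatch(line, 6))
```

Run against the ERROR line above with the displayed value 6, the helper flags a mismatch; the HINT line is ignored because it does not contain the lookup message.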

And when performing an operation, it times out:

postgres@9019b504d3a8:/kunlun/kunlun-server-0.9.1/scripts$ psql postgres://abc:abc@localhost:5406/postgres
psql (11.5)
Type "help" for help.

postgres=# create table nt1(id int primary key, info text);
ERROR:  canceling statement due to statement timeout

The new comp node returns to a good state only after it is restarted.

jd-zhang commented 2 years ago

2022-03-16 15:49:26: smith changed status from assigned to accepted